frictionlessdata / datapackage-go

A Go library for working with Data Package.

Datapackage doesn't validate if parsing a large integer (eg: bytes) #28

Closed cschloer closed 2 years ago

cschloer commented 2 years ago

Overview

JSON doesn't differentiate between floats and integers. When Go parses JSON into a generic value, every number is decoded as a float64, so large integers end up printed in scientific notation (https://stackoverflow.com/questions/22343083/json-unmarshaling-with-long-numbers-gives-floating-point-number), for example:

17747417 -> 1.7747417e+07

If you try to create a datapackage that contains a resource with a large number of bytes using the FromString method, it errors with:

Error: I[#] S[#] doesn't validate with "data-package#" I[#/resources/0/bytes] S[#/properties/resources/items/properties/bytes/type] expected integer, but got number 

Here's some basic code that should make it happen:

dpackage, err = datapackage.FromString(dpString, ".")
fmt.Println("GOT ERROR HERE", err)

Using datapackage string:

{
  "created": "2021-11-25T10:11:24Z",
  "name": "new_dataset",
  "profile": "data-package",
  "resources": [
    {
      "bytes": 17747417,
      "created": "Thursday, 25-Nov-21 11:09:13 UTC",
      "description": "",
      "filename": "excel_no_spaces_digit_sheets.xlsx",
      "modified": "Thursday, 25-Nov-21 11:09:13 UTC",
      "name": "732920043605108807",
      "path": "https://127.0.0.1:9000/minio/bcodmo-submissions-staging/5711471826818791508/files/excel_no_spaces_digit_sheets.xlsx"
    }
  ],
  "title": "New dataset",
  "updated": "2021-11-25T11:09:13Z"
}

I know this repo hasn't seen activity in a while, but this is a pretty serious bug. At best, it means you can't use the bytes key (my workaround will be to use a size key instead); at worst, it creates a hidden bug if you only test your code on small files.


Please preserve this line to notify @danielfireman (lead of this repository) @roll

danielfireman commented 2 years ago

Hi @cschloer, thanks a lot for reporting! I'll take a look at this problem this week.

danielfireman commented 2 years ago

@cschloer, this is done. Could you please try it out? I am going to wait a few days before pushing a new version.