frictionlessdata / datapackage

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.
https://datapackage.org
The Unlicense

Special number values in TableSchema spec contradict JSON spec #657

Closed iSnow closed 10 months ago

iSnow commented 4 years ago

The TableSchema spec seems to have been written mostly for CSV tabular data, with JSON arrays of JSON objects added later as an encoding. The specification for a number field states:

The following special string values are permitted (case need not be respected):

NaN: not a number
INF: positive infinity
-INF: negative infinity

In contrast, the ECMA JSON specification states:

Numeric values that cannot be represented as sequences of digits (such as Infinity and NaN) are not permitted.

RFC 7159 contains an equivalent statement.

Strictly speaking, this contradiction is unsalvageable: a JSON number cannot carry these values. As a workaround, we could define that the corresponding string literals are used instead, so that the following CSV

id, amount
1, 100
2, NaN

would be represented in JSON as

[
  {
    "id": 1,
    "amount": 100
  },
  {
    "id": 2,
    "amount": "NaN"
  }
]
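
A minimal Python sketch of this workaround on the encoding side, for illustration only. The helper names `encode_number` and `csv_to_json` are hypothetical and not part of any Frictionless library, and the column names are hard-coded to match the example above. It relies on Python's `float()` accepting `NaN`, `INF` and `-INF` in any case, and on `json.dumps(..., allow_nan=False)`, which raises `ValueError` rather than emitting a bare `NaN` or `Infinity` token.

```python
import csv
import io
import json
import math

SAMPLE_CSV = "id, amount\n1, 100\n2, NaN\n"


def encode_number(value):
    """Map non-finite floats to the string literals named in the spec (hypothetical helper)."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "INF" if value > 0 else "-INF"
    return value


def csv_to_json(text):
    """Convert the two-column example CSV into strictly valid JSON (hypothetical helper)."""
    # skipinitialspace drops the blank after each comma in the sample.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = [
        {"id": int(row["id"]), "amount": encode_number(float(row["amount"]))}
        for row in reader
    ]
    # allow_nan=False makes the encoder fail loudly if a bare NaN slips through.
    return json.dumps(rows, allow_nan=False, indent=2)


if __name__ == "__main__":
    print(csv_to_json(SAMPLE_CSV))
```

Note that this sketch parses every amount as a float, so 100 is serialized as 100.0; a real implementation would have to decide how to preserve integer-looking values.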

roll commented 10 months ago

Basically, the described workaround is what Table Schema officially requires: https://datapackage.org/specifications/table-schema/#physical-and-logical-representation

If a number field cell is represented by a string, it needs to be processed, which includes substituting the special string values. So the representation mentioned above is a correct one:

[
  {
    "id": 1,
    "amount": 100
  },
  {
    "id": 2,
    "amount": "NaN"
  }
]
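
The reading side can be sketched in Python as follows, under stated assumptions. `parse_number_cell` and `SPECIAL_VALUES` are hypothetical names, not an existing API. The sketch passes native JSON numbers through unchanged and processes strings, substituting the three special values case-insensitively. It falls back to Python's `float()`, which is more permissive than the spec's lexical rules, so a real implementation would validate the string format first.

```python
import json
import math

# Special string values from the number field spec, keyed in lower case
# because case need not be respected.
SPECIAL_VALUES = {"nan": math.nan, "inf": math.inf, "-inf": -math.inf}


def parse_number_cell(cell):
    """Turn one physical cell of a number field into a Python number (hypothetical helper)."""
    if isinstance(cell, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise ValueError(f"not a number: {cell!r}")
    if isinstance(cell, (int, float)):
        return cell  # already a native JSON number, nothing to process
    if isinstance(cell, str):
        text = cell.strip()
        special = SPECIAL_VALUES.get(text.lower())
        if special is not None:
            return special
        # Assumption: float() stands in for the spec's lexical parsing here.
        return float(text)
    raise ValueError(f"not a number: {cell!r}")


DOCUMENT = '[{"id": 1, "amount": 100}, {"id": 2, "amount": "NaN"}]'

if __name__ == "__main__":
    amounts = [parse_number_cell(row["amount"]) for row in json.loads(DOCUMENT)]
    print(amounts)
```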