LibertyDSNP / parquetjs

Fully asynchronous, pure JavaScript implementation of the Parquet file format with additional features
MIT License
43 stars 24 forks source link

Use the Double primative for JSON Schema "number" type #111

Closed mpotter closed 5 months ago

mpotter commented 5 months ago

Problem

The JSON Schema stipulates that a "number" type can be an integer, floating point, or exponential notation. Currently, the "number" type is treated the same as an integer when instantiating fromJSONSchema. In those cases, providing a float value of e.g. 2.5 will fail because it can't be converted into a BigInt.

Solution

I changed the primitive to Double for "number" types when creating a schema from JSON.

Change summary:

Steps to Verify:

Note: This is the example from the README.

import * as parquet from "@dsnp/parquetjs";

var schema = parquet.ParquetSchema.fromJsonSchema({
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "quantity": {
      "type": "integer"
    },
    "price": {
      "type": "number"
    },
    "date": {
      "type": "string",
      "format": "date-time"
    },
    "in_stock": {
      "type": "boolean"
    }
  },
  "required": ["name", "quantity", "price", "date", "in_stock"]
});

var writer = await parquet.ParquetWriter.openFile(schema, 'fruits.parquet');

await writer.appendRow({name: 'apples', quantity: 10, price: 2.5, date: new Date(), in_stock: true});
await writer.appendRow({name: 'oranges', quantity: 10, price: 2.5, date: new Date(), in_stock: true});

await writer.close();

Expected behavior is that the file will be successfully written. Current behavior results in RangeError: The number 2.5 cannot be converted to a BigInt because it is not an integer.

noxify commented 5 months ago

LGTM

noxify commented 5 months ago

should we update the tests, too?

Edit: We have to - CI tests are failing 🙈

mpotter commented 5 months ago

@wilwade tests updated

wilwade commented 5 months ago

Published as part of v1.4.0! Thanks again!

mpotter commented 5 months ago

Awesome, thank you for the fast turnaround 🙏