ironSource / parquetjs

fully asynchronous, pure JavaScript implementation of the Parquet file format
MIT License
349 stars 176 forks source link

Optional variables can not be explicitly `undefined` or `null` #36

Closed ZJONSSON closed 6 years ago

ZJONSSON commented 6 years ago

When streaming from a database or csv, missing values are often in the Object.keys of the record with the value either set to undefined or null

Here is an example that fails on colour (which is optional) being undefined in the second record.

const parquet = require('./parquet.js');

var schema = new parquet.ParquetSchema({
  name: { type: 'UTF8' },
  colour: {type: 'UTF8', optional: true}
});

async function main() {
  const writer = await parquet.ParquetWriter.openFile(schema, 'fruits.parquet');
  await writer.appendRow({name: 'banana', colour: 'yellow'});
  await writer.appendRow({name: 'apple', colour: undefined});
  await writer.close();
}

main()
  .then(() => console.log('done'))
  .catch(e => console.log(e));

resulting in the following error:

TypeError: Cannot read property 'constructor' of undefined
    at shredRecordInternal (/home/zjonsson/git/parquetjs/lib/shred.js:83:29)
    at Object.exports.shredRecord (/home/zjonsson/git/parquetjs/lib/shred.js:40:3)
    at ParquetWriter.appendRow (/home/zjonsson/git/parquetjs/lib/writer.js:94:22)
    at main (/home/zjonsson/git/parquetjs/test.js:11:16)
    at <anonymous>