skale-me / node-parquet

NodeJS module to access apache parquet format files
Apache License 2.0

Integers converted to undefined #32

Closed: rafiton closed this issue 6 years ago

rafiton commented 7 years ago

Hi Mark, I started getting strange results when converting to Parquet and back using your module and your example.

Here is the code I'm using:

var parquet = require('node-parquet');

var schema = {
    small_int: {type: 'int32'},
    big_int: {type: 'int64'},
    name: {type: 'byte_array'}
};

var data = [
    [ 13, 1111, 'hello world r'],
    [ 2, 2234, 'hello world 1'],
    [ 3, 2334, 'hello world 2'],
    [ 4, 1223, 'hello world 3']
];

var writer = new parquet.ParquetWriter('/tmp/my_file.parquet', schema);
writer.write(data);
writer.close();

And this is the code I'm using to read the Parquet file:

var fs = require('fs');
var parquet = require('node-parquet');

var file = '/tmp/my_file.parquet';

var reader = new parquet.ParquetReader(file);
console.log(reader.info());
console.log(reader.rows());
reader.close();

And this is the result I'm getting:

{ version: 0,
  createdBy: 'parquet-cpp version 1.0.0',
  rowGroups: 1,
  columns: 3,
  rows: 4 }
[ [ undefined, 1111, 'hello world r' ],
  [ 2, 2234, 'hello world 1' ],
  [ 3, 2334, 'hello world 2' ],
  [ 4, 1223, 'hello world 3' ] ]

As you can see, the number 13 is shown as undefined. If I use a more complex schema, more integers are shown as undefined.
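With a larger schema it may be easier to locate the bad cells programmatically than by eye. The following is only a sketch: `findMismatches` is a hypothetical helper, not part of node-parquet, and it assumes the reader returns plain numbers and strings as in the output above.

```javascript
// Hypothetical helper (not part of node-parquet): list every cell whose
// value after the round trip differs from the value that was written.
function findMismatches(written, read) {
  var mismatches = [];
  written.forEach(function (row, r) {
    row.forEach(function (expected, c) {
      // A missing row is treated the same as a missing cell.
      var actual = read[r] === undefined ? undefined : read[r][c];
      if (actual !== expected) {
        mismatches.push({ row: r, column: c, expected: expected, actual: actual });
      }
    });
  });
  return mismatches;
}

// The rows written and the rows read back, copied from this report.
var written = [
  [13, 1111, 'hello world r'],
  [2, 2234, 'hello world 1'],
  [3, 2334, 'hello world 2'],
  [4, 1223, 'hello world 3']
];
var readBack = [
  [undefined, 1111, 'hello world r'],
  [2, 2234, 'hello world 1'],
  [3, 2334, 'hello world 2'],
  [4, 1223, 'hello world 3']
];

console.log(findMismatches(written, readBack));
// → [ { row: 0, column: 0, expected: 13, actual: undefined } ]
```

In the reading script, `findMismatches(data, reader.rows())` would give the same report for a freshly read file.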

I'm running AWS Linux, Node 8.2.1

Any idea?

rafiton commented 7 years ago

Anything?

mvertes commented 6 years ago

Hi @rafiton, I'm really back from holidays now. Could you check again with the head version (after the merge of #37), which should hopefully fix the issue? Note: I could not reproduce your exact problem, but I did see other errors when reading, though not systematically. My diagnosis is that some output variables were not always set by parquet-cpp when reading (depending on the schema, etc.) and, if left uninitialized, could cause problems. I now force proper initialization.
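Since the errors did not show up on every run, a single successful read may not be conclusive. One way to check the head version is to repeat the round trip several times and count the failures. This is a sketch only: `countBadRoundTrips` is a hypothetical name, and passing the module in as a parameter is an assumption made so that the check can also be run against a stub. It uses only the `ParquetWriter` and `ParquetReader` calls already shown in this thread.

```javascript
// Hypothetical check: repeat the write/read round trip and count how many
// times the rows read back differ from the rows written.
// `parquet` is the module object, e.g. require('node-parquet').
function countBadRoundTrips(parquet, file, schema, data, attempts) {
  var bad = 0;
  for (var i = 0; i < attempts; i++) {
    var writer = new parquet.ParquetWriter(file, schema);
    writer.write(data);
    writer.close();

    var reader = new parquet.ParquetReader(file);
    var rows = reader.rows();
    reader.close();

    // JSON.stringify renders an undefined array item as null, so a cell
    // that came back undefined makes the two strings differ.
    if (JSON.stringify(rows) !== JSON.stringify(data)) {
      bad++;
    }
  }
  return bad;
}
```

For example, `countBadRoundTrips(require('node-parquet'), '/tmp/my_file.parquet', schema, data, 100)` should return 0 if every read matches what was written.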

rafiton commented 6 years ago

Hi @mvertes Looks fine now - thanks for the help.