ironSource / parquetjs

fully asynchronous, pure JavaScript implementation of the Parquet file format
MIT License

Error when field has no values #24

Closed witq closed 6 years ago

witq commented 6 years ago

When an optional field has 0 values, the generated file seems to be unreadable by parquet-tools. The error I get is: can not read class org.apache.parquet.format.PageHeader: Required field 'num_nulls' was not found in serialized data! Struct: DataPageHeaderV2(num_values:10, num_nulls:0, num_rows:10, encoding:PLAIN, definition_levels_byte_length:20, repetition_levels_byte_length:0). When I forced the num_nulls value to be no less than 1, it started working. I'm not sure whether this is an issue with the module or with the way I'm using it, so I'm letting you know.

asmuth commented 6 years ago

uh oh. yes, this is a problem with the thriftjs library that we are already working around in a number of places (the problem is that if the value is a literal 0, thriftjs will not actually encode the field but leave the tag out instead).

will add a regression case and fix for this.
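
To make the failure mode concrete, here is a minimal sketch of how this class of bug can arise. The two encoder functions are hypothetical and are not the thrift library's actual code; they only assume that a field is skipped when a truthiness check is used in place of an explicit null/undefined check. The field names are taken from the error message above.

```javascript
// Hypothetical encoders for illustration only, NOT the real thrift code.
// Each returns the names of the fields it would write to the output.

// Buggy variant: a truthiness test treats 0 the same as "not set".
function encodeFieldsBuggy(struct) {
  const written = [];
  for (const [name, value] of Object.entries(struct)) {
    if (value) { // 0 is falsy, so a zero-valued field is dropped
      written.push(name);
    }
  }
  return written;
}

// Fixed variant: only skip fields that are really absent.
function encodeFieldsFixed(struct) {
  const written = [];
  for (const [name, value] of Object.entries(struct)) {
    if (value !== null && value !== undefined) {
      written.push(name);
    }
  }
  return written;
}

const header = { num_values: 10, num_nulls: 0, num_rows: 10 };

console.log(encodeFieldsBuggy(header)); // [ 'num_values', 'num_rows' ]
console.log(encodeFieldsFixed(header)); // [ 'num_values', 'num_nulls', 'num_rows' ]
```

With the buggy variant the required num_nulls field never reaches the output, which would match a reader complaining that the field was not found in the serialized data.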

asmuth commented 6 years ago

github automatically closed this issue because I pushed 86682e6e66222b73de9467362f993c89b58ece0c. the fix is in 0.8.0