ironSource / parquetjs

fully asynchronous, pure JavaScript implementation of the Parquet file format
MIT License

Allow zero-row files, and handle stream exceptions #106

Closed gregplaysguitar closed 3 years ago

gregplaysguitar commented 4 years ago

Creating zero-row files is uncommon, but sometimes useful for handling edge cases in a pipeline, so there's no need to arbitrarily prevent it.

The ParquetTransformer change allows exceptions to be caught and handled upstream when working with streams; previously they were being swallowed.
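For readers unfamiliar with the streams issue, here is a minimal sketch of the general pattern, not the actual ParquetTransformer code. It assumes the swallowed exceptions came from an async write whose rejection never reached the `_transform` callback. `appendRow`, `RowTransformer`, and `writeRows` are hypothetical names used only for illustration; only Node's built-in `stream` module is used.

```javascript
const { Transform } = require('stream');

// Hypothetical stand-in for an async row writer: rejects rows without a string `name`.
async function appendRow(row) {
  if (typeof row.name !== 'string') {
    throw new Error('invalid row: missing name');
  }
  return Buffer.from(JSON.stringify(row) + '\n');
}

// Hypothetical transformer: accepts row objects, emits encoded bytes.
class RowTransformer extends Transform {
  constructor() {
    super({ writableObjectMode: true });
  }

  _transform(row, encoding, callback) {
    // Pass a rejection to the callback so the stream emits 'error'.
    // Dropping it here is what would leave callers unable to catch it.
    appendRow(row).then(
      (chunk) => callback(null, chunk),
      (err) => callback(err)
    );
  }
}

// Collects the output as a string, or rejects with the first stream error.
function writeRows(rows) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    const transformer = new RowTransformer();
    transformer.on('data', (chunk) => chunks.push(chunk));
    transformer.on('error', reject);
    transformer.on('end', () => resolve(Buffer.concat(chunks).toString()));
    for (const row of rows) transformer.write(row);
    transformer.end();
  });
}
```

With the error forwarded this way, a caller can wrap `await writeRows(rows)` in `try`/`catch`. Passing an empty array resolves with an empty string in this sketch, which loosely mirrors the zero-row case.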

asmuth commented 4 years ago

The first change (allow zero-row files) looks good to me.

Can't say much about the second change, since I'm not an expert on JavaScript streams, but I took a quick look at the nodejs docs. Based on that, the proposed change looks correct to me and what we currently have seems to be incorrect. Maybe @kessler can also comment on the streams issue.

annfomenko commented 3 years ago

Hello @asmuth! This PR looks like exactly what I need [https://github.com/ironSource/parquetjs/issues/116]. When do you plan to merge it?

dobesv commented 3 years ago

PR looks good in general. Unfortunately I cannot merge it, and the maintainers of this project don't seem to have merged anything for more than a year (or two?).

kessler commented 3 years ago

@dobesv very sorry my friend. I am well aware that this project has been neglected, and I really, really hope I get to do something about it some time. I went over the PR and, more importantly, Paul did, so it's merged and I'll publish to npm asap. My sincere apologies again.

dobesv commented 3 years ago

@kessler It's OK, it's the nature of volunteer work - sometimes we just have better things to do!