skale-me / node-parquet

NodeJS module to access apache parquet format files
Apache License 2.0

memory leak in ParquetWriter #46

Closed david-wb closed 4 years ago

david-wb commented 6 years ago

Hi,

I'm trying to use the ParquetWriter to write to a parquet file one line at a time. I noticed that Node's memory usage continues to grow with each call to writer.write(rows). I am processing a very large file, and the memory usage grows beyond my machine's limits. Since I am reading and writing one row at a time, it seems like the memory usage should stay constant. Is there a workaround for this?
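
For anyone who wants to quantify the growth, here is a minimal measurement sketch. It is not a fix and it does not use the real ParquetWriter: `measureWrites`, `makeRow`, and `stubWriter` are hypothetical names introduced for illustration, and the only assumption about the writer is that it exposes a `write(rows)` method that takes an array of rows. It samples the process's resident set size (RSS) through Node's `process.memoryUsage()`, because memory allocated by a native layer would not necessarily show up in the JavaScript heap figures.

```javascript
// Diagnostic sketch: sample resident memory while writing rows one at a time.
// `writer` is any object with a write(rows) method. The stub further down is a
// hypothetical stand-in, NOT the node-parquet ParquetWriter itself.
function measureWrites(writer, makeRow, totalRows, sampleEvery) {
  const samples = [];
  for (let i = 0; i < totalRows; i++) {
    writer.write([makeRow(i)]); // one row per call, as described in the report
    if ((i + 1) % sampleEvery === 0) {
      // rss covers the whole process, including native (non-V8) allocations
      samples.push({ rows: i + 1, rssBytes: process.memoryUsage().rss });
    }
  }
  return samples;
}

// Hypothetical stub writer that only counts rows, used to exercise the harness.
const stubWriter = {
  count: 0,
  write(rows) {
    this.count += rows.length;
  },
};

const samples = measureWrites(stubWriter, (i) => [i, `row-${i}`], 10000, 1000);
for (const s of samples) {
  console.log(`${s.rows} rows: ${(s.rssBytes / 1048576).toFixed(1)} MiB RSS`);
}
```

Swapping the stub for an actual writer instance should show whether RSS climbs steadily with the row count or levels off after an initial ramp.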

Thanks, David

mvertes commented 6 years ago

Thanks for the report. I'm looking into this issue and checking whether I'm using the parquet-cpp layer correctly, since it also performs some memory caching and allocation on its side.