I'm trying to use the ParquetWriter to write to a parquet file one row at a time. I noticed that the node process's memory usage continues to grow with each call to writer.write(rows). I am processing a very large file, and the memory usage grows beyond my machine's limits. Since I am reading and writing one row at a time, it seems like the memory usage should stay constant. Is there a workaround for this?
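For reference, here is a minimal sketch of the write loop I'm describing (the schema, column types, file name, and row contents are placeholders, not my real data):

```js
var parquet = require('node-parquet');

// Illustrative two-column schema; my real schema has more columns.
var schema = {
  id: {type: 'int64'},
  payload: {type: 'byte_array'},
};

var writer = new parquet.ParquetWriter('/tmp/out.parquet', schema);

// Write one row per call. Even though each batch holds a single row,
// the process RSS keeps climbing on every iteration.
for (var i = 0; i < 10000000; i++) {
  writer.write([[i, 'row-' + i]]);
}

writer.close();
```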
Thanks for the report. I'm investigating this issue and checking whether the parquet-cpp layer is being used correctly, since it also performs some memory caching and allocation on its side.