Open balajiaruna opened 5 years ago
Appending to a Parquet file is more complicated than passing an append flag when opening it, because the file ends with metadata and a footer.
One way to do a pure append is to first read the existing metadata, append the new data manually, and then finalize by writing the updated metadata and footer at the end of the file. The old metadata would essentially be orphaned in the middle of the file.
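As a rough illustration of the first step (finding the existing metadata), the sketch below relies on the Parquet file layout, in which the file ends with the serialized metadata, a 4-byte little-endian metadata length, and the magic bytes PAR1. The function name locateFooter is hypothetical and not part of parquetjs; it only locates the metadata and does not decode or rewrite it.

```javascript
// Hypothetical helper (not a parquetjs API): locate the footer metadata
// in a Parquet file whose full contents are held in a Node.js Buffer.
// Tail layout: [metadata][4-byte little-endian metadata length]["PAR1"]
function locateFooter(buf) {
  // Smallest possible file: leading magic (4) + length (4) + trailing magic (4)
  if (buf.length < 12) {
    throw new Error('too small to be a Parquet file');
  }
  const tail = buf.toString('latin1', buf.length - 4);
  if (tail !== 'PAR1') {
    throw new Error('missing trailing PAR1 magic');
  }
  const metadataLength = buf.readUInt32LE(buf.length - 8);
  const metadataOffset = buf.length - 8 - metadataLength;
  // The metadata cannot start inside the leading 4-byte magic
  if (metadataOffset < 4) {
    throw new Error('corrupt footer length');
  }
  return { metadataOffset, metadataLength };
}
```

A pure append would then write new row groups starting at or after this point and finish with fresh metadata that describes both the old and the new row groups, which is the part that needs library support.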
FWIW: I just grouped all the rows I needed for a particular parquet file into a custom data structure. Once built, I looped through that structure and appended to the parquet file within a single open/close block. Solved the problem of having to worry about appending via the parquetJS api.
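A minimal sketch of that workaround, under some assumptions: the helper names groupRowsByFile and writeGroups are made up for illustration, and the writer is passed in as a factory so that the only things assumed about it are the appendRow and close methods used elsewhere in this thread.

```javascript
// Hypothetical helpers (not part of parquetjs).

// Collect rows into a Map keyed by destination file path.
// `fileFor` is a caller-supplied function that picks the path for a row.
function groupRowsByFile(rows, fileFor) {
  const groups = new Map();
  for (const row of rows) {
    const path = fileFor(row);
    if (!groups.has(path)) {
      groups.set(path, []);
    }
    groups.get(path).push(row);
  }
  return groups;
}

// Write every group inside a single open/close block per file.
// `openWriter(path)` must resolve to an object with async appendRow(row)
// and close() methods.
async function writeGroups(groups, openWriter) {
  for (const [path, rows] of groups) {
    const writer = await openWriter(path);
    for (const row of rows) {
      await writer.appendRow(row);
    }
    // Closing once per file means the footer is written exactly once.
    await writer.close();
  }
}
```

With parquetjs, the factory could be something like (path) => parquet.ParquetWriter.openFile(schema, path), so each file is opened once, receives all of its rows, and is closed once, with no append mode involved.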
I have specified the append mode option, but fruits.parquet contains only the first 2 rows (apples and oranges). What am I missing?
Thanks!
var opts = {flags: 'a'};
var writer = await parquet.ParquetWriter.openFile(schema, 'fruits.parquet', opts);
// append a few rows to the file
await writer.appendRow({name: 'apples', quantity: 10, price: 2.5, date: new Date(), in_stock: true});
await writer.appendRow({name: 'oranges', quantity: 10, price: 2.5, date: new Date(), in_stock: true});
await writer.close();
// reopen the same file in append mode
writer = await parquet.ParquetWriter.openFile(schema, 'fruits.parquet', opts);
// append a few more rows to the file
await writer.appendRow({name: 'banana', quantity: 10, price: 2.5, date: new Date(), in_stock: true});
await writer.appendRow({name: 'peaches', quantity: 10, price: 2.5, date: new Date(), in_stock: true});
await writer.close();