Open malhotrashivam opened 2 months ago
One approach that @rcaudy suggested in the meantime:
If you have a raw source table in groovy, you should be able to:
It may be useful to write a little standalone utility to print out the FileMetaData as JSON; I've found this little script helpful:
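// Serialize the Thrift FileMetaData struct as human-readable JSON and print it.
// 128 is only the initial size of the in-memory buffer.
try (final TMemoryBuffer buffer = new TMemoryBuffer(128)) {
    fileMetaData.write(new TSimpleJSONProtocol(buffer));
    buffer.flush();
    System.out.println(buffer.toString(StandardCharsets.UTF_8));
} catch (TException e) {
    // Report the failure instead of silently swallowing it
    e.printStackTrace();
}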
This will help with remote debugging and with understanding the Parquet file structure. We can follow an API spec similar to DuckDB's: https://duckdb.org/docs/data/parquet/overview
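For anyone poking at the file structure by hand, here is a minimal sketch of how to find where the serialized FileMetaData sits in a file. It relies only on the Parquet layout, where an unencrypted file ends with the footer bytes, a 4-byte little-endian footer length, and the `PAR1` magic. The class name `ParquetFooterLocator` and the method `locateFooter` are made up for illustration and are not part of any existing API; decoding the footer bytes into a FileMetaData object is left out.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any existing library.
public class ParquetFooterLocator {
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /** Returns {footerOffset, footerLength} of the serialized FileMetaData. */
    public static long[] locateFooter(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long fileLength = file.length();
            // Smallest possible file: leading magic + 4-byte length + trailing magic
            if (fileLength < 12) {
                throw new IOException("File too short to be a Parquet file: " + path);
            }

            // The last 8 bytes are the footer length followed by the magic
            byte[] tail = new byte[8];
            file.seek(fileLength - 8);
            file.readFully(tail);

            for (int i = 0; i < MAGIC.length; i++) {
                if (tail[4 + i] != MAGIC[i]) {
                    throw new IOException("Missing PAR1 magic at end of file: " + path);
                }
            }

            int footerLength = ByteBuffer.wrap(tail, 0, 4)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .getInt();
            long footerOffset = fileLength - 8 - footerLength;
            // The footer cannot start before the end of the leading magic
            if (footerLength < 0 || footerOffset < 4) {
                throw new IOException("Corrupt footer length: " + footerLength);
            }
            return new long[] {footerOffset, footerLength};
        }
    }

    public static void main(String[] args) throws IOException {
        long[] footer = locateFooter(args[0]);
        System.out.println("footer offset=" + footer[0] + ", length=" + footer[1]);
    }
}
```

The bytes in that range are what a Thrift compact-protocol reader would decode into the FileMetaData that the JSON snippet above prints.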