cloudfuse-io / buzz-rust

Serverless query engine
MIT License
140 stars 11 forks

Improve way parquet metadata size is handled #9

Open rdettai opened 3 years ago

rdettai commented 3 years ago

The Parquet metadata size is not known from the catalog. Making a dedicated call just to read the footer, which contains the metadata size, would be quite inefficient because it costs an extra request before the metadata itself can be fetched. This is why the first call currently downloads the last 1MB of the file and hopes that the entire metadata falls within this range.
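To make the failure mode concrete, here is a minimal sketch of how the downloaded tail could be checked. It relies only on the Parquet file layout, where a file ends with a 4-byte little-endian metadata length followed by the `PAR1` magic. The names `check_tail` and `TailCheck` are hypothetical and are not taken from buzz-rust; files with an encrypted footer (which use a different magic) are not handled.

```rust
/// A Parquet file ends with a 4-byte little-endian metadata length
/// followed by the 4-byte magic "PAR1".
const FOOTER_LEN: usize = 8;
const MAGIC: &[u8] = b"PAR1";

/// Outcome of inspecting the speculatively downloaded tail (hypothetical type).
#[derive(Debug, PartialEq)]
enum TailCheck {
    /// The metadata is fully inside the tail; the range indexes into the tail buffer.
    Complete(std::ops::Range<usize>),
    /// The tail is too short; this many bytes located just before it are still missing.
    Missing(usize),
}

/// Inspect the last bytes of a Parquet file (hypothetical helper).
fn check_tail(tail: &[u8]) -> Result<TailCheck, String> {
    if tail.len() < FOOTER_LEN {
        return Err("tail is shorter than the 8-byte footer".to_string());
    }
    let footer = &tail[tail.len() - FOOTER_LEN..];
    if &footer[4..] != MAGIC {
        return Err("missing PAR1 magic at end of file".to_string());
    }
    // The first four footer bytes hold the metadata length.
    let meta_len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]) as usize;
    // Bytes of the tail that precede the footer and may hold metadata.
    let available = tail.len() - FOOTER_LEN;
    if meta_len <= available {
        Ok(TailCheck::Complete(available - meta_len..available))
    } else {
        Ok(TailCheck::Missing(meta_len - available))
    }
}
```

With a check like this, the `Missing` case tells the caller exactly how many additional bytes a second ranged request would have to fetch, instead of assuming the 1MB guess always suffices.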

Solutions might be: