apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[C++][Parquet] parquet::arrow FileReader and FileReaderBuilder might use multiple different memory pools #38320

Open mapleFU opened 1 year ago

mapleFU commented 1 year ago

Describe the enhancement requested

class PARQUET_EXPORT FileReader {
 public:
  /// Factory function to create a FileReader from a ParquetFileReader and properties
  static ::arrow::Status Make(::arrow::MemoryPool* pool,
                              std::unique_ptr<ParquetFileReader> reader,
                              const ArrowReaderProperties& properties,
                              std::unique_ptr<FileReader>* out);

  /// Factory function to create a FileReader from a ParquetFileReader
  static ::arrow::Status Make(::arrow::MemoryPool* pool,
                              std::unique_ptr<ParquetFileReader> reader,
                              std::unique_ptr<FileReader>* out);
  // ...
};

Here:

  1. ParquetFileReader allocates its memory from the pool set in ReaderProperties.
  2. FileReader takes its own MemoryPool for building Arrow data.

So the Parquet Arrow reader can end up using two different memory pools. Is this expected?
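
To make the two-pool situation concrete, here is a minimal standalone sketch. It does not use the Arrow library: every type in namespace `sketch` (`CountingPool`, `ReaderProperties`, `ParquetFileReader`, `FileReader`) and the helper `ReadWithTwoPools` are simplified, hypothetical stand-ins that only mirror how the pools are wired, on the assumption that the low-level reader charges its buffers to the properties' pool while the Arrow-level reader charges array buffers to the pool passed to `Make`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

namespace sketch {

// Hypothetical stand-in for arrow::MemoryPool: it only counts requested bytes.
class CountingPool {
 public:
  void Allocate(std::size_t nbytes) { bytes_allocated_ += nbytes; }
  std::size_t bytes_allocated() const { return bytes_allocated_; }

 private:
  std::size_t bytes_allocated_ = 0;
};

// Stand-in for parquet::ReaderProperties: carries the low-level reader's pool.
struct ReaderProperties {
  CountingPool* pool;
};

// Stand-in for parquet::ParquetFileReader: page buffers go to the properties' pool.
class ParquetFileReader {
 public:
  explicit ParquetFileReader(ReaderProperties props) : props_(props) {}
  void ReadPage(std::size_t nbytes) { props_.pool->Allocate(nbytes); }

 private:
  ReaderProperties props_;
};

// Stand-in for parquet::arrow::FileReader: array buffers go to its own pool.
class FileReader {
 public:
  FileReader(CountingPool* pool, std::unique_ptr<ParquetFileReader> reader)
      : pool_(pool), reader_(std::move(reader)) {
    assert(pool_ != nullptr);
  }

  // One read touches both pools: the page buffer and the resulting array.
  void ReadColumn(std::size_t page_bytes, std::size_t array_bytes) {
    reader_->ReadPage(page_bytes);
    pool_->Allocate(array_bytes);
  }

 private:
  CountingPool* pool_;
  std::unique_ptr<ParquetFileReader> reader_;
};

// Wires up two distinct pools and returns {parquet pool bytes, arrow pool bytes}.
inline std::pair<std::size_t, std::size_t> ReadWithTwoPools(
    std::size_t page_bytes, std::size_t array_bytes) {
  CountingPool parquet_pool;
  CountingPool arrow_pool;
  auto reader =
      std::make_unique<ParquetFileReader>(ReaderProperties{&parquet_pool});
  FileReader file_reader(&arrow_pool, std::move(reader));
  file_reader.ReadColumn(page_bytes, array_bytes);
  return {parquet_pool.bytes_allocated(), arrow_pool.bytes_allocated()};
}

}  // namespace sketch
```

In this model a single column read is accounted for in two places, so neither pool alone reflects the reader's total memory use. Passing the same pool to both the properties and the Arrow-level reader would keep the accounting in one place, which is presumably what most callers want.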

Component(s)

C++, Parquet

mapleFU commented 1 year ago

@pitrou @jorisvandenbossche