apache / parquet-java


Make ParquetFileReader extensible #3006

Open kenwenzel opened 2 months ago

kenwenzel commented 2 months ago

Describe the enhancement requested

I wanted to add a cache for ParquetFileReader.getColumnIndexStore(int blockIndex) and noticed that I can't access the required blocks field from a subclass. To accomplish this, I had to fall back on reflection, as can be seen here: https://github.com/linkedfactory/linkedfactory-pod/blob/c67b155e3fbf76082e2b24fd412efae5838badc6/bundles/io.github.linkedfactory.core/src/main/java/io/github/linkedfactory/core/kvin/parquet/KvinParquet.java#L881
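For readers who do not want to follow the link, here is a minimal sketch of what such a reflection workaround looks like in general. It does not reproduce the linked code and does not use the Parquet library: `FileReader`, `CachingFileReader`, the `String` block type, and the `storeLookups` counter are hypothetical stand-ins, and keying the cache by block is an assumption made only for illustration.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReflectionCacheSketch {

    // Hypothetical stand-in for ParquetFileReader: "blocks" is private and
    // no accessor is offered to subclasses.
    static class FileReader {
        private final List<String> blocks;
        int storeLookups = 0; // counts how often the expensive lookup runs

        FileReader(List<String> blocks) {
            this.blocks = blocks;
        }

        // Stand-in for getColumnIndexStore(int blockIndex).
        String getColumnIndexStore(int blockIndex) {
            storeLookups++;
            return "store-for-" + blocks.get(blockIndex);
        }
    }

    // Subclass that caches results and reads the private field reflectively.
    static class CachingFileReader extends FileReader {
        private final Map<String, String> cache = new HashMap<>();
        private final List<String> blocksViaReflection;

        @SuppressWarnings("unchecked")
        CachingFileReader(List<String> blocks) {
            super(blocks);
            try {
                // Look the field up by name; this breaks silently at compile
                // time and loudly at run time if the field is ever renamed.
                Field field = FileReader.class.getDeclaredField("blocks");
                field.setAccessible(true);
                blocksViaReflection = (List<String>) field.get(this);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot access 'blocks'", e);
            }
        }

        @Override
        String getColumnIndexStore(int blockIndex) {
            String key = blocksViaReflection.get(blockIndex);
            String store = cache.get(key);
            if (store == null) {
                store = super.getColumnIndexStore(blockIndex);
                cache.put(key, store);
            }
            return store;
        }
    }

    public static void main(String[] args) {
        CachingFileReader reader = new CachingFileReader(List.of("rg0", "rg1"));
        reader.getColumnIndexStore(1);
        reader.getColumnIndexStore(1); // served from the cache
        System.out.println("underlying lookups: " + reader.storeLookups);
    }
}
```

The fragile part is the string `"blocks"`: nothing checks it at compile time, which is why a supported access path would be preferable.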

I propose improving the extensibility of ParquetFileReader by changing certain core members (like blocks) to protected or by exposing them through protected getters.
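To make the proposal concrete, here is a rough sketch of how the same caching subclass could look with a protected getter. Again, this is not the Parquet API: `FileReader`, `CachingFileReader`, and in particular `getBlocks()` are hypothetical names chosen for illustration, and the `String` block type stands in for the real block metadata.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProtectedGetterSketch {

    // Hypothetical stand-in for ParquetFileReader with the proposed change.
    static class FileReader {
        private final List<String> blocks;
        int storeLookups = 0; // counts how often the expensive lookup runs

        FileReader(List<String> blocks) {
            this.blocks = blocks;
        }

        // Hypothetical accessor of the kind proposed above: the field stays
        // private, but subclasses get a compile-time-checked way to read it.
        protected List<String> getBlocks() {
            return blocks;
        }

        // Stand-in for getColumnIndexStore(int blockIndex).
        String getColumnIndexStore(int blockIndex) {
            storeLookups++;
            return "store-for-" + blocks.get(blockIndex);
        }
    }

    // The caching subclass no longer needs reflection.
    static class CachingFileReader extends FileReader {
        private final Map<String, String> cache = new HashMap<>();

        CachingFileReader(List<String> blocks) {
            super(blocks);
        }

        @Override
        String getColumnIndexStore(int blockIndex) {
            String key = getBlocks().get(blockIndex);
            String store = cache.get(key);
            if (store == null) {
                store = super.getColumnIndexStore(blockIndex);
                cache.put(key, store);
            }
            return store;
        }
    }

    public static void main(String[] args) {
        CachingFileReader reader = new CachingFileReader(List.of("rg0", "rg1"));
        reader.getColumnIndexStore(0);
        reader.getColumnIndexStore(0); // served from the cache
        System.out.println("underlying lookups: " + reader.storeLookups);
    }
}
```

A getter rather than a protected field keeps the option of changing the internal representation later, at the cost of one more method that has to stay stable.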


wgtmac commented 2 months ago

Whether private members become protected or public makes little difference, since we enforce backward compatibility for all public APIs. BTW, would it be possible to introduce an approach similar to https://github.com/apache/parquet-java/pull/1174?