deephaven / deephaven-core

Deephaven Community Core

Try removing hadoop-common dependencies from parquet #5517

Open malhotrashivam opened 4 months ago

malhotrashivam commented 4 months ago

The latest v2.0 release of parquet-mr (issue PARQUET-1822) adds a number of wrapper classes that should let users use parquet-hadoop without depending on hadoop-common. We should adopt these wrappers to remove the dependency from our code. Note that parquet-hadoop might still use hadoop-common internally.

Found during #5469

malhotrashivam commented 4 months ago

Some notes: Important PR: https://github.com/apache/parquet-mr/pull/1141/files#diff-b044ae9879a94e2b8a49d6e6911ea5498ef162df1373cc049ded6256980a7248

One interesting class they have now added is org.apache.parquet.conf.PlainParquetConfiguration, which replaces org.apache.hadoop.conf.Configuration.
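To make the idea concrete, here is a rough, self-contained sketch of the pattern: callers depend on a small configuration interface, and a plain map-backed class implements it so no Hadoop type is needed. The names `ConfigSketch`, `KeyValueConfig`, `MapBackedConfig`, and `dictionaryEnabled` are hypothetical and are not parquet's actual API; the property key is only used as an illustrative string.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigSketch {

    /** Hypothetical minimal configuration contract (not parquet's real interface). */
    interface KeyValueConfig {
        void set(String key, String value);

        String get(String key);

        /** Returns the parsed boolean for key, or defaultValue when the key is unset. */
        default boolean getBoolean(String key, boolean defaultValue) {
            String raw = get(key);
            return raw == null ? defaultValue : Boolean.parseBoolean(raw);
        }
    }

    /** Plain implementation backed by a HashMap, with no Hadoop classes involved. */
    static final class MapBackedConfig implements KeyValueConfig {
        private final Map<String, String> values = new HashMap<>();

        @Override
        public void set(String key, String value) {
            values.put(key, value);
        }

        @Override
        public String get(String key) {
            return values.get(key);
        }
    }

    /** Caller code sees only the interface, so the backing implementation can be swapped. */
    static boolean dictionaryEnabled(KeyValueConfig conf) {
        // Illustrative key; defaults to true when nothing has been set.
        return conf.getBoolean("parquet.enable.dictionary", true);
    }

    public static void main(String[] args) {
        KeyValueConfig conf = new MapBackedConfig();
        conf.set("parquet.enable.dictionary", "false");
        System.out.println(dictionaryEnabled(conf));
    }
}
```

If our code only takes the interface type, switching from a Hadoop-backed configuration to a plain one should be limited to the places where the configuration object is constructed.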

Some other interesting classes: