apache / orc

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
https://orc.apache.org/
Apache License 2.0

ORC-1577: Use `ZSTD` as the default compression #1733

Closed dongjoon-hyun closed 10 months ago

dongjoon-hyun commented 10 months ago

What changes were proposed in this pull request?

This PR aims to use ZSTD as the default compression from Apache ORC 2.0.0.
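For readers who want to see what this default affects, here is a minimal sketch. It assumes a writer that picks up ORC options from a properties-style configuration; where the setting actually goes depends on the engine. The codec is selected by the `orc.compress` key, and setting it explicitly means the output does not depend on the library default:

```properties
# Pin the codec explicitly instead of relying on the library default.
# Assumption: the surrounding engine forwards this key to the ORC writer.
# ZSTD is the default proposed here for 2.0.0; use ZLIB to keep the
# Gzip-family codec that the description compares against.
orc.compress=ZSTD
```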

Why are the changes needed?

Apache ORC has supported ZStandard since 1.6.0.

ZStandard is known to be better than Gzip in terms of both compressed size and speed.

How was this patch tested?

Pass the CIs.

dongjoon-hyun commented 10 months ago

cc @williamhyun , @wgtmac , @guiyanakuang , @deshanxiao

dongjoon-hyun commented 10 months ago

BTW, did you choose your Apache ID, @deshanxiao ? 😄

deshanxiao commented 10 months ago

> BTW, did you choose your Apache ID, @deshanxiao ? 😄

Yes, and I have also replied to that email. My Apache ID is deshanxiao. Thank you, @dongjoon-hyun

dongjoon-hyun commented 10 months ago

Got it!

dongjoon-hyun commented 10 months ago

It seems that the ID is not created yet.

Please ping me when your ID is created, @deshanxiao . I can help you with the community-side setup.

deshanxiao commented 10 months ago

Sure, thank you @dongjoon-hyun

dongjoon-hyun commented 10 months ago

To @deshanxiao , note that you need to include Craig L Russell, who requested the ID creation. Could you double-check that your last reply email includes him (or the secretary)?

deshanxiao commented 10 months ago

Thanks for the reminder. I checked and he was not included, so I sent another email. @dongjoon-hyun