lquerel / gcp-bigquery-client

GCP BigQuery Client (Rust)
Apache License 2.0

Support for GZIP compression #74

Open Deniskore opened 5 months ago

Deniskore commented 5 months ago

Hey @lquerel! I've noticed that, for some reason, GZIP is not enabled for outgoing request bodies. Based on my data, enabling GZIP compression for the request body results in faster transfers. I'd like to contribute a small pull request to implement it.

I see two ways to implement this:

1) Adding a custom feature, GZIP.
2) Adding an enum parameter to the function TableDataApi::insert_all that indicates the compression algorithm.
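
To make the two options concrete, here is a minimal sketch of how they might look side by side. Everything in it is hypothetical: the Compression enum, prepare_body, default_compression and the "gzip" feature name are illustrative and not part of the crate's current API. The compressor is passed in as a closure so the sketch has no dependencies; in practice it would probably wrap a crate such as flate2.

```rust
/// Hypothetical compression selector (option 2); not part of the crate today.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Compression {
    #[default]
    None,
    Gzip,
}

impl Compression {
    /// Value for the HTTP `Content-Encoding` request header, if any.
    pub fn content_encoding(self) -> Option<&'static str> {
        match self {
            Compression::None => None,
            Compression::Gzip => Some("gzip"),
        }
    }
}

/// Prepares the request body and the matching `Content-Encoding` value.
/// `gzip` stands in for a real compressor; it is only called when
/// `compression` is `Compression::Gzip`.
pub fn prepare_body<F>(
    json: Vec<u8>,
    compression: Compression,
    gzip: F,
) -> (Vec<u8>, Option<&'static str>)
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    let body = match compression {
        Compression::None => json,
        Compression::Gzip => gzip(&json),
    };
    (body, compression.content_encoding())
}

/// Option 1: the default is picked at compile time by a Cargo feature
/// (the feature name "gzip" is an assumption).
pub fn default_compression() -> Compression {
    if cfg!(feature = "gzip") {
        Compression::Gzip
    } else {
        Compression::None
    }
}
```

With a feature, callers change nothing but the choice is fixed at build time; with a parameter, each insert_all call could pick its own algorithm, at the cost of a signature change.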

The same data is sent three times with a max batch size of 50_000.

Without GZIP:

Inserting 52511 rows to ***, geographic location europe-west2
Inserted 52511 rows to *** in 13.08 seconds

Inserting 52511 rows to ***, geographic location europe-west2
Inserted 52511 rows to *** in 12.93 seconds

Inserting 52511 rows to ***, geographic location europe-west2
Inserted 52511 rows to *** in 12.56 seconds

With GZIP:

Inserting 52511 rows to ***, geographic location europe-west2
Inserted 52511 rows to *** in 7.49 seconds

Inserting 52511 rows to ***, geographic location europe-west2
Inserted 52511 rows to *** in 7.57 seconds

Inserting 52511 rows to ***, geographic location europe-west2
Inserted 52511 rows to *** in 7.26 seconds
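
As a rough summary of the timings above, the sketch below averages the three runs of each variant and derives the speedup. The helper names mean and summarize are made up for this illustration; the numbers are copied from the logs.

```rust
/// Arithmetic mean of a slice of timings, in seconds.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Returns (mean without GZIP, mean with GZIP, speedup factor)
/// for the six timings reported in the logs above.
fn summarize() -> (f64, f64, f64) {
    let without_gzip = [13.08, 12.93, 12.56];
    let with_gzip = [7.49, 7.57, 7.26];
    let a = mean(&without_gzip);
    let b = mean(&with_gzip);
    (a, b, a / b)
}
```

That works out to about 12.86 seconds without GZIP versus 7.44 seconds with it, roughly 1.73 times faster (about 42% less wall time). With only three runs per variant from one location, this is an indication rather than a rigorous benchmark.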

fungiboletus commented 1 month ago

Here is the diff if, like me, you are looking for it: https://github.com/lquerel/gcp-bigquery-client/compare/main...Deniskore:gcp-bigquery-client:feature_gzip

Deniskore commented 1 month ago

@fungiboletus Since I haven't heard back from the author, I decided not to proceed with a PR for now. However, if the author does reach out in the future, I'll be more than happy to make the necessary changes.

lquerel commented 1 month ago

Sorry for the delay, @Deniskore. Could you create a PR with what you propose? I will take care of publishing an update on crates.io. Thank you very much.

Deniskore commented 2 weeks ago

Hey @lquerel, @fungiboletus, I've created PRs for two different implementations. Let's decide which variant is best for us.