confluentinc / kafka-connect-bigquery

A Kafka Connect BigQuery sink connector
Apache License 2.0

BigQuery connection pooling/reusing is not supported #124

Open kshlomi opened 3 years ago

kshlomi commented 3 years ago

We have the following setup in production:

Kafka -> kafka-connect -> proxy -> BigQuery
                       ^
                       | monitoring connections here

We have observed that a very large number of TCP connections are being opened at the proxy (millions within a few hours).

From reading the code, it seems that BigQueryHelper uses the default HttpTransportOptions, so the default NetHttpTransport is built and used. Internally, NetHttpTransport uses DefaultConnectionFactory, which opens a new connection for each URL it is given.
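
To make the behavior described above concrete, here is a minimal stdlib-only sketch. `PerRequestConnectionSketch` and `open` are hypothetical names that imitate a factory built on `URL.openConnection()`; this is an illustration of the pattern, not the library's or the connector's actual code. No network traffic is generated, because `openConnection()` only creates the connection object.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class PerRequestConnectionSketch {

    // Hypothetical stand-in for a factory that calls URL.openConnection() per request.
    static HttpURLConnection open(URL url) throws IOException {
        // Creates a new connection object; no socket is opened until connect() or I/O.
        return (HttpURLConnection) url.openConnection();
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create("https://bigquery.googleapis.com/").toURL();
        HttpURLConnection first = open(url);
        HttpURLConnection second = open(url);
        // Each call yields a distinct connection object; there is no
        // application-level pool, so any socket reuse is left to the JVM.
        System.out.println("distinct objects: " + (first != second));
    }
}
```

With this pattern, whether an underlying TCP socket is reused between requests depends on the JVM's own keep-alive handling rather than on a pool that the application can size or tune.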

We would like to be able to configure connection pooling/reuse without making code changes. An example of how to do this can be found here: https://github.com/googleapis/google-cloud-java/issues/6444
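
As a rough illustration of what "configurable without code changes" could look like at the JVM level, the sketch below reads the standard JDK networking properties `http.keepAlive` (default `true`) and `http.maxConnections` (default `5` idle connections kept per destination), which govern keep-alive for `HttpURLConnection`. The class `KeepAliveSettings` and its method names are hypothetical. Whether tuning these properties actually reduces the connection count seen at the proxy for this connector is an untested assumption.

```java
public class KeepAliveSettings {

    // JDK default when http.keepAlive is unset: keep-alive is enabled.
    static boolean keepAliveEnabled() {
        return Boolean.parseBoolean(System.getProperty("http.keepAlive", "true"));
    }

    // JDK default when http.maxConnections is unset: 5 idle connections per destination.
    static int maxIdleConnectionsPerDestination() {
        return Integer.getInteger("http.maxConnections", 5);
    }

    public static void main(String[] args) {
        // Prints the settings the JVM would use for HttpURLConnection keep-alive.
        System.out.println("http.keepAlive=" + keepAliveEnabled());
        System.out.println("http.maxConnections=" + maxIdleConnectionsPerDestination());
    }
}
```

These are ordinary system properties, so they could be passed to the Connect worker as `-D` flags (for example through the `KAFKA_OPTS` environment variable) to experiment without touching the connector. This is not a substitute for a real, configurable connection pool in the connector itself.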

kshlomi commented 3 years ago

The number of new connections opened seems to have dropped after we upgraded from version 1.6.4 to 2.1.4 (not sure why), but it still seems like it could be improved by introducing connection pools.