We have observed that many TCP connections are being opened (millions in a few hours).
From reading the code, it seems that BigQueryHelper uses the default HttpTransportOptions, so the default NetHttpTransport is built and used. Internally, DefaultConnectionFactory is used, which opens a new connection for each URL it is given.
The number of new connections seems to have dropped after we upgraded from version 1.6.4 to 2.1.4 (we are not sure why), but it looks like this could still be improved by introducing connection pooling.
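To illustrate the behavior described above: as far as we can tell, the default connection factory does little more than call `URL.openConnection()` once per request. The sketch below mirrors that with JDK classes only. The class name `PerRequestConnectionSketch` and the URL are illustrative assumptions, not the library's actual code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

// Hypothetical stand-in for a factory that opens one connection object per URL.
final class PerRequestConnectionSketch {

  // Every call creates a fresh HttpURLConnection; nothing is pooled at this layer.
  static HttpURLConnection open(URL url) throws IOException {
    return (HttpURLConnection) url.openConnection();
  }

  public static void main(String[] args) throws IOException {
    // openConnection() only creates the object; no network traffic happens here.
    URL url = URI.create("https://bigquery.googleapis.com/").toURL();
    HttpURLConnection first = open(url);
    HttpURLConnection second = open(url);
    System.out.println("distinct connection objects: " + (first != second));
  }
}
```

Whether each object ends up on its own TCP socket also depends on the JDK's keep-alive handling, so this only shows that no explicit pool exists at this level.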
We are seeing this in our production setup and would like to be able to configure connection pooling or connection reuse without having to make code changes. An example of how to do this can be found here: https://github.com/googleapis/google-cloud-java/issues/6444
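For reference, here is a minimal sketch of what a pooled transport could look like on the client side. It is not necessarily what the linked issue shows. It assumes `google-cloud-bigquery` and `google-http-client-apache-v2` are on the classpath. The class names `PooledBigQueryClientSketch` and `PooledTransportFactory`, and the pool sizes, are made up for illustration.

```java
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.apache.v2.ApacheHttpTransport;
import com.google.auth.http.HttpTransportFactory;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.http.HttpTransportOptions;

// Hypothetical helper: builds a BigQuery client on top of a pooled Apache transport.
final class PooledBigQueryClientSketch {

  // Named factory class that always hands out the same pooled transport.
  static final class PooledTransportFactory implements HttpTransportFactory {
    // Pool limits are assumed values; tune them for the actual workload.
    private static final HttpTransport SHARED_TRANSPORT =
        new ApacheHttpTransport(
            ApacheHttpTransport.newDefaultHttpClientBuilder()
                .setMaxConnTotal(50)
                .setMaxConnPerRoute(20)
                .build());

    @Override
    public HttpTransport create() {
      return SHARED_TRANSPORT;
    }
  }

  static BigQuery create(String projectId) {
    // Replace the default transport with the pooled one.
    HttpTransportOptions transportOptions =
        HttpTransportOptions.newBuilder()
            .setHttpTransportFactory(new PooledTransportFactory())
            .build();

    return BigQueryOptions.newBuilder()
        .setProjectId(projectId)
        .setTransportOptions(transportOptions)
        .build()
        .getService();
  }
}
```

What we are asking for is a way to get the same effect through configuration (for example, a property selecting the transport and pool limits) instead of patching the code that builds the client.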