GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0

Consider retry in BigQuery connect code #1313

Open nj1973 opened 2 weeks ago

nj1973 commented 2 weeks ago

When running integration tests we occasionally have failures due to transient errors connecting to BigQuery:

Example:

"Step #13 - "integration_db2": data_validation/clients.py:282: "
"Step #13 - "integration_db2": data_validation/clients.py:105: in get_bigquery_client"
"Step #13 - "integration_db2":     ibis_client = ibis.bigquery.connect("
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/ibis/__init__.py:97: in connect"
"Step #13 - "integration_db2":     return backend.connect(*args, **kwargs)"
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/ibis/backends/base/__init__.py:530: in connect"
"Step #13 - "integration_db2":     new_backend.reconnect()"
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/ibis/backends/base/__init__.py:545: in reconnect"
"Step #13 - "integration_db2":     self.do_connect(*self._con_args, **self._con_kwargs)"
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/ibis/backends/bigquery/__init__.py:163: in do_connect"
"Step #13 - "integration_db2":     credentials, default_project_id = pydata_google_auth.default("
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/pydata_google_auth/auth.py:152: in default"
"Step #13 - "integration_db2":     credentials = get_user_credentials("
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/pydata_google_auth/auth.py:362: in get_user_credentials"
"Step #13 - "integration_db2":     credentials = _webserver.run_local_server(app_flow, **AUTH_URI_KWARGS)"
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/pydata_google_auth/_webserver.py:89: in run_local_server"
"Step #13 - "integration_db2":     return app_flow.run_local_server(host=LOCALHOST, port=port, **kwargs)"
"Step #13 - "integration_db2": .nox/integration_db2/lib/python3.9/site-packages/google_auth_oauthlib/flow.py:447: in run_local_server"
"Step #13 - "integration_db2":     webbrowser.get(browser).open(auth_url, new=1, autoraise=True)"
...
"Step #13 - "integration_db2": >       raise Error("could not locate runnable browser")"
"Step #13 - "integration_db2": E       webbrowser.Error: could not locate runnable browser"
"Step #13 - "integration_db2": /usr/local/lib/python3.9/webbrowser.py:65: Error"

While transient errors during tests are only a small irritation, we can probably assume that heavy users of DVT on BigQuery could also run into these errors.

We could consider adding a single retry in get_bigquery_client() in data_validation/clients.py.

We could investigate using the retry module from google.api_core (from google.api_core import retry), for example as a decorator, to automatically retry get_bigquery_client() a single time if it raises webbrowser.Error.

I've not fully researched all retry attributes but I believe this would do the trick in a clean way.