z3z1ma / dbt-osmosis

Provides automated YAML management, a dbt server, streamlit workbench, and git-integrated dbt model output diff tools
https://z3z1ma.github.io/dbt-osmosis/
Apache License 2.0
422 stars, 45 forks

Failed to establish a new connection after a while with BigQuery #110

Closed: yu-iskw closed this issue 1 month ago

yu-iskw commented 9 months ago

When I apply dbt-osmosis yaml refactor to a lot of dbt models, I start getting the following error repeatedly after a while. I assume the issue is caused by an authentication timeout, though I haven't looked into the implementation. How can I solve the issue?

ERROR    Error occurred while processing model model.xxx.xxxxxxxxxxxxxxxxxxxx (osmosis.py:931):
         Deadline of 600.0s exceeded while calling target function, last exception:
         HTTPSConnectionPool(host='bigquery.googleapis.com', port=443): Max retries exceeded with url:
         /bigquery/v2/projects/xxxxxx/datasets/if_xxxx/tables/xxxxxxxxxxxxxxxxxxxxxx?prettyPrint=false
         (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x14198e950>:
         Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

syou6162 commented 9 months ago

I encountered a similar error for a very large dbt project (1000+ models). When I want to run dbt-osmosis yaml refactor in these cases, I do the following.

yu-iskw commented 9 months ago

Thank you for the comment. I already tried that approach. It would be good to know how to resolve this on the dbt-osmosis side rather than relying on a workaround.

yu-iskw commented 9 months ago

I'm looking into https://github.com/googleapis/google-auth-library-python/issues/1356, as it looks similar to this issue. I suspect there may be a conflict between the multithreading in dbt-osmosis and the Google Cloud Python packages.

z3z1ma commented 1 month ago

The upstream issue has been closed. I also made a small change in https://github.com/z3z1ma/dbt-osmosis/commit/a1c21093c06713fb5f5486a3e3257c7ac5153857. I am not 100% sure it solves this, but I think it addresses a possible contributing factor. We have an adapter connection invalidation/refresh process because DbtCoreInterface (our thin interface layer that keeps us one layer abstracted from dbt core) was designed to be used in a long-running service such as a proxy server or custom LSP. dbt-osmosis, however, is just a typical short-lived process that saturates the connection pool and then spins down.
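
To illustrate the distinction between a long-running service and a short-lived CLI run, here is a minimal sketch of a time-based connection refresh. It is not the dbt-osmosis or DbtCoreInterface implementation: the class name `RefreshingConnection`, the `ttl` parameter, and the `opens` counter are all hypothetical and exist only for this example. The idea is that a service would set a TTL so that stale connections get reopened, while a batch process could pass `ttl=None` and keep a single connection for its whole lifetime.

```python
import threading
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshingConnection(Generic[T]):
    """Hypothetical wrapper (not dbt-osmosis code) that reopens a
    connection once it is older than `ttl` seconds."""

    def __init__(
        self,
        factory: Callable[[], T],
        ttl: Optional[float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory      # opens a fresh connection
        self._ttl = ttl              # None disables refresh entirely
        self._clock = clock          # injectable so the logic is testable
        self._lock = threading.Lock()
        self._conn: Optional[T] = None
        self._opened_at = 0.0
        self.opens = 0               # how many times the factory was called

    def get(self) -> T:
        # The lock keeps concurrent worker threads from reopening the
        # connection at the same time.
        with self._lock:
            now = self._clock()
            expired = (
                self._ttl is not None and now - self._opened_at >= self._ttl
            )
            if self.opens == 0 or expired:
                self._conn = self._factory()
                self._opened_at = now
                self.opens += 1
            return self._conn  # type: ignore[return-value]
```

With a TTL, a caller that waits longer than the TTL between calls gets a new connection; with `ttl=None`, the first connection is reused until the process exits, which matches the "saturate the pool, then spin down" usage described above.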