datacommonsorg / website

Code for the Data Commons website
https://datacommons.org
Apache License 2.0

custom_dc: Lost connection to MySQL server during query #4031

Open bkosuru opened 8 months ago

bkosuru commented 8 months ago

I created a MySQL instance in GCP as described in https://github.com/datacommonsorg/website/tree/master/custom_dc#setup-google-cloud-sql. I also set the following flags:

net_read_timeout 300
net_write_timeout 300
wait_timeout 120
connect_timeout 300
interactive_timeout 300
max_connections 1000
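One way to confirm that these flags actually took effect on the instance is to read them back with SHOW GLOBAL VARIABLES. The following is a minimal sketch, assuming a DB-API cursor that returns rows as tuples (for example, the default pymysql cursor). The helper name `find_flag_mismatches` is hypothetical and is not part of the Data Commons code.

```python
# Flag values as listed in this report.
EXPECTED_FLAGS = {
    "net_read_timeout": "300",
    "net_write_timeout": "300",
    "wait_timeout": "120",
    "connect_timeout": "300",
    "interactive_timeout": "300",
    "max_connections": "1000",
}


def find_flag_mismatches(cursor, expected=None):
    """Return {flag: (expected, actual)} for every flag whose server value differs.

    `cursor` is assumed to be a DB-API cursor using the %s parameter style
    (as pymysql does). A flag the server does not report maps to actual=None.
    """
    expected = EXPECTED_FLAGS if expected is None else expected
    placeholders = ", ".join(["%s"] * len(expected))
    cursor.execute(
        f"SHOW GLOBAL VARIABLES WHERE Variable_name IN ({placeholders})",
        tuple(expected),
    )
    # Each row is (Variable_name, Value); values come back as strings.
    actual = {name: str(value) for name, value in cursor.fetchall()}
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

An empty result means the server reports every flag with the expected value.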

Data Commons is able to connect, but when I try to load the sample data I get the error 'Lost connection to MySQL server during query':

I0313 12:24:40.040244 140711787871104 filehandler.py:118] Using GCS project: myproject
Using GCS project: myproject
I0313 12:24:40.870187 140711787871104 runner.py:108] Using Cloud SQL settings from env.
Using Cloud SQL settings from env.
I0313 12:24:40.873133 140711787871104 db.py:317] Connecting to Cloud MySQL: myproject:us-central1:dc-graph (datacommons)
Connecting to Cloud MySQL: myproject:us-central1:dc-graph (datacommons)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/workspace/import/simple/stats/main.py", line 82, in <module>
    app.run(main)
  File "/usr/local/lib/python3.11/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/workspace/import/simple/stats/main.py", line 78, in main
    _run()
  File "/workspace/import/simple/stats/main.py", line 64, in _run
    Runner(config_file=FLAGS.config_file,
  File "/workspace/import/simple/stats/runner.py", line 118, in __init__
    self.db = create_db(_get_db_config())
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/import/simple/stats/db.py", line 373, in create_db
    return SqlDb(config)
           ^^^^^^^^^^^^^
  File "/workspace/import/simple/stats/db.py", line 195, in __init__
    self.engine = create_db_engine(config)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/import/simple/stats/db.py", line 362, in create_db_engine
    return CloudSqlDbEngine(db_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/import/simple/stats/db.py", line 319, in __init__
    self.connection: Connection = connector.connect(
                                  ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/cloud/sql/connector/connector.py", line 163, in connect
    return connect_task.result()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/google/cloud/sql/connector/connector.py", line 279, in connect_async
    return await self._loop.run_in_executor(None, connect_partial)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/cloud/sql/connector/pymysql.py", line 62, in connect
    conn.connect(sock)
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 692, in connect
    self.autocommit(self.autocommit_mode)
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 442, in autocommit
    self._send_autocommit_mode()
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 463, in _send_autocommit_mode
    self._read_ok_packet()
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 448, in _read_ok_packet
    pkt = self._read_packet()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 739, in _read_packet
    packet_header = self._read_bytes(4)
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 795, in _read_bytes
    raise err.OperationalError(
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')

keyurva commented 7 months ago

Hi @bkosuru, can you tell us a bit more about your import, and how long after you initiate the load process you see this error?

Note that when loading data, we open a single connection to insert data into the database. With that in mind, a couple of things to try:

bkosuru commented 7 months ago

Hi @keyurva, I am only trying to load the sample data provided for custom_dc. It is a very small dataset. The error happens right away, within a second. Do I need to set any other flags or change the current settings? Thanks!

keyurva commented 7 months ago

Actually, looking at the logs closely, it seems like it was never able to connect to the database. Can you ensure your connection parameters are correct?
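A standalone check outside the importer can help separate bad parameters from a server-side problem. The sketch below is a minimal example, assuming the Cloud SQL Python Connector with the pymysql driver (the same libraries that appear in the traceback). The function names `is_valid_instance_name` and `check_connection` are hypothetical, and the credentials are whatever your setup uses.

```python
def is_valid_instance_name(instance):
    """Loosely check the Cloud SQL form project:region:instance.

    At least three non-empty, colon-separated parts are required; more are
    allowed because domain-scoped project IDs contain a colon themselves.
    """
    parts = instance.split(":")
    return len(parts) >= 3 and all(parts)


def check_connection(instance, user, password, db):
    """Open one connection, run SELECT 1, and return True if it succeeds."""
    if not is_valid_instance_name(instance):
        raise ValueError(f"expected project:region:instance, got {instance!r}")

    # Imported lazily so the validation above works without the package.
    # Assumes: pip install "cloud-sql-python-connector[pymysql]"
    from google.cloud.sql.connector import Connector

    connector = Connector()
    try:
        conn = connector.connect(
            instance, "pymysql", user=user, password=password, db=db
        )
        try:
            with conn.cursor() as cursor:
                cursor.execute("SELECT 1")
                return cursor.fetchone()[0] == 1
        finally:
            conn.close()
    finally:
        connector.close()
```

For the instance in the logs, the call would look like `check_connection("myproject:us-central1:dc-graph", user, password, "datacommons")`. If this fails with the same 2013 error, the problem is likely in the connection path and not in the data load.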

bkosuru commented 7 months ago

Sometimes it connects, and it has created two tables: triples and observations.

I0311 16:47:51.725041 139931865762688 db.py:317] Connecting to Cloud MySQL: myproject:us-central1:dc-graph (datacommons)
Connecting to Cloud MySQL: myproject:us-central1:dc-graph (datacommons)
I0311 16:47:52.873222 139931865762688 db.py:321] Connected to Cloud MySQL: myproject:us-central1:dc-graph (datacommons)
Connected to Cloud MySQL: myproject:us-central1:dc-graph (datacommons)

keyurva commented 7 months ago

It seems to be a DB connectivity issue. Based on a quick search for the error ("Lost connection to MySQL server during query"), you may need to tweak certain timeouts.
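Since the connection succeeds only some of the time, one possible stopgap while the timeouts are investigated is to retry the connection attempt with exponential backoff. This is a minimal sketch and not something the Data Commons importer is known to do. The helper `connect_with_retry` is hypothetical, and `connect` stands for any zero-argument callable that opens a connection.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call connect() until it succeeds or `attempts` tries are used up.

    The wait doubles after each failure: base_delay, 2*base_delay, and so on.
    The last failure is re-raised unchanged. `sleep` is injectable for tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

In practice `retry_on` would be narrowed to the driver's error type, for example `(pymysql.err.OperationalError,)`, so that programming errors are not retried.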