related-sciences / ensembl-genes

Extract the Ensembl genes catalog to simple tables

Retry query when MySQL connection is lost #16

Open dhimmel opened 2 years ago

dhimmel commented 2 years ago

Got the following error in this extraction:

Traceback (most recent call last):
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/mysql/connector/connection_cext.py", line 523, in cmd_query
    self._cmysql.query(query,
_mysql_connector.MySQLInterfaceError: Lost connection to MySQL server during query

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/mysql/connector/cursor_cext.py", line 269, in execute
    result = self._cnx.cmd_query(stmt, raw=self._raw,
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/mysql/connector/connection_cext.py", line 528, in cmd_query
    raise errors.get_mysql_exception(exc.errno, msg=exc.msg,
mysql.connector.errors.OperationalError: 2013 (HY000): Lost connection to MySQL server during query

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 611, in command
    fire.Fire(commands)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 583, in export_all
    cls.export_datasets(species=species, release=release)
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 571, in export_datasets
    ensgc.export_datasets()
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 507, in export_datasets
    self.write_dataset(export)
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 510, in write_dataset
    df = getattr(self, export.query_fxn)
  File "/opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/functools.py", line 993, in __get__
    val = self.func(instance)
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 388, in xref_go_df
    xref_go_df = self.run_query("gene_xrefs_go").merge(
  File "/home/runner/work/ensembl-genes/ensembl-genes/ensembl_genes/ensembl_genes.py", line 69, in run_query
    df = pd.read_sql_query(sql=query, con=self.connection_url)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 399, in read_sql_query
    return pandas_sql.read_query(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 1554, in read_query
    result = self.execute(*args)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 1399, in execute
    return self.connectable.execution_options().execute(*args, **kwargs)
  File "<string>", line 2, in execute
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 401, in warned
    return fn(*args, **kwargs)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 3146, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1274, in execute
    return self._exec_driver_sql(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1578, in _exec_driver_sql
    ret = self._execute_context(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1845, in _execute_context
    self._handle_dbapi_exception(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2026, in _handle_dbapi_exception
    util.raise_(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/mysql/connector/cursor_cext.py", line 269, in execute
    result = self._cnx.cmd_query(stmt, raw=self._raw,
  File "/home/runner/.cache/pypoetry/virtualenvs/ensembl-genes-GU6ps7Hy-py3.9/lib/python3.9/site-packages/mysql/connector/connection_cext.py", line 528, in cmd_query
    raise errors.get_mysql_exception(exc.errno, msg=exc.msg,
sqlalchemy.exc.OperationalError: (mysql.connector.errors.OperationalError) 2013 (HY000): Lost connection to MySQL server during query
[SQL: -- get Gene Ontology annotations for genes
-- GO xrefs in ensembl are linked to transcripts not genes.
-- Refs internal Related Sciences issue 316.
SELECT
  gene.stable_id AS ensembl_gene_id,
  -- external_db.db_name AS xref_source,
  xref.dbprimary_acc AS go_id,
  -- xref.display_label AS xref_label,
  xref.description AS go_label,
  GROUP_CONCAT(DISTINCT object_xref.linkage_annotation ORDER BY object_xref.linkage_annotation) AS go_evidence_codes,
  GROUP_CONCAT(DISTINCT xref.info_type ORDER BY xref.info_type) AS xref_info_types,
  GROUP_CONCAT(DISTINCT transcript.stable_id ORDER BY transcript.stable_id) AS ensembl_transcript_ids
FROM gene
INNER JOIN transcript 
  ON gene.gene_id = transcript.gene_id 
INNER JOIN object_xref 
  ON transcript.transcript_id = object_xref.ensembl_id 
  AND object_xref.ensembl_object_type = 'Transcript'
INNER JOIN xref 
  ON xref.xref_id = object_xref.xref_id
INNER JOIN external_db 
  ON xref.external_db_id = external_db.external_db_id 
  AND external_db.db_name = 'GO'
WHERE
  -- all genes were current when query was written, ensure this is always the case
  gene.is_current AND
  -- refs internal Related Sciences issue 289.
  gene.biotype != "LRG_gene"
GROUP BY gene.stable_id, external_db.db_name, xref.dbprimary_acc
ORDER BY ensembl_gene_id, go_id
-- LIMIT 10
]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

The SQLAlchemy docs on dealing with disconnects might have some helpful info on how we can retry lost connections: https://docs.sqlalchemy.org/en/14/core/pooling.html#pool-disconnects.
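One possible approach, sketched under assumptions rather than taken from the project: wrap the query call in a small retry helper that catches the disconnect error and tries again after a backoff. The traceback shows `run_query` passing a connection URL to `pd.read_sql_query`, so each call should open a fresh connection, and rerunning the call may be enough. The helper name `run_with_retry` and its parameters are hypothetical. Note that `pool_pre_ping`, described in the linked docs, only checks a connection when it is checked out of the pool, so on its own it would not cover a connection dropped mid-query like this one.

```python
import time


def run_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); if it raises retry_on, wait and call it again.

    Hypothetical helper: re-raises the last exception once `attempts`
    calls have failed. `sleep` is injectable so the backoff can be tested.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Assumed usage inside run_query (needs pandas and sqlalchemy installed):
#
#   from sqlalchemy.exc import OperationalError
#   df = run_with_retry(
#       lambda: pd.read_sql_query(sql=query, con=self.connection_url),
#       retry_on=OperationalError,
#   )
```

Catching `sqlalchemy.exc.OperationalError` matches the exception type at the end of the traceback above, but that class also covers failures unrelated to disconnects, so a stricter version could inspect the error code (2013 here) before retrying.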