bio2bel / hmdb

A Bio2BEL package for converting the Human Metabolite Database (HMDB) to BEL
http://bio2bel-hmdb.rtfd.io/
MIT License

String field in hmdb_metabolite is too short #7

Open cthoyt opened 5 years ago


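One of the string columns in the hmdb_metabolite table is declared as character varying(255), which is too short for the current HMDB data, so populating a PostgreSQL database fails on insert. Which column is it, and how long does it need to be to accommodate this data?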

$ bio2bel_hmdb populate
2019-02-24 20:44:47,445 - bio2bel_hmdb.parser - INFO - downloading http://www.hmdb.ca/system/downloads/current/hmdb_metabolites.zip to /Users/cthoyt/.bio2bel/hmdb/hmdb_metabolites.zip
2019-02-24 20:45:48,004 - bio2bel_hmdb.parser - INFO - extracting /Users/cthoyt/.bio2bel/hmdb/hmdb_metabolites.zip to /Users/cthoyt/.bio2bel/hmdb/hmdb_metabolites.xml
2019-02-24 20:46:24,818 - bio2bel_hmdb.parser - INFO - parsing /Users/cthoyt/.bio2bel/hmdb/hmdb_metabolites.xml
2019-02-24 20:59:25,122 - bio2bel_hmdb.parser - INFO - done parsing after 780.25 seconds
HMDB Metabolite: 100%|███████████████████████████████████████████████████████████████████████████| 114100/114100 [17:29<00:00, 108.70it/s]
Traceback (most recent call last):
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
    cursor, statement, parameters, context
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
    cursor.execute(statement, parameters)
psycopg2.DataError: value too long for type character varying(255)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/cthoyt/dev/bio2bel/src/bio2bel/manager/abstract_manager.py", line 38, in populate_wrapped
    cls._populate_original(self, *populate_args, **populate_kwargs)
  File "/Users/cthoyt/dev/hmdb/src/bio2bel_hmdb/manager.py", line 324, in populate
    self.session.commit()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/scoping.py", line 162, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1023, in commit
    self.transaction.commit()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 487, in commit
    self._prepare_impl()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 466, in _prepare_impl
    self.session.flush()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2446, in flush
    self._flush(objects)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2584, in _flush
    transaction.rollback(_capture_exception=True)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 67, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 277, in reraise
    raise value
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2544, in _flush
    flush_context.execute()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 416, in execute
    rec.execute(self)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 583, in execute
    uow,
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
    insert,
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1116, in _emit_insert_statements
    statement, params
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 980, in execute
    return meth(self, multiparams, params)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 273, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1099, in _execute_clauseelement
    distilled_params,
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1240, in _execute_context
    e, statement, parameters, cursor, context
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1458, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 276, in reraise
    raise value.with_traceback(tb)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
    cursor, statement, parameters, context
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.DataError) value too long for type character varying(255)
 [SQL: 'INSERT INTO hmdb_metabolite (version, creation_date, update_date, accession, name, description, chemical_formula, average_molecular_weight, monisotopic_molecular_weight, iupac_name, traditional_iupac, trivial, cas_registry_number, smiles, inchi, inchikey, state, drugbank_id, drugbank_metabolite_id, phenol_explorer_compound_id, phenol_explorer_metabolite_id, foodb_id, knapsack_id, chemspider_id, kegg_id, biocyc_id, bigg_id, wikipedia, nugowiki, metagene, metlin_id, pubchem_compound_id, het_id, chebi_id, synthesis_reference) VALUES (%(version)s, %(creation_date)s, %(update_date)s, %(accession)s, %(name)s, %(description)s, %(chemical_formula)s, %(average_molecular_weight)s, %(monisotopic_molecular_weight)s, %(iupac_name)s, %(traditional_iupac)s, %(trivial)s, %(cas_registry_number)s, %(smiles)s, %(inchi)s, %(inchikey)s, %(state)s, %(drugbank_id)s, %(drugbank_metabolite_id)s, %(phenol_explorer_compound_id)s, %(phenol_explorer_metabolite_id)s, %(foodb_id)s, %(knapsack_id)s, %(chemspider_id)s, %(kegg_id)s, %(biocyc_id)s, %(bigg_id)s, %(wikipedia)s, %(nugowiki)s, %(metagene)s, %(metlin_id)s, %(pubchem_compound_id)s, %(het_id)s, %(chebi_id)s, %(synthesis_reference)s) RETURNING hmdb_metabolite.id'] [parameters: {'version': '4.0', 'creation_date': '2005-11-16 15:48:42 UTC', 'update_date': '2019-01-11 19:13:56 UTC', 'accession': 'HMDB0000001', 'name': '1-Methylhistidine', 'description': "One-methylhistidine (1-MHis) is derived mainly from the anserine of dietary flesh sources, especially poultry. The enzyme, carnosinase, splits anseri ... (465 characters truncated) ... lhistidinuria from increased oxidative effects in skeletal muscle. 
1-Methylhistidine is a biomarker for the consumption of meat, especially red meat.", 'chemical_formula': 'C7H11N3O2', 'average_molecular_weight': '169.1811', 'monisotopic_molecular_weight': '169.085126611', 'iupac_name': '(2S)-2-amino-3-(1-methyl-1H-imidazol-4-yl)propanoic acid', 'traditional_iupac': '1 methylhistidine', 'trivial': None, 'cas_registry_number': '332-80-9', 'smiles': 'CN1C=NC(C[C@H](N)C(O)=O)=C1', 'inchi': 'InChI=1S/C7H11N3O2/c1-10-3-5(9-4-10)2-6(8)7(11)12/h3-4,6H,2,8H2,1H3,(H,11,12)/t6-/m0/s1', 'inchikey': 'BRMWTNUJHUMWMS-LURJTMIESA-N', 'state': 'Solid', 'drugbank_id': 'DB04151', 'drugbank_metabolite_id': None, 'phenol_explorer_compound_id': None, 'phenol_explorer_metabolite_id': None, 'foodb_id': 'FDB012119', 'knapsack_id': None, 'chemspider_id': '83153', 'kegg_id': 'C01152', 'biocyc_id': None, 'bigg_id': None, 'wikipedia': None, 'nugowiki': None, 'metagene': None, 'metlin_id': '3741', 'pubchem_compound_id': '92105', 'het_id': None, 'chebi_id': '50599', 'synthesis_reference': 'Jain, Rahul; Cohen, Louis A. Regiospecific alkylation of histidine and histamine at N-1 (t).Tetrahedron  (1996),  52(15),  5363-70.'}] (Background on this error at: http://sqlalche.me/e/9h9h)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/cthoyt/.virtualenvs/hbp/bin/bio2bel_hmdb", line 11, in <module>
    load_entry_point('bio2bel-hmdb', 'console_scripts', 'bio2bel_hmdb')()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/decorators.py", line 27, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/Users/cthoyt/dev/bio2bel/src/bio2bel/manager/abstract_manager.py", line 325, in populate
    def add_cli_drop(main: click.Group) -> click.Group:  # noqa: D202
  File "/Users/cthoyt/dev/bio2bel/src/bio2bel/manager/abstract_manager.py", line 40, in populate_wrapped
    self._store_populate_failed()
  File "/Users/cthoyt/dev/bio2bel/src/bio2bel/manager/connection_manager.py", line 93, in _store_populate_failed
    Action.store_populate_failed(self.module_name, session=self.session)
  File "/Users/cthoyt/dev/bio2bel/src/bio2bel/models.py", line 95, in store_populate_failed
    _store_helper(action, session=session)
  File "/Users/cthoyt/dev/bio2bel/src/bio2bel/models.py", line 140, in _store_helper
    session.commit()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/scoping.py", line 162, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1023, in commit
    self.transaction.commit()
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 485, in commit
    self._assert_active(prepared_ok=True)
  File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 288, in _assert_active
    % self._rollback_exception
sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.DataError) value too long for type character varying(255)
 [SQL: 'INSERT INTO hmdb_metabolite (version, creation_date, update_date, accession, name, description, chemical_formula, average_molecular_weight, monisotopic_molecular_weight, iupac_name, traditional_iupac, trivial, cas_registry_number, smiles, inchi, inchikey, state, drugbank_id, drugbank_metabolite_id, phenol_explorer_compound_id, phenol_explorer_metabolite_id, foodb_id, knapsack_id, chemspider_id, kegg_id, biocyc_id, bigg_id, wikipedia, nugowiki, metagene, metlin_id, pubchem_compound_id, het_id, chebi_id, synthesis_reference) VALUES (%(version)s, %(creation_date)s, %(update_date)s, %(accession)s, %(name)s, %(description)s, %(chemical_formula)s, %(average_molecular_weight)s, %(monisotopic_molecular_weight)s, %(iupac_name)s, %(traditional_iupac)s, %(trivial)s, %(cas_registry_number)s, %(smiles)s, %(inchi)s, %(inchikey)s, %(state)s, %(drugbank_id)s, %(drugbank_metabolite_id)s, %(phenol_explorer_compound_id)s, %(phenol_explorer_metabolite_id)s, %(foodb_id)s, %(knapsack_id)s, %(chemspider_id)s, %(kegg_id)s, %(biocyc_id)s, %(bigg_id)s, %(wikipedia)s, %(nugowiki)s, %(metagene)s, %(metlin_id)s, %(pubchem_compound_id)s, %(het_id)s, %(chebi_id)s, %(synthesis_reference)s) RETURNING hmdb_metabolite.id'] [parameters: {'version': '4.0', 'creation_date': '2005-11-16 15:48:42 UTC', 'update_date': '2019-01-11 19:13:56 UTC', 'accession': 'HMDB0000001', 'name': '1-Methylhistidine', 'description': "One-methylhistidine (1-MHis) is derived mainly from the anserine of dietary flesh sources, especially poultry. The enzyme, carnosinase, splits anseri ... (465 characters truncated) ... lhistidinuria from increased oxidative effects in skeletal muscle. 
1-Methylhistidine is a biomarker for the consumption of meat, especially red meat.", 'chemical_formula': 'C7H11N3O2', 'average_molecular_weight': '169.1811', 'monisotopic_molecular_weight': '169.085126611', 'iupac_name': '(2S)-2-amino-3-(1-methyl-1H-imidazol-4-yl)propanoic acid', 'traditional_iupac': '1 methylhistidine', 'trivial': None, 'cas_registry_number': '332-80-9', 'smiles': 'CN1C=NC(C[C@H](N)C(O)=O)=C1', 'inchi': 'InChI=1S/C7H11N3O2/c1-10-3-5(9-4-10)2-6(8)7(11)12/h3-4,6H,2,8H2,1H3,(H,11,12)/t6-/m0/s1', 'inchikey': 'BRMWTNUJHUMWMS-LURJTMIESA-N', 'state': 'Solid', 'drugbank_id': 'DB04151', 'drugbank_metabolite_id': None, 'phenol_explorer_compound_id': None, 'phenol_explorer_metabolite_id': None, 'foodb_id': 'FDB012119', 'knapsack_id': None, 'chemspider_id': '83153', 'kegg_id': 'C01152', 'biocyc_id': None, 'bigg_id': None, 'wikipedia': None, 'nugowiki': None, 'metagene': None, 'metlin_id': '3741', 'pubchem_compound_id': '92105', 'het_id': None, 'chebi_id': '50599', 'synthesis_reference': 'Jain, Rahul; Cohen, Louis A. Regiospecific alkylation of histidine and histamine at N-1 (t).Tetrahedron  (1996),  52(15),  5363-70.'}] (Background on this error at: http://sqlalche.me/e/9h9h)
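
One possible way to answer both questions is to measure the parsed records before inserting them. The following is a minimal sketch, not the package's own code: it assumes the parser output can be viewed as an iterable of dicts keyed by column name (like the parameters dict in the error above), and the helper names max_field_lengths and fields_over_limit are hypothetical. The 255 limit comes from the error message. PostgreSQL counts characters rather than bytes for character varying(n), so Python's len() on a str is a comparable measure.

```python
from typing import Dict, Iterable, Mapping, Optional

# Limit taken from the error message: character varying(255).
VARCHAR_LIMIT = 255


def max_field_lengths(records: Iterable[Mapping[str, Optional[str]]]) -> Dict[str, int]:
    """Return the longest string length seen for each field across all records."""
    longest: Dict[str, int] = {}
    for record in records:
        for field, value in record.items():
            # None and non-string values are skipped; only str values are measured.
            if isinstance(value, str):
                longest[field] = max(longest.get(field, 0), len(value))
    return longest


def fields_over_limit(
    records: Iterable[Mapping[str, Optional[str]]],
    limit: int = VARCHAR_LIMIT,
) -> Dict[str, int]:
    """Return only the fields whose longest value exceeds the limit."""
    return {
        field: length
        for field, length in max_field_lengths(records).items()
        if length > limit
    }


# Hypothetical records for illustration only; real input would be the
# parsed HMDB metabolites, with one dict per row.
example = [
    {"accession": "HMDB0000001", "name": "1-Methylhistidine", "trivial": None, "description": "x" * 300},
    {"accession": "HMDB0000002", "name": "y" * 40, "trivial": None, "description": "short"},
]

print(fields_over_limit(example))  # {'description': 300}
```

Any field reported here that is mapped to a 255-character string column is a candidate for the failure, and the reported maximum gives a lower bound on the length the column needs. Fields that are already mapped to an unbounded text type can be ignored.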