microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

`linkml-sqldb dump` crashes in `berkeley-schema-fy24` because of multiple possible `Study.part_of` paths #1893

Status: Open. turbomam opened this issue 6 months ago

turbomam commented 6 months ago

@cmungall and @sierra-moxon

linkml-sqldb dump \
  --db credit-example.db \
  --target-class CreditAssociation \
  --schema src/schema/basic_classes.yaml src/data/valid/CreditAssociation-1.yaml
ERROR:root:UNKNOWN range base: None for emsl_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gnps_task_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gold_study_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_bioproject_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for jgi_portal_study_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for mgnify_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for neon_study_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_experiment_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gold_sequencing_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_bioproject_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_experiment_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gold_analysis_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for jgi_portal_analysis_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for emsl_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gnps_task_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gold_study_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_bioproject_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for jgi_portal_study_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for mgnify_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for neon_study_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_experiment_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gold_sequencing_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_bioproject_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for insdc_experiment_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for gold_analysis_project_identifiers = external_identifier
ERROR:root:UNKNOWN range base: None for jgi_portal_analysis_project_identifiers = external_identifier
Traceback (most recent call last):
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/relationships.py", line 2406, in _determine_joins
    self.secondaryjoin = join_condition(
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/sql/util.py", line 123, in join_condition
    return Join._join_condition(
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/sql/selectable.py", line 1343, in _join_condition
    cls._joincond_trim_constraints(
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/sql/selectable.py", line 1488, in _joincond_trim_constraints
    raise exc.AmbiguousForeignKeysError(
sqlalchemy.exc.AmbiguousForeignKeysError: Can't determine join between 'Study' and 'Study_part_of'; tables have more than one foreign key constraint relationship between them. Please specify the 'onclause' of this join explicitly.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/bin/linkml-sqldb", line 8, in <module>
    sys.exit(main())
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/linkml/utils/sqlutils.py", line 403, in dump
    endpoint.dump(obj)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/linkml/utils/sqlutils.py", line 169, in dump
    nu_obj = self.to_sqla(element)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/linkml/utils/sqlutils.py", line 220, in to_sqla
    v2 = self.to_sqla(v)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/linkml/utils/sqlutils.py", line 227, in to_sqla
    nu_obj = nu_typ(**inst_args)
  File "<string>", line 4, in __init__
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/state.py", line 559, in _initialize_instance
    manager.dispatch.init(self, args, kwargs)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/event/attr.py", line 497, in __call__
    fn(*args, **kw)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/mapper.py", line 4395, in _event_on_init
    instrumenting_mapper._check_configure()
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/mapper.py", line 2388, in _check_configure
    _configure_registries({self.registry}, cascade=True)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/mapper.py", line 4203, in _configure_registries
    _do_configure_registries(registries, cascade)
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/mapper.py", line 4244, in _do_configure_registries
    mapper._post_configure_properties()
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/mapper.py", line 2405, in _post_configure_properties
    prop.init()
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/interfaces.py", line 584, in init
    self.do_init()
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/relationships.py", line 1644, in do_init
    self._setup_join_conditions()
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/relationships.py", line 1884, in _setup_join_conditions
    self._join_condition = jc = JoinCondition(
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/relationships.py", line 2310, in __init__
    self._determine_joins()
  File "/home/mark/.cache/pypoetry/virtualenvs/nmdc-schema-gXr5ogK9-py3.10/lib/python3.10/site-packages/sqlalchemy/orm/relationships.py", line 2455, in _determine_joins
    raise sa_exc.AmbiguousForeignKeysError(
sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship Study.part_of - there are multiple foreign key paths linking the tables via secondary table 'Study_part_of'.  Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference from the secondary table to each of the parent and child tables.
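
The final error can probably be reproduced outside LinkML with plain SQLAlchemy, which may help narrow down what the generated mapping needs. The sketch below is not the code `linkml-sqldb` generates. It assumes, based only on the error message, that `Study_part_of` is an association table whose two foreign keys both point at `Study.id`. The column names `Study_id` and `part_of_id`, the helper function names, and the `example:` identifiers are hypothetical. It shows the ambiguous mapping failing at mapper configuration, then the same mapping working once `primaryjoin` and `secondaryjoin` are given explicitly. That is one of the ways SQLAlchemy offers to resolve this ambiguity, alongside the `foreign_keys` argument named in the message.

```python
from sqlalchemy import Column, ForeignKey, String, Table, create_engine
from sqlalchemy.exc import AmbiguousForeignKeysError
from sqlalchemy.orm import Session, declarative_base, relationship


def build_study_model(explicit_joins):
    """Build a fresh Study mapping on its own declarative base."""
    Base = declarative_base()

    # Assumed shape of the association table: both columns reference Study.id,
    # so SQLAlchemy cannot tell which one is the "parent" side on its own.
    study_part_of = Table(
        "Study_part_of",
        Base.metadata,
        Column("Study_id", ForeignKey("Study.id"), primary_key=True),
        Column("part_of_id", ForeignKey("Study.id"), primary_key=True),
    )

    rel_kwargs = {}
    if explicit_joins:
        # Callables are evaluated lazily, after the Study class exists.
        rel_kwargs = {
            "primaryjoin": lambda: Study.id == study_part_of.c.Study_id,
            "secondaryjoin": lambda: Study.id == study_part_of.c.part_of_id,
        }

    class Study(Base):
        __tablename__ = "Study"
        id = Column(String, primary_key=True)
        part_of = relationship("Study", secondary=study_part_of, **rel_kwargs)

    return Base, Study


def try_configure(explicit_joins):
    """Return True if the mappers configure, False on the ambiguity error."""
    Base, _ = build_study_model(explicit_joins)
    try:
        # Configure only this registry so a failure does not affect others.
        Base.registry.configure()
    except AmbiguousForeignKeysError:
        return False
    return True


def round_trip():
    """Store a child study that is part_of a parent, then read the link back."""
    Base, Study = build_study_model(explicit_joins=True)
    engine = create_engine("sqlite://")  # in-memory database
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        parent = Study(id="example:parent")  # hypothetical identifiers
        child = Study(id="example:child", part_of=[parent])
        session.add_all([parent, child])
        session.commit()
        return [s.id for s in session.get(Study, "example:child").part_of]


if __name__ == "__main__":
    print("ambiguous mapping configures:", try_configure(explicit_joins=False))
    print("explicit joins configure:", try_configure(explicit_joins=True))
    print("child.part_of:", round_trip())
```

If this reproduction matches the generated schema, the SQLAlchemy classes emitted for self-referential multivalued slots such as `Study.part_of` would need the join conditions (or `foreign_keys`) spelled out. Whether that belongs in the generator or in the schema is left open here.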