HumanCellAtlas / dcp2

Shared artifacts concerning the Human Cell Atlas (HCA) Data Coordination Platform (DCP)

Donor with age unit but no age in project 403c3e76-6814-4a2d-a580-5dd5de38c7ff #31

Closed hannes-ucsc closed 3 years ago

hannes-ucsc commented 3 years ago

One donor in the project "The Developmental Heterogeneity of Human Natural Killer Cells Defined by Single-cell Transcriptome" has the organism_age_unit property but lacks the organism_age property. This trips an assertion in Azul during indexing. The schema allows this combination, but providing a unit without a value makes no sense. The schema should be updated to disallow this combination, and the project metadata needs to be updated to either provide organism_age or remove organism_age_unit.
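A minimal sketch of the check described above, for scanning donor_organism documents before indexing. The function name `unit_without_age` is hypothetical and is not part of Azul or the HCA schema tooling; it only assumes that a donor document is available as a parsed JSON object with the property names used in this issue.

```python
from typing import Any, Mapping


def unit_without_age(donor: Mapping[str, Any]) -> bool:
    """Return True if the donor has organism_age_unit but no organism_age.

    Hypothetical helper for illustration; not taken from Azul.
    """
    has_unit = donor.get("organism_age_unit") is not None
    has_age = donor.get("organism_age") is not None
    return has_unit and not has_age


# Abridged from the affected donor_organism.json in this issue
donor_7 = {
    "biomaterial_core": {"biomaterial_id": "Donor_7"},
    "organism_age_unit": {
        "text": "year",
        "ontology": "UO:0000036",
        "ontology_label": "year",
    },
}

if unit_without_age(donor_7):
    print("Donor_7: organism_age_unit present but organism_age missing")
```

On the schema side, JSON Schema drafts that support the `dependencies` keyword (draft-07, for example) can express the same rule declaratively as `"dependencies": {"organism_age_unit": ["organism_age"]}`. Whether that fits the donor_organism schema is an assumption that would need checking against the schema's draft version.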

Affected subgraphs

donor_organism.json:

{
        "describedBy": "https://schema.humancellatlas.org/type/biomaterial/15.5.0/donor_organism",
        "schema_type": "biomaterial",
        "biomaterial_core": {
            "biomaterial_id": "Donor_7",
            "biomaterial_name": "Blood donor with GATA2T354M mutation",
            "biomaterial_description": "Blood donor with GATA2T354M mutation",
            "ncbi_taxon_id": [
                9606
            ],
            "genotype": "GATA2T354M"
        },
        "genus_species": [
            {
                "text": "Homo sapiens",
                "ontology": "NCBITaxon:9606",
                "ontology_label": "Homo sapiens"
            }
        ],
        "sex": "female",
        "is_living": "yes",
        "organism_age_unit": {
            "text": "year",
            "ontology": "UO:0000036",
            "ontology_label": "year"
        },
        "development_stage": {
            "text": "human adult stage",
            "ontology": "HsapDv:0000087",
            "ontology_label": "human adult stage"
        },
        "diseases": [
            {
                "text": "GATA2T354M",
                "ontology": "MONDO:0042982",
                "ontology_label": "GATA2 deficiency with susceptibility to MDS/AML"
            }
        ],
        "provenance": {
            "document_id": "f49ea97a-a586-48c6-8f6d-b3e0eaae0806",
            "submission_date": "2021-06-09T19:31:52.242Z",
            "update_date": "2021-06-09T19:32:01.739Z",
            "schema_major_version": 15,
            "schema_minor_version": 5
        }
}

Traceback in Azul:

[WARNING] 2021-07-03T15:36:37.396Z 50a723b8-3af6-5c0f-b1be-7ae2bcd2a81d Worker failed to handle message {'action': 'add', 'notification': {'source': {'id': '83752bf1-44fa-46d8-abec-0f5982e21cdd', 'spec': 'tdr:broad-datarepo-terra-prod-hca2:snapshot/hca_prod_20201120_dcp2__20210701_dcp7:'}, 'query': {}, 'subscription_id': 'cafebabe-feed-4bad-dead-beaf8badf00d', 'transaction_id': '431d5990-7979-4d4d-9fd3-6bfcbf19c92b', 'match': {'bundle_uuid': 'efa53f08-e879-4f35-959c-a978ac998df1', 'bundle_version': '2021-06-09T19:31:55.438000Z'}}, 'catalog': 'dcp7'}.
Traceback (most recent call last):
  File "/var/task/azul/indexer/index_controller.py", line 153, in contribute
    contributions = self.transform(catalog, notification, delete)
  File "/var/task/azul/indexer/index_controller.py", line 188, in transform
    return self.index_service.transform(catalog, bundle, delete)
  File "/var/task/azul/indexer/index_service.py", line 174, in transform
    contributions.extend(transformer.transform())
  File "/var/task/azul/plugins/metadata/hca/transform.py", line 1225, in transform
    donors=list(map(self._donor, visitor.donors.values())),
  File "/var/task/azul/plugins/metadata/hca/transform.py", line 668, in _donor
    require(donor.organism_age_unit is None)
  File "/var/task/azul/__init__.py", line 1087, in require
    reject(not condition, *args, exception=exception)
  File "/var/task/azul/__init__.py", line 1102, in reject
    raise exception(*args)
azul.RequirementError

theathorn commented 3 years ago

This project is excluded from the later dcp7 snapshot (DataBiosphere/Azul#3209).

hannes-ucsc commented 3 years ago

When reindexing prod I saw no errors related to these subgraphs.