monarch-initiative / monarch-app

Monarch Initiative website and API
https://monarchinitiative.org/
BSD 3-Clause "New" or "Revised" License

Add validation of final KG #723

Open kevinschaper opened 1 year ago

kevinschaper commented 1 year ago

I found myself going down a validation rabbit hole this week that wasn't represented in an issue, so I wanted to get it written down and try to nail down for myself what we can achieve and what "done" will look like in the short term.

I've been exploring kgx validation, and I think we can benefit from it, but we need to work around existing limitations.

Too much output

This will likely improve over time, but for now kgx enumerates each and every node or edge that has the same problem. @bgood had a nice suggestion in https://github.com/biolink/kgx/issues/354 that I slightly tweaked:

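# `validate` is assumed to be the loaded kgx validation report:
# validate['ERROR'][error_type][message] -> the nodes/edges with that message
n_error_examples = 50   # max messages printed per error type
nn_error_examples = 5   # element cutoff per message (checked after printing, so 6 are shown)
for e in validate['ERROR']:
    error_dict = validate['ERROR'][e]
    n = 0
    print(e)
    for k in error_dict.keys():
        n = n + 1
        if n > n_error_examples:
            print(" " * 4, "...")
            break
        print(" " * 4, k) #, " specific errors:", error_dict[k].keys())
        nn = 0
        for ek in error_dict[k]:
            # "element count" is the number of elements sharing message k,
            # so the same number repeats on every example line
            print(" " * 8, ek, " element count ", len(error_dict[k]))
            nn = nn + 1
            if nn > nn_error_examples:
                print(" " * 8, "...")
                break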

This produces pretty good output, but it still floods with too many examples of the same prefix complaint. For example, every PomBase id gets its own message, and that hides other prefix complaints, which won't be visible until the first ones are fixed. Each fix is something like a 16-hour round trip because it's....

Too slow

kgx validate on the current monarch-kg is taking 10 hours, which is way too long to add to our existing pipeline, which is unfortunately already ballooning into the 6-hour range. I'm sure this could be massively improved on the kgx side, but even without that, we could probably mitigate it by putting the validation in its own Jenkins job that runs after a kg build.
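Coming back to the flood of repeated prefix complaints: one possible way to tame it, as a rough sketch rather than anything kgx provides, is to collapse messages that differ only in the offending value before printing. The helper name `collapse_messages` and the `<...>` placeholder are made up for illustration, and the input shape (message -> collection of node/edge ids) is an assumption inferred from the loop earlier in this comment.

```python
import re
from collections import defaultdict

# Matches the "has a value '<CURIE>'" part of a prefix complaint, e.g.
# "Node property 'id' has a value 'PomBase:SPAC1002.01' with a CURIE prefix ..."
VALUE_RE = re.compile(r"has a value '[^']*'")


def collapse_messages(error_dict):
    """Merge messages that differ only in the quoted offending value.

    error_dict is ASSUMED to map message -> collection of node/edge ids
    (hypothetical helper, not part of kgx).
    Returns {template: (hit_count, [up to 3 example ids])}.
    """
    counts = defaultdict(int)
    examples = defaultdict(list)
    for message, ids in error_dict.items():
        # Replace the specific value so all PomBase (or EMAPA, ...) complaints
        # share a single template keyed by property and prefix.
        template = VALUE_RE.sub("has a value '<...>'", message)
        for element_id in ids:
            counts[template] += 1
            if len(examples[template]) < 3:
                examples[template].append(element_id)
    return {t: (counts[t], examples[t]) for t in counts}


# Tiny made-up input in the same style as the real messages
demo = {
    "Node property 'id' has a value 'X:1' with a CURIE prefix 'X' is not represented": ["X:1"],
    "Node property 'id' has a value 'X:2' with a CURIE prefix 'X' is not represented": ["X:2"],
}
for template, (hits, sample_ids) in collapse_messages(demo).items():
    print(template, hits, sample_ids)
```

With something like this, each prefix would show up once per property with a hit count, so a second missing prefix would no longer be buried behind the first one. Note that the count is of (message, element) pairs, so an edge flagged for both its subject and its object is counted under each template.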

kevinschaper commented 1 year ago

Here's the output I'm getting right now:

INVALID_NODE_PROPERTY_VALUE_TYPE
     Multi-valued node property 'xref' is expected to be of type '<class 'list'>'
         PomBase:SPAC1002.01  element count  56336
         PomBase:SPAC1002.02  element count  56336
         PomBase:SPAC1002.03c  element count  56336
         PomBase:SPAC1002.04c  element count  56336
         PomBase:SPAC1002.05c  element count  56336
         PomBase:SPAC1002.06c  element count  56336
         ...
     Multi-valued node property 'type' is expected to be of type '<class 'list'>'
         PomBase:SPAC1002.01  element count  314203
         PomBase:SPAC1002.02  element count  314203
         PomBase:SPAC1002.03c  element count  314203
         PomBase:SPAC1002.04c  element count  314203
         PomBase:SPAC1002.05c  element count  314203
         PomBase:SPAC1002.06c  element count  314203
         ...
     Single-valued node property 'description' is expected to be of type '<class 'str'>'
         MONDO:0014160  element count  1
INVALID_NODE_PROPERTY_VALUE
     Node property 'id' has a value 'PomBase:SPAC1002.01' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.01  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.02' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.02  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.03c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.03c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.04c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.04c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.05c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.05c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.06c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.06c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.07c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.07c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.08c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.08c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.09c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.09c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.10c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.10c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.11' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.11  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.12c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.12c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.13c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.13c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.14' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.14  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.15c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.15c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.16c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.16c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.17c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.17c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.18' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.18  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.19' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.19  element count  1
     Node property 'id' has a value 'PomBase:SPAC1002.20' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1002.20  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.01' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.01  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.02' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.02  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.03c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.03c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.04c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.04c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.05c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.05c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.06' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.06  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.07' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.07  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.08' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.08  element count  1
     Node property 'id' has a value 'PomBase:SPAC1006.09' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1006.09  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.01' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.01  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.02' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.02  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.03' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.03  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.04' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.04  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.05c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.05c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.06' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.06  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.07c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.07c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.08' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.08  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.09' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.09  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.10' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.10  element count  1
     Node property 'id' has a value 'PomBase:SPAC1039.11c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1039.11c  element count  1
     Node property 'id' has a value 'PomBase:SPAC105.01c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC105.01c  element count  1
     Node property 'id' has a value 'PomBase:SPAC105.02c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC105.02c  element count  1
     Node property 'id' has a value 'PomBase:SPAC105.03c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC105.03c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.01c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.01c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.02' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.02  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.03c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.03c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.04c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.04c  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.05' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.05  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.06' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.06  element count  1
     Node property 'id' has a value 'PomBase:SPAC1071.07c' with a CURIE prefix 'PomBase' is not represented in Biolink Model JSON-LD context
         PomBase:SPAC1071.07c  element count  1
     ...
INVALID_CATEGORY
     Category 'biological_role_mixin' is not in CamelCase form
         CHEBI:130181  element count  379
         CHEBI:131699  element count  379
         CHEBI:136651  element count  379
         CHEBI:138880  element count  379
         CHEBI:139492  element count  379
         CHEBI:139512  element count  379
         ...
     Category 'biological_role_mixin' is a mixin in the Biolink Model
         CHEBI:130181  element count  379
         CHEBI:131699  element count  379
         CHEBI:136651  element count  379
         CHEBI:138880  element count  379
         CHEBI:139492  element count  379
         CHEBI:139512  element count  379
         ...
     Category 'is_metabolite_of' is not in CamelCase form
         CHEBI:131604  element count  42
         CHEBI:137684  element count  42
         CHEBI:25212  element count  42
         CHEBI:25442  element count  42
         CHEBI:26115  element count  42
         CHEBI:27026  element count  42
         ...
     Category 'is_metabolite_of' is unknown in the current Biolink Model
         CHEBI:131604  element count  42
         CHEBI:137684  element count  42
         CHEBI:25212  element count  42
         CHEBI:25442  element count  42
         CHEBI:26115  element count  42
         CHEBI:27026  element count  42
         ...
     Category 'chemical_role_mixin' is not in CamelCase form
         CHEBI:13193  element count  40
         CHEBI:138103  element count  40
         CHEBI:15022  element count  40
         CHEBI:15339  element count  40
         CHEBI:17499  element count  40
         CHEBI:17654  element count  40
         ...
     Category 'chemical_role_mixin' is a mixin in the Biolink Model
         CHEBI:13193  element count  40
         CHEBI:138103  element count  40
         CHEBI:15022  element count  40
         CHEBI:15339  element count  40
         CHEBI:17499  element count  40
         CHEBI:17654  element count  40
         ...
     Category 'RNAProduct' is not in CamelCase form
         CHEBI:18111  element count  3
         CHEBI:33697  element count  3
         CHEBI:33699  element count  3
     Category 'RNAProduct' is unknown in the current Biolink Model
         CHEBI:18111  element count  3
         CHEBI:33697  element count  3
         CHEBI:33699  element count  3
     Category 'derives_from' is not in CamelCase form
         CHEBI:has_functional_parent  element count  1
     Category 'derives_from' is unknown in the current Biolink Model
         CHEBI:has_functional_parent  element count  1
     Category 'subclass_of' is not in CamelCase form
         CHEBI:has_parent_hydride  element count  2
         GO:isa  element count  2
     Category 'subclass_of' is unknown in the current Biolink Model
         CHEBI:has_parent_hydride  element count  2
         GO:isa  element count  2
     Category 'related_to' is not in CamelCase form
         CHEBI:is_conjugate_acid_of  element count  6
         CHEBI:is_conjugate_base_of  element count  6
         GOREL:0002005  element count  6
         GOREL:0012006  element count  6
         MONDO:disease_shares_features_of  element count  6
         UBERON:synapsed_by  element count  6
         ...
     Category 'related_to' is unknown in the current Biolink Model
         CHEBI:is_conjugate_acid_of  element count  6
         CHEBI:is_conjugate_base_of  element count  6
         GOREL:0002005  element count  6
         GOREL:0012006  element count  6
         MONDO:disease_shares_features_of  element count  6
         UBERON:synapsed_by  element count  6
         ...
     Category 'close_match' is not in CamelCase form
         CHEBI:is_enantiomer_of  element count  2
         CHEBI:is_tautomer_of  element count  2
     Category 'close_match' is unknown in the current Biolink Model
         CHEBI:is_enantiomer_of  element count  2
         CHEBI:is_tautomer_of  element count  2
     Category 'part_of' is not in CamelCase form
         CHEBI:is_substituent_group_from  element count  3
         MONDO:part_of_progression_of_disease  element count  3
         UBERON:subdivision_of  element count  3
     Category 'part_of' is unknown in the current Biolink Model
         CHEBI:is_substituent_group_from  element count  3
         MONDO:part_of_progression_of_disease  element count  3
         UBERON:subdivision_of  element count  3
     Category 'causes' is not in CamelCase form
         GOREL:0000040  element count  3
         MONDO:disease_causes_feature  element count  3
         MONDO:disease_triggers  element count  3
     Category 'causes' is unknown in the current Biolink Model
         GOREL:0000040  element count  3
         MONDO:disease_causes_feature  element count  3
         MONDO:disease_triggers  element count  3
     Category 'located_in' is not in CamelCase form
         GOREL:0001004  element count  1
     Category 'located_in' is unknown in the current Biolink Model
         GOREL:0001004  element count  1
     Category 'affects' is not in CamelCase form
         GOREL:0001006  element count  1
     Category 'affects' is unknown in the current Biolink Model
         GOREL:0001006  element count  1
     Category 'affects_localization_of' is not in CamelCase form
         GOREL:0002003  element count  1
     Category 'affects_localization_of' is unknown in the current Biolink Model
         GOREL:0002003  element count  1
     Category 'increases_degradation_of' is not in CamelCase form
         GOREL:0002004  element count  1
     Category 'increases_degradation_of' is unknown in the current Biolink Model
         GOREL:0002004  element count  1
     Category 'Occurrent' is a mixin in the Biolink Model
         GO:0000001  element count  38565
         GO:0000002  element count  38565
         GO:0000003  element count  38565
         GO:0000006  element count  38565
         GO:0000007  element count  38565
         GO:0000009  element count  38565
         ...
     Category 'MacromolecularComplexMixin' is unknown in the current Biolink Model
         GO:0000015  element count  2100
         GO:0000109  element count  2100
         GO:0000110  element count  2100
         GO:0000111  element count  2100
         GO:0000112  element count  2100
         GO:0000113  element count  2100
         ...
     Category 'superclass_of' is not in CamelCase form
         GO:inverse_isa  element count  2
         OMIM:has_manifestation  element count  2
     Category 'superclass_of' is unknown in the current Biolink Model
         GO:inverse_isa  element count  2
         OMIM:has_manifestation  element count  2
     Category 'disease_has_basis_in' is not in CamelCase form
         MONDO:disease_has_basis_in_accumulation_of  element count  2
         MONDO:disease_has_basis_in_development_of  element count  2
     Category 'disease_has_basis_in' is unknown in the current Biolink Model
         MONDO:disease_has_basis_in_accumulation_of  element count  2
         MONDO:disease_has_basis_in_development_of  element count  2
     Category 'disease_has_location' is not in CamelCase form
         MONDO:disease_has_location  element count  1
     Category 'disease_has_location' is unknown in the current Biolink Model
         MONDO:disease_has_location  element count  1
     Category 'has_part' is not in CamelCase form
         MONDO:disease_has_major_feature  element count  1
     Category 'has_part' is unknown in the current Biolink Model
         MONDO:disease_has_major_feature  element count  1
     Category 'treated_by' is not in CamelCase form
         MONDO:disease_responds_to  element count  1
     Category 'treated_by' is unknown in the current Biolink Model
         MONDO:disease_responds_to  element count  1
     Category 'same_as' is not in CamelCase form
         MONDO:equivalentTo  element count  1
     Category 'same_as' is unknown in the current Biolink Model
         MONDO:equivalentTo  element count  1
     Category 'contributes_to' is not in CamelCase form
         MONDO:predisposes_towards  element count  1
     Category 'contributes_to' is unknown in the current Biolink Model
         MONDO:predisposes_towards  element count  1
     Category 'PathologicalEntityMixin' is a mixin in the Biolink Model
         MPATH:0  element count  839
         MPATH:1  element count  839
         MPATH:10  element count  839
         MPATH:100  element count  839
         MPATH:101  element count  839
         MPATH:102  element count  839
         ...
     Category 'has_attribute' is not in CamelCase form
         OMIM:has_inheritance_type  element count  1
     Category 'has_attribute' is unknown in the current Biolink Model
         OMIM:has_inheritance_type  element count  1
     Category 'manifestation_of' is not in CamelCase form
         OMIM:manifestation_of  element count  1
     Category 'manifestation_of' is unknown in the current Biolink Model
         OMIM:manifestation_of  element count  1
     Category 'coexists_with' is not in CamelCase form
         UBERON:anastomoses_with  element count  18
         UBERON:anteriorly_connected_to  element count  18
         UBERON:channel_for  element count  18
         UBERON:channels_from  element count  18
         UBERON:channels_into  element count  18
         UBERON:conduit_for  element count  18
         ...
     ...
INVALID_EDGE_PROPERTY_VALUE_TYPE
     Multi-valued edge property 'has_evidence' is expected to be of type 'list'
         HGNC:11629->MGI:103180  element count  2579391
         HGNC:11629->NCBIGene:514216  element count  2579391
         HGNC:11629->NCBIGene:100152381  element count  2579391
         HGNC:11629->Xenbase:XB-GENE-487230  element count  2579391
         HGNC:11629->ZFIN:ZDB-GENE-060324-2  element count  2579391
         HGNC:11629->WB:WBGene00007227  element count  2579391
         ...
INVALID_EDGE_PROPERTY_VALUE
     Edge property 'subject' has a value 'EMAPA:0' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:0->UBERON:0000061  element count  1
     Edge property 'subject' has a value 'EMAPA:16032' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16032->EMAPA:31859  element count  2
         EMAPA:16032->EMAPA:36041  element count  2
     Edge property 'object' has a value 'EMAPA:31859' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16032->EMAPA:31859  element count  6
         EMAPA:16034->EMAPA:31859  element count  6
         MGI:1920958->EMAPA:31859  element count  6
         MGI:1924378->EMAPA:31859  element count  6
         MGI:96285->EMAPA:31859  element count  6
         MGI:95305->EMAPA:31859  element count  6
         ...
     Edge property 'object' has a value 'EMAPA:36041' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16032->EMAPA:36041  element count  624
         EMAPA:16033->EMAPA:36041  element count  624
         EMAPA:16034->EMAPA:36041  element count  624
         EMAPA:16035->EMAPA:36041  element count  624
         MGI:88354->EMAPA:36041  element count  624
         MGI:95755->EMAPA:36041  element count  624
         ...
     Edge property 'subject' has a value 'EMAPA:16033' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16033->EMAPA:36032  element count  2
         EMAPA:16033->EMAPA:36041  element count  2
     Edge property 'object' has a value 'EMAPA:36032' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16033->EMAPA:36032  element count  14
         EMAPA:16036->EMAPA:36032  element count  14
         EMAPA:16037->EMAPA:36032  element count  14
         EMAPA:16040->EMAPA:36032  element count  14
         MGI:97797->EMAPA:36032  element count  14
         MGI:97798->EMAPA:36032  element count  14
         ...
     Edge property 'subject' has a value 'EMAPA:16034' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16034->EMAPA:31859  element count  7
         EMAPA:16034->EMAPA:36041  element count  7
         EMAPA:16034->EMAPA:36042  element count  7
         EMAPA:16034->EMAPA:36043  element count  7
         EMAPA:16034->EMAPA:36044  element count  7
         EMAPA:16034->EMAPA:36045  element count  7
         ...
     Edge property 'object' has a value 'EMAPA:36042' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16034->EMAPA:36042  element count  1173
         EMAPA:16035->EMAPA:36042  element count  1173
         EMAPA:16036->EMAPA:36042  element count  1173
         MGI:88354->EMAPA:36042  element count  1173
         MGI:101950->EMAPA:36042  element count  1173
         MGI:105384->EMAPA:36042  element count  1173
         ...
     Edge property 'object' has a value 'EMAPA:36043' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16034->EMAPA:36043  element count  779
         EMAPA:16035->EMAPA:36043  element count  779
         EMAPA:31864->EMAPA:36043  element count  779
         MGI:88354->EMAPA:36043  element count  779
         MGI:95298->EMAPA:36043  element count  779
         MGI:106906->EMAPA:36043  element count  779
         ...
     Edge property 'object' has a value 'EMAPA:36044' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16034->EMAPA:36044  element count  829
         EMAPA:16035->EMAPA:36044  element count  829
         EMAPA:31865->EMAPA:36044  element count  829
         MGI:88354->EMAPA:36044  element count  829
         MGI:101950->EMAPA:36044  element count  829
         MGI:106924->EMAPA:36044  element count  829
         ...
     Edge property 'object' has a value 'EMAPA:36045' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16034->EMAPA:36045  element count  1012
         EMAPA:16035->EMAPA:36045  element count  1012
         EMAPA:16040->EMAPA:36045  element count  1012
         MGI:96601->EMAPA:36045  element count  1012
         MGI:96610->EMAPA:36045  element count  1012
         MGI:97073->EMAPA:36045  element count  1012
         ...
     Edge property 'object' has a value 'EMAPA:36046' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16034->EMAPA:36046  element count  1500
         EMAPA:16035->EMAPA:36046  element count  1500
         EMAPA:36035->EMAPA:36046  element count  1500
         MGI:88354->EMAPA:36046  element count  1500
         MGI:95755->EMAPA:36046  element count  1500
         MGI:101950->EMAPA:36046  element count  1500
         ...
     Edge property 'subject' has a value 'EMAPA:16035' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16035->UBERON:0000086  element count  7
         EMAPA:16035->EMAPA:36041  element count  7
         EMAPA:16035->EMAPA:36042  element count  7
         EMAPA:16035->EMAPA:36043  element count  7
         EMAPA:16035->EMAPA:36044  element count  7
         EMAPA:16035->EMAPA:36045  element count  7
         ...
     Edge property 'subject' has a value 'EMAPA:16036' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16036->EMAPA:36032  element count  3
         EMAPA:16036->UBERON:0019249  element count  3
         EMAPA:16036->EMAPA:36042  element count  3
     Edge property 'subject' has a value 'EMAPA:16037' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16037->EMAPA:36032  element count  3
         EMAPA:16037->UBERON:0019250  element count  3
         EMAPA:16037->EMAPA:36063  element count  3
     Edge property 'object' has a value 'EMAPA:36063' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16037->EMAPA:36063  element count  100
         EMAPA:36043->EMAPA:36063  element count  100
         EMAPA:36044->EMAPA:36063  element count  100
         MGI:97171->EMAPA:36063  element count  100
         MGI:97172->EMAPA:36063  element count  100
         MGI:108515->EMAPA:36063  element count  100
         ...
     Edge property 'subject' has a value 'EMAPA:16039' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16039->UBERON:0000922  element count  2
         EMAPA:16039->EMAPA:36040  element count  2
     Edge property 'object' has a value 'EMAPA:36040' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16039->EMAPA:36040  element count  261
         EMAPA:31859->EMAPA:36040  element count  261
         EMAPA:36473->EMAPA:36040  element count  261
         EMAPA:36699->EMAPA:36040  element count  261
         MGI:95661->EMAPA:36040  element count  261
         MGI:96683->EMAPA:36040  element count  261
         ...
     Edge property 'subject' has a value 'EMAPA:16040' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16040->EMAPA:36032  element count  2
         EMAPA:16040->EMAPA:36045  element count  2
     Edge property 'subject' has a value 'EMAPA:16041' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16041->EMAPA:16039  element count  3
         EMAPA:16041->UBERON:0000087  element count  3
         EMAPA:16041->EMAPA:36035  element count  3
     Edge property 'object' has a value 'EMAPA:16039' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16041->EMAPA:16039  element count  5506
         EMAPA:16050->EMAPA:16039  element count  5506
         EMAPA:16060->EMAPA:16039  element count  5506
         EMAPA:16062->EMAPA:16039  element count  5506
         EMAPA:16069->EMAPA:16039  element count  5506
         EMAPA:16071->EMAPA:16039  element count  5506
         ...
     Edge property 'object' has a value 'EMAPA:36035' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16041->EMAPA:36035  element count  306
         EMAPA:16044->EMAPA:36035  element count  306
         EMAPA:16046->EMAPA:36035  element count  306
         EMAPA:16050->EMAPA:36035  element count  306
         EMAPA:16051->EMAPA:36035  element count  306
         EMAPA:16053->EMAPA:36035  element count  306
         ...
     Edge property 'subject' has a value 'EMAPA:16042' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16042->UBERON:0016887  element count  2
         EMAPA:16042->EMAPA:36119  element count  2
     Edge property 'object' has a value 'EMAPA:36119' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16042->EMAPA:36119  element count  14
         EMAPA:16050->EMAPA:36119  element count  14
         EMAPA:16051->EMAPA:36119  element count  14
         MGI:96787->EMAPA:36119  element count  14
         MGI:96788->EMAPA:36119  element count  14
         MGI:96560->EMAPA:36119  element count  14
         ...
     Edge property 'subject' has a value 'EMAPA:16043' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16043->EMAPA:36037  element count  1
     Edge property 'object' has a value 'EMAPA:36037' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16043->EMAPA:36037  element count  4
         EMAPA:16054->EMAPA:36037  element count  4
         EMAPA:16060->EMAPA:36037  element count  4
         EMAPA:36038->EMAPA:36037  element count  4
     Edge property 'subject' has a value 'EMAPA:16044' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16044->EMAPA:16043  element count  3
         EMAPA:16044->UBERON:0000090  element count  3
         EMAPA:16044->EMAPA:36035  element count  3
     Edge property 'object' has a value 'EMAPA:16043' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16044->EMAPA:16043  element count  1
     Edge property 'subject' has a value 'EMAPA:16046' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16046->UBERON:0004345  element count  3
         EMAPA:16046->EMAPA:16042  element count  3
         EMAPA:16046->EMAPA:36035  element count  3
     Edge property 'object' has a value 'EMAPA:16042' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16046->EMAPA:16042  element count  860
         EMAPA:16052->EMAPA:16042  element count  860
         EMAPA:16054->EMAPA:16042  element count  860
         EMAPA:16065->EMAPA:16042  element count  860
         EMAPA:16076->EMAPA:16042  element count  860
         EMAPA:16085->EMAPA:16042  element count  860
         ...
     Edge property 'subject' has a value 'EMAPA:16047' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16047->UBERON:0006265  element count  2
         EMAPA:16047->EMAPA:16046  element count  2
     Edge property 'object' has a value 'EMAPA:16046' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16047->EMAPA:16046  element count  358
         EMAPA:16048->EMAPA:16046  element count  358
         EMAPA:31876->EMAPA:16046  element count  358
         EMAPA:31877->EMAPA:16046  element count  358
         EMAPA:31878->EMAPA:16046  element count  358
         EMAPA:38205->EMAPA:16046  element count  358
         ...
     Edge property 'subject' has a value 'EMAPA:16048' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16048->UBERON:0006280  element count  2
         EMAPA:16048->EMAPA:16046  element count  2
     Edge property 'subject' has a value 'EMAPA:16050' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16050->EMAPA:16039  element count  4
         EMAPA:16050->UBERON:0008780  element count  4
         EMAPA:16050->EMAPA:36035  element count  4
         EMAPA:16050->EMAPA:36119  element count  4
     Edge property 'subject' has a value 'EMAPA:16051' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16051->UBERON:0008776  element count  4
         EMAPA:16051->EMAPA:35986  element count  4
         EMAPA:16051->EMAPA:36035  element count  4
         EMAPA:16051->EMAPA:36119  element count  4
     Edge property 'object' has a value 'EMAPA:35986' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16051->EMAPA:35986  element count  18
         EMAPA:16052->EMAPA:35986  element count  18
         EMAPA:16056->EMAPA:35986  element count  18
         EMAPA:16062->EMAPA:35986  element count  18
         MGI:107191->EMAPA:35986  element count  18
         MGI:1352738->EMAPA:35986  element count  18
         ...
     Edge property 'subject' has a value 'EMAPA:16052' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16052->EMAPA:35986  element count  3
         EMAPA:16052->UBERON:0008945  element count  3
         EMAPA:16052->EMAPA:16042  element count  3
     Edge property 'subject' has a value 'EMAPA:16053' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16053->UBERON:0008800  element count  3
         EMAPA:16053->EMAPA:16052  element count  3
         EMAPA:16053->EMAPA:36035  element count  3
     Edge property 'object' has a value 'EMAPA:16052' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16053->EMAPA:16052  element count  51
         EMAPA:16058->EMAPA:16052  element count  51
         EMAPA:16086->EMAPA:16052  element count  51
         MGI:95713->EMAPA:16052  element count  51
         MGI:103577->EMAPA:16052  element count  51
         MGI:103580->EMAPA:16052  element count  51
         ...
     Edge property 'subject' has a value 'EMAPA:16054' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16054->EMAPA:36037  element count  3
         EMAPA:16054->UBERON:0012466  element count  3
         EMAPA:16054->EMAPA:16042  element count  3
     Edge property 'subject' has a value 'EMAPA:16055' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16055->EMAPA:16054  element count  2
         EMAPA:16055->UBERON:0005251  element count  2
     Edge property 'object' has a value 'EMAPA:16054' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16055->EMAPA:16054  element count  9
         EMAPA:16064->EMAPA:16054  element count  9
         EMAPA:16079->EMAPA:16054  element count  9
         EMAPA:16080->EMAPA:16054  element count  9
         EMAPA:16081->EMAPA:16054  element count  9
         EMAPA:16082->EMAPA:16054  element count  9
         ...
     Edge property 'subject' has a value 'EMAPA:16056' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16056->UBERON:0003064  element count  2
         EMAPA:16056->EMAPA:35986  element count  2
     Edge property 'subject' has a value 'EMAPA:16057' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16057->EMAPA:38030  element count  3
         EMAPA:16057->UBERON:0004369  element count  3
         EMAPA:16057->EMAPA:16053  element count  3
     Edge property 'object' has a value 'EMAPA:38030' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16057->EMAPA:38030  element count  25
         EMAPA:38031->EMAPA:38030  element count  25
         MGI:1328314->EMAPA:38030  element count  25
         MGI:97527->EMAPA:38030  element count  25
         MGI:97530->EMAPA:38030  element count  25
         MGI:97342->EMAPA:38030  element count  25
         ...
     Edge property 'object' has a value 'EMAPA:16053' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16057->EMAPA:16053  element count  60
         EMAPA:35928->EMAPA:16053  element count  60
         MGI:97073->EMAPA:16053  element count  60
         MGI:104328->EMAPA:16053  element count  60
         MGI:95517->EMAPA:16053  element count  60
         MGI:95586->EMAPA:16053  element count  60
         ...
     Edge property 'subject' has a value 'EMAPA:16058' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16058->UBERON:0004877  element count  2
         EMAPA:16058->EMAPA:16052  element count  2
     Edge property 'subject' has a value 'EMAPA:16059' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16059->UBERON:0004364  element count  2
         EMAPA:16059->EMAPA:16048  element count  2
     Edge property 'object' has a value 'EMAPA:16048' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16059->EMAPA:16048  element count  33
         MGI:88180->EMAPA:16048  element count  33
         MGI:88105->EMAPA:16048  element count  33
         MGI:88108->EMAPA:16048  element count  33
         MGI:1201683->EMAPA:16048  element count  33
         MGI:97845->EMAPA:16048  element count  33
         ...
     Edge property 'subject' has a value 'EMAPA:16060' with a CURIE prefix 'EMAPA' that is not represented in Biolink Model JSON-LD context
         EMAPA:16060->EMAPA:36037  element count  3
         EMAPA:16060->UBERON:0005906  element count  3
         EMAPA:16060->EMAPA:16039  element count  3
     ...
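
Nearly every line above is the same complaint (prefix 'EMAPA' missing from the Biolink Model JSON-LD context), repeated once per value. One way to keep that from hiding other prefixes is to collapse the messages by (property, prefix) before printing. The following is a minimal sketch, not something kgx provides: `summarize_by_prefix` and `PREFIX_MSG` are hypothetical names, the regex assumes the message wording shown in the output above, and `error_dict` is assumed to have the same shape as in the earlier loop (message mapped to a collection of offending elements).

```python
import re
from collections import defaultdict

# Assumption: messages follow the wording seen in the output above, e.g.
# "Edge property 'object' has a value 'EMAPA:16039' with a CURIE prefix 'EMAPA' ..."
PREFIX_MSG = re.compile(
    r"^(?P<kind>\w+) property '(?P<prop>[^']+)' has a value '[^']*' "
    r"with a CURIE prefix '(?P<prefix>[^']+)'"
)


def summarize_by_prefix(error_dict):
    """Collapse per-value messages into one entry per (kind, property, prefix).

    error_dict maps a message string to a collection of offending elements.
    Messages that do not match the pattern are kept under ("other", "", message).
    """
    summary = defaultdict(lambda: {"messages": 0, "elements": 0})
    for message, elements in error_dict.items():
        match = PREFIX_MSG.match(message)
        if match:
            key = match.group("kind", "prop", "prefix")
        else:
            key = ("other", "", message)
        summary[key]["messages"] += 1
        summary[key]["elements"] += len(elements)
    return dict(summary)


if __name__ == "__main__":
    # Tiny hand-made sample in the assumed shape, not real validator output.
    tail = " that is not represented in Biolink Model JSON-LD context"
    sample = {
        "Edge property 'object' has a value 'EMAPA:16039' with a CURIE prefix 'EMAPA'" + tail: [
            "EMAPA:16041->EMAPA:16039",
            "EMAPA:16050->EMAPA:16039",
        ],
        "Edge property 'subject' has a value 'EMAPA:16042' with a CURIE prefix 'EMAPA'" + tail: [
            "EMAPA:16042->EMAPA:36119",
        ],
    }
    for (kind, prop, prefix), counts in summarize_by_prefix(sample).items():
        print(f"{kind} {prop} prefix {prefix}: "
              f"{counts['messages']} distinct values, {counts['elements']} elements")
```

With this, each offending prefix would take up one line per property instead of one block per value, so a second or third unknown prefix should show up in the same run rather than after another round trip.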
monicacecilia commented 2 months ago

We'd like to get additional insight from @amc-corey-cox & @cmungall during an upcoming Data Call. @sagehrke 👀 👆