NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

ARAGORN TRAPI Warnings #811

Open sstemann opened 3 months ago

sstemann commented 3 months ago

This PK shows some warnings for ARAGORN: be8dbb09-25da-4592-b00c-289922a5bf2c

https://arax.ncats.io/?r=d6a12851-c7da-443b-a333-15d816f3a8af

https://arax.ci.transltr.io/?r=be8dbb09-25da-4592-b00c-289922a5bf2c

I can still view the results in the ARAX GUI

image

cbizon commented 3 months ago

ARAGORN is the ARA, but a lot of these are coming from upstream sources. I'll open new issues here and link them to this issue.

cbizon commented 3 months ago

For this warning:

=> Edge has an attribute_type_id that has a non-Biolink CURIE prefix mapped to Biolink

I don't think we want to do anything. The TRAPI docs say this is OK as long as there's no good replacement in Biolink. The two attributes triggering the warning are dct:description and bts:sentence, neither of which has a Biolink replacement, so there is no action to take.

I do think it's worth getting this warning, since it forces a review of whether the attribute types are correct, but in this case I think that they are.
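For anyone who wants to review these by hand, here is a minimal sketch of how the non-Biolink attribute types on an edge could be listed. It assumes the edge is a plain dict in the TRAPI shape shown in this thread; the function name and the example edge are hypothetical, and this is not how the validator itself works.

```python
def non_biolink_attribute_types(edge):
    """Return the sorted, de-duplicated attribute_type_id values whose CURIE prefix is not 'biolink'."""
    found = set()
    # "attributes" may be missing or null on a TRAPI edge, so fall back to an empty list.
    for attribute in edge.get("attributes") or []:
        type_id = attribute.get("attribute_type_id", "")
        prefix = type_id.split(":", 1)[0]
        if prefix != "biolink":
            found.add(type_id)
    return sorted(found)


# Hypothetical edge for illustration only; the values are placeholders.
example_edge = {
    "attributes": [
        {"attribute_type_id": "biolink:publications", "value": ["PMID:15095704"]},
        {"attribute_type_id": "bts:sentence", "value": {}},
        {"attribute_type_id": "dct:description", "value": "free text"},
    ]
}

print(non_biolink_attribute_types(example_edge))
# ['bts:sentence', 'dct:description']
```

Each reported type still needs a human decision about whether Biolink has a suitable replacement.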

cbizon commented 3 months ago

For this warning:

=> Edge has an 'attribute_type_id' that is a category
            $ infores:ctd -> infores:automat-ctd -> infores:aragorn
                # biolink:Attribute

I think that this is saying that we have an edge where the attribute_type_id is biolink:Attribute, but it should be something more specific.

@EvanDietzMorris can you take a look?
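A rough way to find the affected attributes, as a sketch only: Biolink class (category) names are CamelCase while slot names are snake_case, so an `attribute_type_id` whose local part starts with an uppercase letter is probably a category. That capitalization heuristic is an assumption for illustration, not the validator's actual check, and both function names are hypothetical.

```python
def looks_like_category(curie):
    """Heuristic: a biolink CURIE whose local name starts with an uppercase letter."""
    prefix, _, local = curie.partition(":")
    return prefix == "biolink" and local[:1].isupper()


def category_typed_attributes(edge):
    """Return attribute_type_id values on an edge that look like Biolink categories."""
    return [
        attribute["attribute_type_id"]
        for attribute in edge.get("attributes") or []
        if looks_like_category(attribute.get("attribute_type_id", ""))
    ]


# Hypothetical edge for illustration only.
example_edge = {
    "attributes": [
        {"attribute_type_id": "biolink:Attribute", "value": "placeholder"},
        {"attribute_type_id": "biolink:publications", "value": ["PMID:15095704"]},
    ]
}

print(category_typed_attributes(example_edge))
# ['biolink:Attribute']
```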

sstemann commented 3 months ago

I'm adding another example: it is MVP1 for Bethlem Myopathy (run 7/5, not the cached one)

https://ui.test.transltr.io/main/results?l=Bethlem%20Myopathy&i=MONDO:0008029&t=0&r=0&q=a04bff0e-1b72-494c-ae67-6f1fbc7aa66e

image

colleenXu commented 3 months ago

Note (similar to https://github.com/NCATSTranslator/Feedback/issues/814#issuecomment-2216896382 and https://github.com/NCATSTranslator/Feedback/issues/815#issuecomment-2216905247):

I don't think bts:sentence is coming from Service Provider/BioThings SemmedDB. Service Provider doesn't create this edge-attribute for anything in BioThings SemmedDB - so it shouldn't be coming from this tool.

Instead, I think it's coming from RTX-KG2 alone.

I reviewed two edges in Aragorn from the first link. Both edges have the bts:sentence edge-attribute plus the "biothings-semmeddb" source (see collapsed sections below). In both cases, it looks like something (Aragorn?) merged edges from RTX-KG2 and Service Provider into one edge, so the edge-attributes and sources are a mix of both KPs' data. For example, I can see two sets of publication edge-attributes and two lines of provenance in the sources.

Oddly, for both edges aragorn only lists 1 upstream resource, rather than listing both KPs...

Edge `2660b095590a`

```
{
  "attributes": [
    { "attribute_type_id": "biolink:agent_type", "value": "text_mining_agent" },
    {
      "attribute_type_id": "biolink:publications",
      "value": [ "PMID:15095704", "PMID:18085095", "PMID:8678396", "PMID:19293073", "PMID:21592450" ],
      "value_type_id": "linkml:Uriorcurie"
    },
    {
      "attribute_type_id": "biolink:original_predicate",
      "description": "The IDs of the original RTX-KG2pre edge(s) corresponding to this edge prior to any synonymization or remapping.",
      "value": [
        "UMLS:C0004057---SEMMEDDB:augments---biolink:causes---activity_or_abundance---increased---UMLS:C2937358---SEMMEDDB:",
        "UMLS:C0004057---SEMMEDDB:augments---biolink:causes---activity_or_abundance---increased---UMLS:C0553692---SEMMEDDB:"
      ],
      "value_type_id": "metatype:String"
    },
    {
      "attribute_source": "infores:semmeddb",
      "attribute_type_id": "bts:sentence",
      "value": {
        "PMID:15095704": { "object score": "1000", "publication date": "2003 Mar", "sentence": "With aspirin there is a 1.5 fold increase of hemorrhagic stroke and a 2 fold increase of gastrointestinal hemorrhage.", "subject score": "1000" },
        "PMID:18085095": { "object score": "1000", "publication date": "2007 Oct", "sentence": "Aspirin use slightly increases rates of gastrointestinal bleeding and hemorrhagic stroke.", "subject score": "1000" },
        "PMID:19293073": { "object score": "983", "publication date": "2009 Mar 17", "sentence": "Does aspirin increase gastrointestinal bleeding or hemorrhagic strokes?", "subject score": "1000" },
        "PMID:21592450": { "object score": "1000", "publication date": "2011 Jul", "sentence": "CONCLUSION: Aspirin prevents deaths, myocardial infarction, and ischemic stroke, and increases hemorrhagic stroke and major bleeding when used in the primary prevention of cardiovascular disease.", "subject score": "1000" },
        "PMID:35121011": { "object score": "873", "publication date": "2022 Feb 01", "sentence": "On one hand, ASA and CLOP single treatments increase the post-TBI ICH risk, with a further detrimental effect from the ASA + CLOP treatment.", "subject score": "1000" },
        "PMID:8678396": { "object score": "1000", "publication date": "1996 Aug 15", "sentence": "Although warfarin is more effective than aspirin in preventing embolic strokes in patients older than 75 years of age, it may increase the incidence of hemorrhagic stroke and result in a similar rate of disabling stroke.", "subject score": "1000" }
      }
    },
    {
      "attribute_source": "infores:semmeddb",
      "attribute_type_id": "biolink:publications",
      "value": [ "PMID:35121011", "PMID:15095704", "PMID:21592450", "PMID:8678396", "PMID:19293073", "PMID:18085095" ],
      "value_type_id": "biolink:Uriorcurie"
    },
    { "attribute_type_id": "biolink:knowledge_level", "value": "not_provided" },
    {
      "attribute_type_id": "biolink:supporting_text",
      "value": [
        "With aspirin there is a 1.5 fold increase of hemorrhagic stroke and a 2 fold increase of gastrointestinal hemorrhage.",
        "Aspirin use slightly increases rates of gastrointestinal bleeding and hemorrhagic stroke.",
        "Although warfarin is more effective than aspirin in preventing embolic strokes in patients older than 75 years of age, it may increase the incidence of hemorrhagic stroke and result in a similar rate of disabling stroke.",
        "Does aspirin increase gastrointestinal bleeding or hemorrhagic strokes?",
        "CONCLUSION: Aspirin prevents deaths, myocardial infarction, and ischemic stroke, and increases hemorrhagic stroke and major bleeding when used in the primary prevention of cardiovascular disease."
      ]
    }
  ],
  "object": "MONDO:0013792",
  "predicate": "biolink:affects",
  "qualifiers": [
    { "qualifier_type_id": "biolink:object_aspect_qualifier", "qualifier_value": "activity_or_abundance" },
    { "qualifier_type_id": "biolink:object_direction_qualifier", "qualifier_value": "increased" },
    { "qualifier_type_id": "biolink:qualified_predicate", "qualifier_value": "biolink:causes" }
  ],
  "sources": [
    { "resource_id": "infores:semmeddb", "resource_role": "primary_knowledge_source", "upstream_resource_ids": [] },
    { "resource_id": "infores:service-provider-trapi", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:biothings-semmeddb" ] },
    { "resource_id": "infores:rtx-kg2", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:semmeddb" ] },
    { "resource_id": "infores:biothings-semmeddb", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:semmeddb" ] },
    { "resource_id": "infores:aragorn", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:service-provider-trapi" ] }
  ],
  "subject": "CHEBI:15365"
}
```

Edge `0f685903de2f`

```
{
  "attributes": [
    {
      "attribute_type_id": "biolink:original_predicate",
      "description": "The IDs of the original RTX-KG2pre edge(s) corresponding to this edge prior to any synonymization or remapping.",
      "value": [
        "UMLS:C0600508---SEMMEDDB:part_of---None---None---None---UMLS:C0041014---SEMMEDDB:"
      ],
      "value_type_id": "metatype:String"
    },
    { "attribute_type_id": "biolink:agent_type", "value": "text_mining_agent" },
    {
      "attribute_source": "infores:semmeddb",
      "attribute_type_id": "biolink:publications",
      "value": [ "PMID:9100577", "PMID:8663263", "PMID:8900406", "PMID:11907029", "PMID:18952156", "PMID:10524577" ],
      "value_type_id": "biolink:Uriorcurie"
    },
    {
      "attribute_type_id": "biolink:supporting_text",
      "value": [
        "The chicken malic enzyme gene: structural organization and identification of triiodothyronine response elements in the 5'-flanking DNA.",
        "In hepatocytes transiently transfected with plasmids containing triiodothyronine response elements and a minimal promoter from the malic enzyme gene linked to the chloramphenicol acetyltransferase gene, deletion of the PPY/PPU tract inhibited chloramphenicol acetyltransferase activity by about 90% with or without triiodothyronine.",
        "Homodimer binding of TR beta-EZ to DR4- and F2-T3 response elements (TREs) was weaker, and to a palindromic TRE (PAL) was stronger than that of wild-type TR beta (TR beta-WT) in the absence of T3.",
        "This difference in T3 responsiveness was also observed when T3-responsive reporters consisting of the luciferase gene under the control of triiodothyronine response element (TRE) were introduced into hepatocytes using a replication-defective adenovirus vector.",
        "These results suggest that the factors required for T3-dependent transcriptional activation are preserved in spheroid cultures and that they must exert their effect by interacting with TRE.",
        "In previous work, we characterized a 3,5,3'-triiodothyronine response element (T3RE) in acetyl-CoA carboxylase-alpha (ACCalpha) promoter 2 that mediated 3,5,3'-triiodothyronine (T3) regulation of ACCalpha transcription in chick embryo hepatocytes.",
        "Sequence comparison analysis revealed the presence of sterol regulatory element-1 (SRE-1) located 5 bp downstream of the ACCalpha T3RE.",
        "The effect of the SRE-1 on T3 responsiveness required the presence of the T3RE in its native orientation.",
        "In gel mobility shift experiments, TRalpha, retinoid X receptor-alpha, and mature SREBP-1 formed a tetrameric complex on a DNA probe containing the ACCalpha T3RE and SRE-1, and the presence of T3 enhanced the formation of this complex.",
        "This region contains a putative triiodothyronine response element (T3RE) that differs from the human ME1 T3RE by two nucleotides.",
        "When the human ME1 T3RE was introduced into the ovine ME1 promoter context, transcriptional activity was increased in the hepatic cell lines HepG2 and H4IIE but not in differentiated 3T3-L1 cells.",
        "Our results suggest that the sequence of the T3RE in the ME1 promoter determines differences in the tissue/species activity of malic enzyme in ruminants and human."
      ]
    },
    {
      "attribute_source": "infores:semmeddb",
      "attribute_type_id": "bts:sentence",
      "value": {
        "PMID:10524577": { "object score": "901", "publication date": "1999 Sep", "sentence": "These results suggest that the factors required for T3-dependent transcriptional activation are preserved in spheroid cultures and that they must exert their effect by interacting with TRE.", "subject score": "901" },
        "PMID:11907029": { "object score": "824", "publication date": "2002 May 31", "sentence": "In gel mobility shift experiments, TRalpha, retinoid X receptor-alpha, and mature SREBP-1 formed a tetrameric complex on a DNA probe containing the ACCalpha T3RE and SRE-1, and the presence of T3 enhanced the formation of this complex.", "subject score": "824" },
        "PMID:18952156": { "object score": "901", "publication date": "2009 Jan 01", "sentence": "Our results suggest that the sequence of the T3RE in the ME1 promoter determines differences in the tissue/species activity of malic enzyme in ruminants and human.", "subject score": "901" },
        "PMID:8663263": { "object score": "901", "publication date": "1996 Jul 05", "sentence": "In hepatocytes transiently transfected with plasmids containing triiodothyronine response elements and a minimal promoter from the malic enzyme gene linked to the chloramphenicol acetyltransferase gene, deletion of the PPY/PPU tract inhibited chloramphenicol acetyltransferase activity by about 90% with or without triiodothyronine.", "subject score": "901" },
        "PMID:8900406": { "object score": "901", "publication date": "1996 Oct 15", "sentence": "The chicken malic enzyme gene: structural organization and identification of triiodothyronine response elements in the 5'-flanking DNA.", "subject score": "901" },
        "PMID:9100577": { "object score": "824", "publication date": "1997 Apr", "sentence": "Homodimer binding of TR beta-EZ to DR4- and F2-T3 response elements (TREs) was weaker, and to a palindromic TRE (PAL) was stronger than that of wild-type TR beta (TR beta-WT) in the absence of T3.", "subject score": "824" }
      }
    },
    { "attribute_type_id": "biolink:knowledge_level", "value": "not_provided" },
    {
      "attribute_type_id": "biolink:publications",
      "value": [ "PMID:8900406", "PMID:8663263", "PMID:9100577", "PMID:10524577", "PMID:11907029", "PMID:18952156" ],
      "value_type_id": "linkml:Uriorcurie"
    }
  ],
  "object": "MESH:D020218",
  "predicate": "biolink:has_part",
  "sources": [
    { "resource_id": "infores:semmeddb", "resource_role": "primary_knowledge_source", "upstream_resource_ids": [] },
    { "resource_id": "infores:service-provider-trapi", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:biothings-semmeddb" ] },
    { "resource_id": "infores:rtx-kg2", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:semmeddb" ] },
    { "resource_id": "infores:biothings-semmeddb", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:semmeddb" ] },
    { "resource_id": "infores:aragorn", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": [ "infores:rtx-kg2" ] }
  ],
  "subject": "CHEBI:18258"
}
```
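The missing-upstream observation can also be checked mechanically. Below is a minimal sketch, assuming the `sources` list has the shape shown in the two edges above: it reports any source that no other source names in `upstream_resource_ids`, other than the final aggregator itself. The function name is hypothetical, and the data is copied from the `sources` of edge `2660b095590a`.

```python
def unreferenced_sources(sources, final_resource):
    """Return resource_ids that no source lists as upstream, excluding the final aggregator."""
    referenced = {
        upstream
        for source in sources
        for upstream in source.get("upstream_resource_ids") or []
    }
    return [
        source["resource_id"]
        for source in sources
        if source["resource_id"] != final_resource
        and source["resource_id"] not in referenced
    ]


# "sources" copied from edge 2660b095590a above.
sources_2660 = [
    {"resource_id": "infores:semmeddb", "resource_role": "primary_knowledge_source", "upstream_resource_ids": []},
    {"resource_id": "infores:service-provider-trapi", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": ["infores:biothings-semmeddb"]},
    {"resource_id": "infores:rtx-kg2", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": ["infores:semmeddb"]},
    {"resource_id": "infores:biothings-semmeddb", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": ["infores:semmeddb"]},
    {"resource_id": "infores:aragorn", "resource_role": "aggregator_knowledge_source", "upstream_resource_ids": ["infores:service-provider-trapi"]},
]

print(unreferenced_sources(sources_2660, "infores:aragorn"))
# ['infores:rtx-kg2']
```

For this edge the check flags infores:rtx-kg2, which contributes provenance but is not listed as upstream by infores:aragorn.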

cbizon commented 3 months ago

Thank you @colleenXu, that's very helpful.

cbizon commented 3 months ago

Labeled fugu but only for the direct aragorn issues.

sstemann commented 2 months ago

I see all the same warnings in Fugu/Test

image

https://ui.test.transltr.io/main/results?l=DDX3Y%20(Human)&i=NCBIGene:8653&t=2&r=0&q=bcdb73ed-4005-41d8-9bb9-6b3fc1699f16

https://arax.ncats.io/?r=bcdb73ed-4005-41d8-9bb9-6b3fc1699f16

Not sure if there was supposed to be a change for Fugu?

sstemann commented 2 months ago

@cbizon After the cache refresh, I'm still seeing the same errors, warnings, and validation info, plus a new error for agent_type

https://ui.test.transltr.io/main/results?l=DDX3Y%20(Human)&i=NCBIGene:8653&t=2&r=0&q=ffc21bae-748f-4534-a6b1-ac914b9f59bb

https://arax.ci.transltr.io/?r=ffc21bae-748f-4534-a6b1-ac914b9f59bb

image

cbizon commented 2 months ago

Two of these errors are likely a reasoner-pydantic issue (assigned above). The other is an omnicorp problem; we might need some guidance from @mbrush about what the right value should be.