Closed tokebe closed 8 months ago
@colleenXu, is there an expected behavior when an association has qualifier value arrays, but a returned edge querying that association has no qualifiers? This particular case isn't represented in the example query, so this is unfortunately a purely hypothetical question at the moment.
Err....it sounds like there's two issues happening?
1: somehow there are records/associations where a qualifier's value is an array rather than a string
2: It's not clear to me what "association has qualifier value arrays, but a returned edge querying that association has no qualifiers" means.
Is it one of these situations?
I should note that by "association", I mean the association
attribute coming from the smartapi-kg package, which are coming from our own internal representations of smartapi specs, not data that's coming from a KP.
So, what I mean is, the smartapi-kg package gives us an operation, the association
attribute of which specifies the input/output types, predicate, etc...and specifically to this case, gives us a qualifier_type_id that specifies an array of applicable values which the API might be able to respond with. In a (purely hypothetical, as-of-yet not observed case), a KP could respond with no qualifiers, despite this associate specifying the range of applicable qualifiers it should respond with. The KP would be in the wrong, but the code should probably be a little smarter about how it handles it
To give more insight, to my memory, there's no specific reasoning why the Record class should fall back to the association-defined qualifiers if the actual returned edge (from which we're building the Record) has no qualifiers
attribute. So, we could just remove that piece of code and instead drop any would-be records which don't have a qualifier when the association specifies there should be one?
As for where we're seeing an actual error (the "When reversing a record made from such an association" part), that's just a detail of Record class reversing which is totally understood and just needs some fixing, no further discussion is needed.
So....it sounds like none of the scenarios I described above are really what's going on?
So...I'm going to try again at understanding the problem and asking questions:
It now sounds like 1 association
(related to the MetaEdge concept?) can sometimes have 1 qualifier-type-id paired with an array of possible values - and that this is a core part of the problem. Is that right?
association
came from. What do you think?And can you say what KPs these associations
are coming from or give examples of those associations
? I'm imagining that these are maybe only coming from TRAPI KPs:
On "going back to the association because qualifiers aren't in the record": that's why I'm trying to figure out if this is a TRAPI KP issue or not.
And it sounds like the core problem IS NOT "KP response records having no qualifiers when we expected them" (purely hypothetical; I guess it's okay that I still don't quite get this) or "reversing for records"...
Relates to this PR which affects internal metaKG parsing. It looks like this only affects TRAPI KPs, yes.
It now sounds like 1 association (related to the MetaEdge concept?) can sometimes have 1 qualifier-type-id paired with an array of possible values - and that this is a core part of the problem. Is that right?
Yes, seemingly only for TRAPI KPs based on the code/PRs I'm looking at.
And I think that representation would make sense in the entire metaKG...regardless of where the association came from. What do you think?
That was my understanding as well...one operation, one possible qualifier set.
And can you say what KPs these
associations
are coming from or give examples of thoseassociations
? I'm imagining that these are maybe only coming from TRAPI KPs:
An association
is a purely BTE-internal abstraction generated for every single operation it parses. The problematic ones do seem to specifically be coming from TRAPI KPs, meaning either:
/meta_knowledge_graph
endpoints, ORassociation
objects for operations from TRAPI endpoints, and certain qualities of specific operations are causing this improper parsing to occur.I need to isolate which KP is associated with our actual error to try and pinpoint what it looks like when this association
is being made.
On "going back to the association because qualifiers aren't in the record": that's why I'm trying to figure out if this is a TRAPI KP issue or not.
I think this fallback code is specifically there for non-TRAPI KPs, where we insert those qualifiers because, as you said, they're not in what we're getting back from the KP. In which case, we probably won't ever see this code triggered by a TRAPI KP failing to provide qualifiers, so I suppose I don't really need to handle that case.
And it sounds like the core problem IS NOT "KP response records having no qualifiers when we expected them" (purely hypothetical; I guess it's okay that I still don't quite get this) or "reversing for records"...
What remains are two things that I have to do:
association
objects in the first place, so we can actually understand what's going on there and if anything actually needs to be done about it. @rjawesome, though I understand you're busy, if you get a free moment and happen to recall anything that could give us insight into that second point, it'd be greatly appreciated.
Further research to figure out why we were even allowing TRAPI qualifier value arrays in our internal association objects in the first place, so we can actually understand what's going on there and if anything actually needs to be done about it.
Basically, in the meta_knowledge_graph response for any TRAPI API contains an array of qualifiers. Each qualifier has the name of the qualifier and applicable values (plural). This means that the API could accept multiple values for that qualifier name, so we would need to store all of the values in the association.
I suppose if we didn't want to allow arrays then we would have to modify the code from the PR so that it splits into multiple associations for each of the applicable values for a given qualifier.
example from bte meta_knowledge_graph (currently an entry like this would be processed as one association):
"subject": "biolink:SmallMolecule",
"predicate": "biolink:associated_with",
"object": "biolink:PhenotypicFeature",
"qualifiers": [
{
"qualifier_type_id": "biolink:object_aspect_qualifier",
"applicable_values": [
"likelihood"
]
},
{
"qualifier_type_id": "biolink:object_direction_qualifier",
"applicable_values": [
"increased",
"decreased"
]
}
],
"knowledge_types": [
"lookup"
]
Thank you for the clarity @rjawesome, this makes perfect sense! @colleenXu the current code works (aside from the record reversing issue) to handle this just fine, but I'm of two minds whether it should instead be changed to make one association = one qualifier value. What do you think?
Finding some specific examples from TRAPI KPs /meta_knowledge_graph responses:
https://automat.test.transltr.io/ctd/1.4/meta_knowledge_graph Hmmm...so these edges aren't inverses of each other? ``` "subject": "biolink:SmallMolecule", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "localization", "activity", "splicing", "expression", "mutation_rate", "secretion", "molecular_modification", "transport", "synthesis", "molecular_interaction", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Gene", "predicate": "biolink:affects", "object": "biolink:SmallMolecule", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "localization", "activity", "secretion", "molecular_modification", "synthesis", "transport", "molecular_interaction", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Gene", "predicate": "biolink:affects", "object": "biolink:MolecularMixture", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity", "secretion", "transport", "synthesis", "molecular_interaction", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:MolecularMixture", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "localization", "activity", "expression", "mutation_rate", "secretion", "molecular_modification", "transport", "molecular_interaction", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Gene", "predicate": "biolink:affects", "object": "biolink:ChemicalEntity", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "localization", "activity", "secretion", "molecular_modification", "transport", "synthesis", "molecular_interaction", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:ChemicalEntity", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "localization", "activity", "splicing", "expression", "mutation_rate", "secretion", "molecular_modification", "transport", "synthesis", "molecular_interaction", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Gene", "predicate": "biolink:affects", "object": "biolink:ComplexMolecularMixture", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "metabolic_processing", "molecular_interaction", "uptake" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:ComplexMolecularMixture", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "localization", "activity", "expression", "mutation_rate", "secretion", "molecular_modification", "transport", "molecular_interaction", "metabolic_processing", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Gene", "predicate": "biolink:affects", "object": "biolink:Polypeptide", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "transport", "synthesis", "metabolic_processing", "uptake", "degradation" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Polypeptide", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity", "molecular_interaction", "degradation", "secretion" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ```
https://automat.test.transltr.io/pharos/1.4/meta_knowledge_graph ``` "subject": "biolink:MolecularMixture", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:SmallMolecule", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:ChemicalEntity", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ```
However, many of these seem outside the chem-gene/gene-gene scope of qualifiers right now or don't quite seem to make sense https://automat.test.transltr.io/hetio/1.4/meta_knowledge_graph ``` "subject": "biolink:AnatomicalEntity", "predicate": "biolink:regulates", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "upregulated", "downregulated" ] } ], ``` ``` "subject": "biolink:GrossAnatomicalStructure", "predicate": "biolink:regulates", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "upregulated", "downregulated" ] } ], ``` ``` "subject": "biolink:SmallMolecule", "predicate": "biolink:regulates", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "upregulated", "downregulated" ] } ], ``` ``` "subject": "biolink:Disease", "predicate": "biolink:regulates", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "upregulated", "downregulated" ] } ], ``` ``` "subject": "biolink:ChemicalEntity", "predicate": "biolink:regulates", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "upregulated", "downregulated" ] } ], ``` ``` "subject": "biolink:MolecularMixture", "predicate": "biolink:regulates", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "upregulated", "downregulated" ] } ], ```
https://automat.test.transltr.io/drugcentral/1.4/meta_knowledge_graph ``` "subject": "biolink:SmallMolecule", "predicate": "biolink:affects", "object": "biolink:Protein", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:ChemicalEntity", "predicate": "biolink:affects", "object": "biolink:Protein", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:Protein", "predicate": "biolink:affects", "object": "biolink:Protein", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:MolecularMixture", "predicate": "biolink:affects", "object": "biolink:Protein", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ```
https://automat.test.transltr.io/gtex/1.4/meta_knowledge_graph ``` "subject": "biolink:SequenceVariant", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "splicing", "expression" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ```
https://automat.test.transltr.io/gtopdb/1.4/meta_knowledge_graph ``` "subject": "biolink:SmallMolecule", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:ChemicalEntity", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ``` ``` "subject": "biolink:MolecularMixture", "predicate": "biolink:affects", "object": "biolink:Gene", "qualifiers": [ { "qualifier_type_id": "object_aspect_qualifier", "applicable_values": [ "activity" ] }, { "qualifier_type_id": "object_direction_qualifier", "applicable_values": [ "decreased", "increased" ] }, { "qualifier_type_id": "qualified_predicate", "applicable_values": [ "biolink:causes" ] } ], ```
From meetings: Moving forward with changing code to ensure only one value per internal association object.
From what I recall (@tokebe, please comment with agreement or clarification):
A complication to building these "qualifier-sets" is what to do about the potential absence of qualifiers or specific type-ids. For example, what if a TRAPI KP:
@tokebe Linked to this PR https://github.com/biothings/api-respone-transform.js/pull/60? Just in case, I put both this issue and the PR into the Project manager/deployment section...
Relevant changes deployed to Prod.
In some cases, the association object for an edge will have qualifiers with multiple values in an array. When reversing a Record made from such an association, or when falling back to the association in cases where a returned edge is missing those qualifiers, the Record will TypeError as it tries to do string operations on the array.
Similarly, because an association can have qualifier value arrays, falling back to the association qualifiers when the returned edge is missing a qualifier can cause information problems, such as when the array has both
increased
anddecreased
.We need specific expectations for how to handle the fallback, and the code should check for
string
orstring[]
before operating on qualifier values.Reproducible with the following query: