NCATSTranslator / Tests


Synonymization/generality problems #62

Open dkoslicki opened 3 months ago

dkoslicki commented 3 months ago

For test 7 asset 79, the ARAX result has progesterone as result 2 of 103. But the asset asks for progestin (the synthetic version of progesterone, iirc).

For test 12, asset 309, pegfilgrastim is asked for, but any filgrastim or G-CSF should be accepted.

For test 13, asset 362, the goal is to get ACE inhibitors, and ARAX returns Angiotensin-converting enzyme inhibitors (i.e. the full name).

For test case 15, asset 318, Ivacaftor is asked for and ARAX returns Ivacaftor / tezacaftor and ivacaftor / lumacaftor as results 2 and 3. Do we want to do the testing post-conflation?

Test 24, asset 396: Butyrate derivatives is asked for, but butyrate and butyric acid aren't accepted as answers.

Test 29, asset 504, thrombin is asked for, but alpha-thrombin is not accepted as an answer (similarly for test 30, asset 595)

Test 35 asset 665, KRASG12D inhibitor MRTX1133 is asked for, but MRTX1133 (CHEMBL.COMPOUND:CHEMBL5081048) is not accepted

dkoslicki commented 3 months ago

Similarly Test 9 asset 623, Rabeprazole is asked for, but Rabeprazole sodium is not accepted as correct. Again with the post/pre-conflation issue
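To make the pre/post-conflation question concrete, here is a minimal sketch of the two pass criteria side by side. Everything in it is an assumption for illustration: the `CLIQUE_LEADER` table, the `EXAMPLE:` CURIEs, and the function names are hypothetical, and a real harness would presumably get the mapping from NodeNorm with conflation turned on rather than from a hard-coded dict.

```python
# Hypothetical conflation table: each CURIE maps to the preferred ID of its
# conflated clique. The EXAMPLE: CURIEs are placeholders, not real identifiers.
CLIQUE_LEADER = {
    "EXAMPLE:rabeprazole": "EXAMPLE:rabeprazole",
    "EXAMPLE:rabeprazole_sodium": "EXAMPLE:rabeprazole",
    "EXAMPLE:omeprazole": "EXAMPLE:omeprazole",
}


def conflated(curie: str) -> str:
    """Return the clique leader for a CURIE, or the CURIE itself if unknown."""
    return CLIQUE_LEADER.get(curie, curie)


def passes_pre_conflation(expected: str, returned: list[str]) -> bool:
    """True only on an exact ID match, i.e. no conflation applied."""
    return expected in returned


def passes_post_conflation(expected: str, returned: list[str]) -> bool:
    """True if any returned ID lands in the same conflated clique as expected."""
    target = conflated(expected)
    return any(conflated(curie) == target for curie in returned)
```

With this toy table, a result list containing only the sodium salt fails the exact-match check but passes the post-conflation check, which is the behavior being asked about for the rabeprazole and ivacaftor assets.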

colleenXu commented 3 months ago

Note this issue for Asset 362 / ACE inhibitors: https://github.com/NCATSTranslator/Tests/issues/76.

There's a little more going on with this test. I also link it to the "too general" issue you opened.

colleenXu commented 3 months ago

We noted the same issues for some tests:

colleenXu commented 3 months ago

Here's some more tests with the same kind of problem (I'm closing their separate issues and moving the info here):

asset 32 Lactase `CHEMBL.COMPOUND:CHEMBL2108505`

([old issue](https://github.com/NCATSTranslator/Tests/issues/72))

* the NodeNorm entry for this ID seems odd: it has no other equivalent IDs ([Dev NodeNorm](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=CHEMBL.COMPOUND:CHEMBL2108505&conflate=true&drug_chemical_conflate=true&description=false)).
* There are other lactase entities with more equivalent IDs that show up in BTE's results with higher scores/ranks: [MESH:D043322](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=MESH:D043322&conflate=true&drug_chemical_conflate=true&description=false) (ChemicalEntity) and [UMLS:C0083183](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=UMLS:C0083183&conflate=true&drug_chemical_conflate=true&description=false) (Protein)
* I've [noted before](https://github.com/NCATSTranslator/Tests/issues/32) that tools tended to fail this test. I haven't done this comparison lately though

asset 355 fosinopril `PUBCHEM.COMPOUND:55891`

[PUBCHEM.COMPOUND:55891](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=PUBCHEM.COMPOUND:55891&conflate=true&drug_chemical_conflate=true&description=false) ([old issue](https://github.com/NCATSTranslator/Tests/issues/75))

* there's a different NodeNorm entity for FOSINOPRIL SODIUM [(PUBCHEM.COMPOUND:73415812)](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=PUBCHEM.COMPOUND:23666105&conflate=true&drug_chemical_conflate=true&description=false) with more IDs mapped to it. BTE gives this sodium-variant a higher score/rank (although it's still not in the top 10%).
* very similar to the [rabeprazole example above](https://github.com/NCATSTranslator/Tests/issues/62#issuecomment-2195095652)

asset 397 hemin `PUBCHEM.COMPOUND:26945`

[hemin](https://en.wikipedia.org/wiki/Hemin) ([old issue](https://github.com/NCATSTranslator/Tests/issues/81))

* the [NodeNorm](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=PUBCHEM.COMPOUND:26945&conflate=true&drug_chemical_conflate=true&description=false) labels for this entity's IDs are "protoheme", not hemin
* Instead, I see "hemin" in the NodeNorm conflated-ID labels for hematin [PUBCHEM.COMPOUND:455658](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=PUBCHEM.COMPOUND:455658&conflate=true&drug_chemical_conflate=true&description=false). BTE consistently has hematin in its result set (often too low for a TopAnswer, but still).
* It's not clear to me if the conflation of hematin and hemin in NodeNorm is okay or not. [Wikipedia](https://en.wikipedia.org/wiki/Hemin) notes the difference in ions and also has hemin IDs that map to other NodeNorm entities
* I think all tools have been consistently failing this test, which may be in part due to the answer ID choice: https://github.com/NCATSTranslator/Tests/issues/69

asset 595 thrombin/F2 `UMLS:C0040018`

([old issue](https://github.com/NCATSTranslator/Tests/issues/85))

* While [UMLS:C0040018](https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=UMLS:C0040018&conflate=true&drug_chemical_conflate=true&description=false) is a protein ID, it doesn't have any equivalent IDs.
* Instead, I wonder about using the same thrombin IDs that I proposed in https://github.com/NCATSTranslator/Tests/issues/83 : UniProtKB:P00734 or NCBIGene:2147 (primary since we're doing gene-protein conflation). BTE has thrombin/F2 as a top result in its result sets using those IDs.
* See https://github.com/NCATSTranslator/Tests/issues/84#issuecomment-2196283625 and https://github.com/NCATSTranslator/Tests/issues/62 for notes on different thrombin NodeNorm entities.
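The NodeNorm lookups linked in this comment can be reproduced with a small helper. The sketch below only builds the same URLs (base URL and query parameters copied from the links above) and counts equivalent IDs in an already-parsed response. The function names are made up for illustration, and the response shape used by `equivalent_id_count` (a JSON object keyed by CURIE, each entry holding an `equivalent_identifiers` list) is an assumption that should be checked against the live service.

```python
from urllib.parse import urlencode

# Base URL and query parameters copied from the NodeNorm links in this comment.
NODENORM_URL = "https://nodenormalization-sri.renci.org/get_normalized_nodes"


def nodenorm_url(curie: str, conflate: bool = True, drug_chemical: bool = True) -> str:
    """Build a get_normalized_nodes lookup URL for one CURIE."""
    params = {
        "curie": curie,
        "conflate": str(conflate).lower(),
        "drug_chemical_conflate": str(drug_chemical).lower(),
        "description": "false",
    }
    # safe=":" keeps the CURIE colon unescaped, as in the links above.
    return f"{NODENORM_URL}?{urlencode(params, safe=':')}"


def equivalent_id_count(response: dict, curie: str) -> int:
    """Count the equivalent IDs listed for a CURIE in a parsed JSON response.

    ASSUMPTION: the response is keyed by CURIE, each entry holds an
    "equivalent_identifiers" list (which may include the queried ID itself),
    and an unrecognized CURIE maps to null/None.
    """
    entry = response.get(curie) or {}
    return len(entry.get("equivalent_identifiers", []))
```

Running such a count over every asset's expected ID would be one way to flag sparsely mapped answer IDs like the lactase and thrombin ones before a sprint starts.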

maximusunc commented 3 months ago

So I think this gets to a root question: The Test is asking for a superclass identifier (in most cases) and an ARA gives back a subclass (more specific). Is the ARA expected to also return the superclass? If not, why not?

dkoslicki commented 3 months ago

From the subclass reasoning perspective (if a general concept is asked for, also return more specific instances), my impression was that precision is preferred to generality. But I guess it hasn't really been codified one way or another. My $0.02 is that a subclass being returned should "count" for the test asking for a superclass. Open to other opinions though.

maximusunc commented 3 months ago

I agree that more specific instances are preferred, but does that mean that the general concept that was asked for should not be returned?

If we were to support subclass answers in the tests, how exactly would that work? Is there an easy way to determine if something is a subclass of the expected output?
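As a starting point for that question, one possible shape of a subclass-aware check is sketched below. It assumes the harness has some source of child-to-parent edges; the `SUBCLASS_OF` dict here is a hypothetical stand-in for that source, and apart from MESH:D011372 (the progestin class ID used for asset 79) the CURIEs are placeholders, not real identifiers.

```python
# Hypothetical child -> parents edges. Apart from MESH:D011372 (the progestin
# class ID quoted in this thread), the CURIEs are placeholders.
SUBCLASS_OF = {
    "EXAMPLE:progesterone": ["MESH:D011372"],
    "EXAMPLE:progesterone_salt": ["EXAMPLE:progesterone"],
    "MESH:D011372": ["EXAMPLE:hormone"],
}


def is_same_or_subclass(candidate: str, expected: str) -> bool:
    """True if candidate equals expected or reaches it via subclass edges."""
    seen = set()
    stack = [candidate]
    while stack:
        node = stack.pop()
        if node == expected:
            return True
        if node in seen:
            continue  # guard against cycles in the edge data
        seen.add(node)
        stack.extend(SUBCLASS_OF.get(node, []))
    return False
```

The check is deliberately one-directional: a more specific result passes a test that expects the class, but a superclass result does not pass a test that expects a specific chemical. Whether that asymmetry is what the tests should enforce is the open question here.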

maximusunc commented 1 month ago

This general topic will be discussed during the testing session of the upcoming Relay.

sandrine-muller-research commented 1 month ago

wrt "wrong output IDs": I guess not all tests are from me, but the purpose of the test assets is to test the UI, so when the UI returns something wrong (e.g. superclasses), the ID given by the UI is reported.

wrt synonymization/generalization:

colleenXu commented 2 weeks ago

Asset 79 also has this issue. It's been running in sprints 5-6. The answer is "progestin" MESH:D011372, but this is a drug class with ~ 60 individual chemicals. Some of those individual chemicals (including progesterone itself) are high-ranking ARA results and should probably pass...

No one has been passing this test in sprints 5-6 (also noted previously).

This test is also flagged in https://github.com/NCATSTranslator/Tests/issues/93