NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Ranking/filtering based on chemical function/role/usage #149

Closed sandrine-m closed 2 months ago

sandrine-m commented 1 year ago

This issue comes from our ferumoxytol use case: ferumoxytol has multiple functions/roles/usages (like most ferritic encaged nanoparticles):

  1. Treats iron deficiency
  2. Powerful imaging contrast agent
  3. Role in immunotherapy

Those usages/functions/roles are quite distinct. In our use-case, we are interested in narrowing down possible biological mechanisms involved in the use of ferumoxytol in immunotherapy. We are not interested in its use to treat anemia, its role as an imaging agent, nor its general adverse effects (potentially only those killing specific cell types). Is there a way we could filter for a specific usage for this compound and then rerank only those results accordingly?

Additionally, would it be possible to filter adverse events based on dosage?

Thanks!

sstemann commented 1 year ago

UI team is researching where they can get this information to incorporate as facets

gprice1129 commented 1 year ago

@sandrine-m is this issue addressed by the ChEBI Role facet?

sandrine-m commented 1 year ago

@gprice1129 I am unable to tell you at the moment: on CI I cannot search for ferumoxytol/Feraheme (Fe3O4) anymore. image

The worst is that the only compound proposed now is another ferric oxide, iron(III) oxide (Fe2O3), which is not used as a contrast agent.

I typed in the field "contrast agent" and was offered to search for pertechnetate (not quite sure why, the matches are not explained). I am waiting for the query to finish (PK: bd81277c-fb42-4d47-bf34-18ff41cc4fef) to see whether we get any label from ChEBI indicating a molecule used for imaging.

sharatisrani commented 1 year ago

Better seen as a UI filtering possibility than a scoring/ordering possibility.

sandrine-m commented 1 year ago

@sharatisrani will the grouping be part of the O&O? It could be useful to have grouping instead of filtering by chemical role (as an example).

sandrine-m commented 1 year ago

RETEST on TEST: ferumoxytol is still absent even though it exists at the name resolver level: image

Persistent on CI

sharatisrani commented 1 year ago

Grouping/organizing is supposed to be a UI issue. In a future release, it will create some additional work for the O&O team, but for this release we are just returning all the results to the UI.

sandrine-m commented 1 year ago

@sharatisrani given the milestone for fall, is it fair to add O&O back?

RETEST: on CI and TEST still persistent as of today. Expecting a fix by next retest.

gprice1129 commented 1 year ago

@sandrine-m it is still not clear to me if the ChEBI role facet is addressing this issue.

sandrine-m commented 1 year ago

@gprice1129 my gut feeling is that the ChEBI roles facet won't resolve the issue, but I cannot verify because I cannot search for the compound anymore (see related issue #374). I'll be able to answer you once I can verify and make a successful retest. I'll keep you posted.

sandrine-m commented 1 year ago

@gprice1129 I'll retest that for you as soon as the new name resolver is live on CI. To clarify my gut feeling: ChEBI role has "MRI contrast agent":

image

but it is far down the hierarchy, and if you threshold higher up the ontology tree you may not see it.
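To illustrate the thresholding concern, here is a minimal sketch. The hierarchy below is a toy example made up for illustration (it is not copied from ChEBI), and `facet_label` and `max_depth` are hypothetical names, not how the UI facet is implemented. It only shows how a role deep in the tree disappears when facet labels are rolled up to a shallow level.

```python
# Toy role hierarchy (child -> parent). Illustrative only, NOT the real ChEBI tree.
PARENT = {
    "MRI contrast agent": "contrast agent",
    "contrast agent": "diagnostic agent",
    "diagnostic agent": "application",
    "application": "role",
}

def lineage(term):
    """Return the path from the root down to `term`."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))

def facet_label(term, max_depth):
    """Roll `term` up to its ancestor at `max_depth` (the root has depth 0)."""
    path = lineage(term)
    return path[min(max_depth, len(path) - 1)]

# With a shallow cutoff the specific role is hidden behind a generic ancestor.
shallow = facet_label("MRI contrast agent", 2)
deep = facet_label("MRI contrast agent", 4)
```

With the toy data, a cutoff at depth 2 would show only a generic ancestor, while the specific role becomes visible only at depth 4.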

sstemann commented 1 year ago

blocked by #374 and #545

sandrine-m commented 1 year ago

Following a discussion with @Genomewide on Slack, we realized that there were perhaps some misunderstandings about the issues in this ticket. Here are the issues that remain:

  1. Autocomplete issue with ferumoxytol/Feraheme: we cannot find this compound anymore.
  2. Chemical roles: here is what I am trying to do:

I did not understand at first that ChEBI roles were descriptive of the node but not of the node in a specific context.

I think having enrichments available to users will solve this issue because I have an a priori idea of what I want (I will perform an enrichment on pathways/biological processes and select only the ones that involve immune function).

It won't solve the issue of filtering chemical/gene functions in particular contexts (same issue with the MVP1 with clinical trials where you do not know for which disease) with no a priori hypothesis.

gprice1129 commented 1 year ago

@sandrine-m we think a call with the development team where you could explain what you're trying to do and we can ask questions would be very helpful.

sandrine-m commented 1 year ago

@gprice1129 yes sure! please feel free to schedule!

Genomewide commented 1 year ago

@sandrine-m We should def talk more about this. We have a couple of possible plans for facets that may address this. Tissue expression, GO and different pathway dbs.

I have been thinking about the enrichment idea and whether it is needed. If you have a very specific idea of how to filter things, but it is not strong enough to show up in enrichment and could be found with a full list, is that what you would want to see?

Does enrichment have a different use case, like when someone does not know what they want (unlike your case)?

sandrine-muller-research commented 1 year ago

@Genomewide you could surely offer me the GO ontology for biological processes or pathways and let me select from there, but it would be tedious (the GO is huge!). You may have to figure out a way to propose things that are relevant in that specific example, perhaps a bit like the ChEBI facet calculating the overlap, but this time on the intermediary nodes. The issue with calculating overlaps only is that you do not know whether the overlap is significant given the gene list sizes (which enrichment should help rerank). As a user, for this particular example, I would like to quickly see whether the JAK/STAT pathway (or any pathway related to it) is present in the results and focus on those: calculating the overlap between genes in JAK/STAT or related pathways and the intermediary nodes.
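A minimal sketch of the difference between a crude overlap and an enrichment, using a standard hypergeometric test as the stand-in (not necessarily what Translator runs). All counts are made up for illustration, and `hypergeom_tail` is a hypothetical helper: the same overlap of 3 genes can be striking or unremarkable depending on the size of the gene list.

```python
from math import comb

def hypergeom_tail(k, n_hits, n_pathway, n_background):
    """P(overlap >= k) when n_hits genes are drawn at random from a background
    of n_background genes, n_pathway of which belong to the pathway."""
    total = comb(n_background, n_hits)
    upper = min(n_hits, n_pathway)
    favourable = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_hits - i)
        for i in range(k, upper + 1)
    )
    return favourable / total

# Made-up numbers: a 150-gene pathway in a 20,000-gene background.
BACKGROUND, PATHWAY_SIZE, OVERLAP = 20_000, 150, 3

# Same overlap of 3 genes, but very different list sizes.
p_small_list = hypergeom_tail(OVERLAP, 10, PATHWAY_SIZE, BACKGROUND)
p_large_list = hypergeom_tail(OVERLAP, 500, PATHWAY_SIZE, BACKGROUND)

print(f"3 of 10 intermediary genes in pathway:  p = {p_small_list:.2e}")
print(f"3 of 500 intermediary genes in pathway: p = {p_large_list:.2f}")
```

A raw overlap count would rank both cases equally; the p-value separates them, which is the reranking I have in mind.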

Genomewide commented 1 year ago

@sandrine-muller-research So you are saying enrichment is not what you want here? Because if there is only one gene in the pathway of interest then you would miss that one?

sandrine-muller-research commented 1 year ago

@Genomewide I would love enrichments on intermediary nodes because they are slightly better than crude overlap (especially if the enrichments are done at the edge level), but I'll be OK with a more basic overlap (e.g. like what is done for the ChEBI facet) as a shorter-term solution.

Genomewide commented 1 year ago

@sandrine-m from what I am hearing, the enrichment is at the level of the full result set and not an individual result. Is that still as helpful? If so, how would you use it?

sandrine-muller-research commented 1 year ago

@Genomewide It won't be useful. This use case is a perfect example of that: there is not one unique possible/valid/interesting-to-investigate path to go from A to B. Doing enrichment at the whole-result level implies that there is a unique path across all results, which pools all paths together and, in my opinion, reduces the overall signal. IMO, users are expecting some kind of validation through evidence at the answer level (which also adds metadata on results, such as the pathways involved for the MVP2, etc.). We could serve this to the user if we work at the answer level, not at the result level.
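A toy sketch of the dilution I mean. The gene assignments are hypothetical and only meant to show the arithmetic: one path is entirely made of pathway genes, but once all paths are pooled the pathway fraction drops.

```python
# Hypothetical data: intermediary genes per path, and a pathway gene set.
PATHWAY = {"JAK1", "JAK2", "STAT1", "STAT3"}
PATHS = {
    "path_1": {"JAK2", "STAT3", "STAT1"},
    "path_2": {"TFRC", "FTH1", "HAMP", "SLC40A1"},
    "path_3": {"ALB", "CYP3A4", "ABCB1", "TNF", "IL6"},
}

def pathway_fraction(genes, pathway):
    """Fraction of `genes` that belong to `pathway` (0.0 for an empty set)."""
    return len(genes & pathway) / len(genes) if genes else 0.0

# Scored one path at a time, the pathway signal is concentrated in path_1.
per_path = {name: pathway_fraction(g, PATHWAY) for name, g in PATHS.items()}

# Pooling all paths into a single gene set dilutes that signal.
pooled = pathway_fraction(set().union(*PATHS.values()), PATHWAY)

print(per_path)
print(pooled)
```

Here `path_1` scores 1.0 on its own, while the pooled set scores 0.25.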

Genomewide commented 1 year ago

Ok. I just got confirmation that the current version of enrichment that is being tested will run enrichment on the result nodes. So, maybe there needs to be some discussion about the use case for this. I guess it could be the grouping that Rosina is talking about too.

Genomewide commented 1 year ago

@sandrine-m I reread your original post. Can you restate the pain point for this? Or we should get on a call to discuss.

sandrine-muller-research commented 1 year ago

@Genomewide I get a lot of top-ranked paths that are related to other usages of the compound (mostly as an imaging agent, with lots of adverse event edges) that I do not care about but that overload the results. I have not been able to retest for a while now because I cannot find the compound anymore (autocomplete issue):

image

so I am not able to give you more details...

sierra-moxon commented 5 months ago

ChEBI roles have been added as categories to facet on; I am closing this as completed. We can certainly revisit (and we are doing so in the O&O group, discussing whether ChEBI or ATC codes are better facet values).