RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

Adjust weights in the ranker #2300

Closed chunyuma closed 2 months ago

chunyuma commented 3 months ago

This is a sub-task of #2299.

mfl15 commented 2 months ago

@dkoslicki blocked due to: "Downweight text mining provider (need to discuss with @dkoslicki to find an appropriate example, so needs to be postponed)"

dkoslicki commented 2 months ago

I think there was a typo there: we don't need to down-weight the text mining provider, but rather SemMedDB. There are many, many examples of SemMedDB edges needing to be down-weighted, and most text mining provider results are generally acceptable/good. Does this help unblock it? If not, let me know what I can provide.

chunyuma commented 2 months ago

Hi @dkoslicki, could you please help me confirm that this is indeed a typo? In your recent reminder, you also mentioned that the text mining provider needs to be down-weighted (with down-weighting SemMedDB as an additional requirement)?

dkoslicki commented 2 months ago

Ah, I recall now: Eric had observed that NLP-related edges are generally "worse" than ones from curated databases (like DrugBank), but Text Mining Provider is better than SemMedDB. I don't have great suggestions for the relative ranking of text mining provider, SemMedDB, and curated edges. The best thing to do is to use the "top answer" test cases that we looked at during the AHM a week or two ago and see how adjusting the relative ranking affects performance on those tests.


chunyuma commented 2 months ago

I see @dkoslicki. No worries; in my recent ranking algorithm design, I reduced the importance of all NLP-related edges. Hopefully that will work.

chunyuma commented 2 months ago

@dkoslicki, one more thing I want to check with you: is the ranking under inferred/creative mode entirely determined by xDTD rather than by the new ranking algorithm?

edeutsch commented 2 months ago

I think we want to end up in a state where, if we have 4 different answers each with a single edge from 4 different sources, we want something like:

1.0 DrugBank supported edge
0.8 Text mining supported edge
0.7 xDTD edge
0.5 SemMedDB edge

We really want the "gold" edges to stand out as textbook knowledge. And SemMedDB to be something like half the weight.
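
For reference, a minimal sketch of what such a source-weight table could look like; the infores identifiers, the fallback value, and the function names are illustrative assumptions, not the actual ARAX ranker configuration.

```python
# Hypothetical per-source weight table along the lines described above.
SOURCE_WEIGHTS = {
    "infores:drugbank": 1.0,                     # curated "gold" / textbook knowledge
    "infores:text-mining-provider-targeted": 0.8,
    "infores:arax-drug-treats-disease": 0.7,     # xDTD (identifier assumed)
    "infores:semmeddb": 0.5,                     # NLP-derived, noisiest source
}
DEFAULT_WEIGHT = 0.8                             # assumed fallback for unlisted sources


def weight_for_source(primary_source: str) -> float:
    """Return the relative weight for an edge's primary knowledge source."""
    return SOURCE_WEIGHTS.get(primary_source, DEFAULT_WEIGHT)


def reweight_edge(raw_confidence: float, primary_source: str) -> float:
    """Scale a raw edge confidence by its source weight before ranking."""
    return raw_confidence * weight_for_source(primary_source)
```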

chunyuma commented 2 months ago

Thanks @edeutsch. This is what I am doing. For the text mining supported edge, do you have an example of what such an edge looks like? I am looking for a related example.

chunyuma commented 2 months ago

Hi @dkoslicki and @edeutsch,

I would like to get your thoughts on the following situation. Please see the comparison below between the old ranking algorithm and the new ranking algorithm for test case 22. We plan to down-weight SemMedDB-supported edges and up-weight DrugBank- and DrugCentral-supported edges. Quinidine, which has a DrugCentral-supported edge, is now ranked in the top 3 by the new algorithm. However, Duloxetine and Bupropion, which each have only one SemMedDB-supported edge, are down-ranked. But since the total number of results is 28, none of these cases can pass under the new algorithm due to the top 10% requirement.

What do you think? Is the new result expected?
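
For context on the top 10% requirement mentioned above, a quick back-of-the-envelope check, assuming the criterion is simply rank / total ≤ 0.10 (the real test harness may implement the cutoff differently):

```python
# Assumed pass criterion: a target asset passes if rank / total_results <= 0.10.
def passes_top_10_percent(rank: int, total_results: int) -> bool:
    return rank / total_results <= 0.10


total = 28
for name, rank in [("Quinidine", 3), ("Bupropion", 7), ("Duloxetine", 10)]:
    print(f"{name}: rank {rank}/{total} -> pass = {passes_top_10_percent(rank, total)}")

# With only 28 results, ranks 1-2 are the only ones within the top 10%,
# so even Quinidine at rank 3 (3/28 ~= 0.107) just misses the cutoff.
```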

For test case 22, there are 3 assets:

Results from old algorithm :

[1, 'Paroxetine', 1.0] [2, 'Fluoxetine', 0.999] [3, 'Sertraline', 0.929] [4, 'Bupropion', 0.893] [5, '3,4-Methylenedioxymethamphetamine', 0.857] [6, 'Serotonin Uptake Inhibitors', 0.821] [7, 'Duloxetine', 0.82] [8, 'Terbinafine', 0.819] [9, 'Fluvoxamine', 0.714] [10, 'Tamoxifen', 0.713] [11, 'Berberine', 0.712] [12, 'Dextromethorphan', 0.607] [13, 'Metoprolol', 0.606] [14, 'Ketoconazole', 0.536] [15, 'Citalopram', 0.535] [16, 'Antidepressive Agents', 0.534] [17, 'Methadone', 0.533] [18, 'Thioridazine', 0.532] [19, 'Propranolol', 0.531] [20, 'Quinidine', 0.321] [21, 'Venlafaxine', 0.286] [22, 'Risperidone', 0.285] [23, 'Gefitinib', 0.284] [24, 'Haloperidol', 0.283] [25, 'Cimetidine', 0.282] [26, 'alkaloid', 0.281] [27, 'Cannabidiol', 0.071] [28, 'Diacerein', 0.07]

Results from new algorithm:

[1, 'Cannabidiol', 1.0] [2, 'Diacerein', 0.999] [3, 'Quinidine', 0.929] [4, 'Paroxetine', 0.893] [5, 'Fluoxetine', 0.892] [6, 'Sertraline', 0.821] [7, 'Bupropion', 0.786] [8, '3,4-Methylenedioxymethamphetamine', 0.75] [9, 'Serotonin Uptake Inhibitors', 0.714] [10, 'Duloxetine', 0.713] [11, 'Terbinafine', 0.712] [12, 'Fluvoxamine', 0.607] [13, 'Tamoxifen', 0.606] [14, 'Berberine', 0.605] [15, 'Dextromethorphan', 0.5] [16, 'Metoprolol', 0.499] [17, 'Ketoconazole', 0.429] [18, 'Citalopram', 0.428] [19, 'Antidepressive Agents', 0.427] [20, 'Methadone', 0.426] [21, 'Thioridazine', 0.425] [22, 'Propranolol', 0.424] [23, 'Venlafaxine', 0.214] [24, 'Risperidone', 0.213] [25, 'Gefitinib', 0.212] [26, 'Haloperidol', 0.211] [27, 'Cimetidine', 0.21] [28, 'alkaloid', 0.209]

chunyuma commented 2 months ago

Hi @kvnthomas98 or @dkoslicki, can you help me understand why this commit makes sense? If the final score of each result with an 'inferred' edge is replaced by the xDTD score, this will mess up the ranking. Is there a reason this part of the code needed to be added?

dkoslicki commented 2 months ago

This was a shoehorn fix: we found that xDTD results were often being placed way at the bottom, since their result graph is a single edge (with a support graph), so the default ranking was low in comparison to lookups, which consist of NGD edges, other edges from KPs returning additional edges, etc. The idea was to override the score with the xDTD random forest score, as these are always above 0.8 (or something like that), so xDTD results wouldn't be relegated to the bottom. If with your changes the xDTD results aren't always coming in last place, but are rather mixed in with the other lookup results, then this could be removed.
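
A rough sketch of the override logic described here, under assumed field names; the keys below ("is_inferred", "xdtd_probability") are placeholders, not the actual TRAPI attributes used in the ARAX code.

```python
from typing import Optional


def override_with_xdtd_score(default_score: float, result_edges: list[dict]) -> float:
    """If a result contains an xDTD inferred edge, use the xDTD random-forest
    probability (typically > 0.8) as the result score so it isn't buried
    below the lookup results; otherwise keep the ranker's default score."""
    xdtd_prob: Optional[float] = None
    for edge in result_edges:
        if edge.get("is_inferred") and edge.get("xdtd_probability") is not None:
            xdtd_prob = max(xdtd_prob or 0.0, edge["xdtd_probability"])
    return xdtd_prob if xdtd_prob is not None else default_score
```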

dkoslicki commented 2 months ago

@chunyuma the changes you detail in your above comment do seem like an improvement. A question, though: where are results like Cannabidiol coming from for the new algorithm? Are they from "gold standard" sources like Eric mentioned? As a non-SME, I can't immediately see whether the top ~10 results make "sense" (save for the known/target ones from the test assets).

But since the total number of results is 28, none of these cases in the new algorithm can pass due to the top 10% requirement.

This shouldn't be too much of a concern, as the testing people are aware of the issue

chunyuma commented 2 months ago

Ah, I see. Thanks for the explanation. Since the new algorithm considers the weights of different edge sources (including xDTD), it should mix them in. I can try giving a higher weight to xDTD if we believe xDTD results with scores above 0.8 are more reliable. Currently, both the text mining supported edge and the xDTD edge are given a weight of 0.8, but we can consider giving xDTD a higher weight if needed.
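
Illustratively, the adjustment being discussed amounts to nudging one entry in a source-weight table; the keys and values here are assumptions mirroring the discussion, not the deployed configuration.

```python
# Hypothetical relative weights, mirroring the discussion above.
source_weights = {
    "drugbank": 1.0,
    "text_mining": 0.8,
    "xdtd": 0.8,       # currently tied with text mining
    "semmeddb": 0.5,
}

# If xDTD edges (random-forest probability > ~0.8) prove more reliable than
# text-mined edges, bump xDTD above text mining but keep it below curated sources:
source_weights["xdtd"] = 0.85
```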

chunyuma commented 2 months ago

@chunyuma the changes you detail in your above comment (https://github.com/RTXteam/RTX/issues/2300#issuecomment-2218176853) do seem like an improvement. A question, though: where are results like Cannabidiol coming from for the new algorithm? Are they from "gold standard" sources like Eric mentioned? As a non-SME, I can't immediately see whether the top ~10 results make "sense" (save for the known/target ones from the test assets).

If you look at the results from the old algorithm here, you can see that Cannabidiol is supported by a DrugBank edge while the other top-N results are supported by SemMedDB; this indicates the new algorithm relies more on the DrugBank edges.

edeutsch commented 2 months ago

@chunyuma We certainly want the new algorithm to put DrugBank answers at the top.

I looked through your "new algorithm" results for TestCase_22 above and I think they're quite good.

Can you try an old vs. new comparison for TestCase_13 and see if we improve there? I think that's a good test case. The current top answer is a SemMedDB result, not a DrugCentral result; that should change. If we like the results of this comparison, then we might be ready to deploy and see how it goes.

thanks!

chunyuma commented 2 months ago

Hi @edeutsch, please see the comparison of old vs. new algorithm results for TestCase_13 below

For test case 13, there are 4 assets flagged as "genuinely too low / ranker issue":

Results from old algorithm (only shows the top 50):

[1, 'Angiotensin-Converting Enzyme Inhibitors', 1.0] [2, 'captopril', 0.998] [3, 'Angiogenin', 0.995] [4, 'Angiotensin II', 0.993] [5, 'Bradykinin', 0.991] [6, 'Cilazaprilat', 0.989] [7, 'Aldosterone', 0.988] [8, 'Hydrochlorothiazide', 0.987] [9, 'Amlodipine', 0.986] [10, 'Atenolol', 0.985] [11, 'Digoxin', 0.984] [12, 'Prednisolone', 0.983] [13, 'Rasagiline', 0.982] [14, 'PROLINE', 0.981] [15, 'Verapamil', 0.98] [16, 'Resveratrol', 0.979] [17, 'Nitrate', 0.978] [18, 'Phenylalanine', 0.977] [19, 'Potassium', 0.976] [20, 'Propranolol', 0.975] [21, 'Glycine', 0.974] [22, 'Histidine', 0.973] [23, 'Indapamide', 0.972] [24, 'Zofenoprilat', 0.971] [25, 'garlic allergenic extract 50 MG/ML', 0.97] [26, 'Timonacic', 0.969] [27, 'Triethylenemelamine', 0.968] [28, 'Fozitec', 0.939] [29, 'Candesartan', 0.937] [30, 'Losartan', 0.934] [31, 'Omapatrilat', 0.933] [32, 'Renin', 0.932] [33, 'Clopidogrel', 0.931] [34, 'Carvedilol', 0.93] [35, 'Hydralazine', 0.929] [36, 'Metoprolol', 0.928] [37, 'Paricalcitol', 0.927] [38, 'Valine', 0.926] [39, 'Furosemide', 0.925] [40, 'Nitric oxide', 0.924] [41, 'Nitroglycerin', 0.923] [42, 'Nifedipine', 0.922] [43, 'Fosinoprilat', 0.921] [44, 'Leucine', 0.92] [45, '(2R,3S,4R,5R)-2,3,4,5,6-pentahydroxyhexanal', 0.919] [46, 'l-Isoleucine', 0.918] [47, 'L-Thioproline', 0.917] [48, '(-)-Epigallocatechin gallate', 0.916] [49, 'Calcium', 0.915] [50, 'Lysine', 0.914] ... [118, 'Moexipril', 0.735] ... [190, 'Trandolapril', 0.571] [191, 'Benazepril', 0.569] ... [428, 'Fosinopril', 0.032]

Results from new algorithm (only shows the top 50):

[1, 'Enalapril', 1.0] [2, 'Captopril', 0.999] [3, 'Lisinopril', 0.998] [4, 'Perindopril', 0.997] [5, 'Cilazapril', 0.996] [6, 'Spirapril', 0.995] [7, 'Enalaprilat', 0.994] [8, 'Trandolapril', 0.993] [9, 'Benazepril', 0.992] [10, 'Moexipril', 0.991] [11, 'Zofenopril', 0.99] [12, 'Ramipril', 0.975] [13, 'Quinapril', 0.974] [14, 'Temocapril', 0.971] [15, 'Imidapril', 0.968] [16, 'Perindoprilat', 0.966] [17, 'Ramiprilat', 0.965] [18, 'Quinaprilat', 0.964] [19, 'Fosinopril', 0.963] [20, 'Fosinopril sodium', 0.962] [21, 'Trandolaprilat', 0.961] [22, 'Fasidotril', 0.952] [23, 'Imidaprilat', 0.951] [24, 'Spirapril hydrochloride', 0.95] [25, 'Perindopril arginine', 0.949] [26, '[(2S)-2-(1,3-benzodioxol-5-ylmethyl)-3-oxo-3-[[(2S)-1-oxo-1-phenylmethoxypropan-2-yl]amino]propyl]sulfanylformic acid', 0.948] [27, 'Spiraprilat', 0.947] [28, 'Lisinopril monohydrate', 0.946] [29, 'Benazeprilat', 0.937] [30, 'Moexipril hydrochloride', 0.936] [31, 'Enalapril maleate', 0.935] [32, 'Quinapril hydrochloride', 0.934] [33, 'Perindopril erbumine', 0.933] [34, 'Benazepril hydrochloride', 0.932] [35, 'Reserpinine', 0.923] [36, 'Delapril', 0.921] [37, 'Angiotensin II', 0.918] [38, 'Bradykinin', 0.917] [39, 'Cilazaprilat', 0.916] [40, 'Losartan', 0.915] [41, 'Omapatrilat', 0.914] [42, 'Hydrochlorothiazide', 0.913] [43, 'Amlodipine', 0.912] [44, 'Candesartan', 0.911] [45, 'Atenolol', 0.91] [46, 'Digoxin', 0.909] [47, 'Prednisolone', 0.908] [48, 'Valsartan', 0.907] [49, 'Rasagiline', 0.906] [50, 'Telmisartan', 0.905]

dkoslicki commented 2 months ago

Very nice! And the other top 10-20 results are all ACE inhibitors, so this is looking quite good!

edeutsch commented 2 months ago

I think this is a sensational advance! I suggest checking it in to master and getting it deployed in time for the next test run.

Thanks!

chunyuma commented 2 months ago

Closing this issue as a completed sub-task of #2299. Please see #2299.