nf-core / raredisease

Call and score variants from WGS/WES of rare disease patients.
https://nf-co.re/raredisease
MIT License

https://ai.marrvel.org / how to integrate to Scout and our analysis? #560

Open fulyataylan opened 4 months ago

fulyataylan commented 4 months ago

Hello developers,

MARRVEL has released a new tool called AI-MARRVEL. This tool integrates phenotype data to prioritize variants, a feature currently missing in Scout's variant ranking. AI-MARRVEL can also extract phenotype terms from journal text, which could be useful for extracting HPO terms from extensive clinical descriptions. You can find more information at https://ai.marrvel.org/.

It may be worthwhile to explore if this tool can be beneficial to us and whether it can be integrated with the Scout and Tip2Toe phenotyping tools.

Thanks, Fulya

dnil commented 4 months ago

The software is available at https://github.com/LiuzLab/AI_MARRVEL under GPL-3, and they have an S3 bucket with data for it, but with the not uncommon "free for research, contact us for business" terms. While many of our users are researchers, the clinical part is sometimes ambiguous in these models - someone who is less annoyed by stuff like that should probably investigate. 😁 It also remains to be seen what can be done with our current division of compute; they appear to have variant scores, but these are somewhat weighted by phenotype, and standard runs would involve providing HPO terms upfront.

jemten commented 4 months ago

Interesting tool. We could look into integrating it, but ideally I think you should be able to launch this from Scout as well. With the current setup, the HPO terms needed would have to be included in the order and passed to the pipeline. We could do a small pilot and test it.

dnil commented 4 months ago

Right, and that would then presumably involve connecting more compute resources / queues to it. While it may or may not be worth it for this one, I am guessing a chunk of the other newer "AI" tools will also try to use phenotype information as input. It makes for an easier genetics problem.

jemten commented 4 months ago

yup, don't know how beefy the current VMs are but it sounds like we would need to connect to a more dedicated compute resource, be it cloud or local.

dnil commented 4 months ago

Precisely; no way with the current config - essentially all outward-facing microservices are on the same (somewhat decently sized, but still) VM. We would need to be able to queue jobs in one form or another.
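
To make the queuing idea a bit more concrete, here is a rough sketch of what submitting a phenotype-aware job could look like. Everything in it is hypothetical: `PrioritisationJob`, `submit` and `drain` are made-up names, the in-process `queue.Queue` only stands in for whatever real queue or scheduler would be used, and nothing here reflects an existing Scout, raredisease or AI-MARRVEL API. The only outside fact it relies on is that HPO identifiers have the form `HP:` followed by seven digits.

```python
import queue
import re
from dataclasses import dataclass

# HPO identifiers look like "HP:0001250" (prefix plus seven digits).
HPO_ID = re.compile(r"HP:\d{7}")


@dataclass(frozen=True)
class PrioritisationJob:
    """Hypothetical job description: one case plus its HPO terms."""
    case_id: str
    hpo_terms: tuple


def submit(jobs, case_id, hpo_terms):
    """Validate the HPO term IDs and enqueue a job; returns the queued job."""
    bad = [term for term in hpo_terms if not HPO_ID.fullmatch(term)]
    if bad:
        raise ValueError(f"Malformed HPO terms: {bad}")
    job = PrioritisationJob(case_id, tuple(hpo_terms))
    jobs.put(job)
    return job


def drain(jobs, run):
    """Process queued jobs one at a time, as a single worker would."""
    results = []
    while not jobs.empty():
        job = jobs.get()
        results.append(run(job))
        jobs.task_done()
    return results


# In-memory stand-in for a real job queue.
jobs = queue.Queue()
submit(jobs, "case_1", ["HP:0001250", "HP:0001263"])
```

The point of validating at submission time is that a malformed term is rejected before any compute is spent; the `run` callable passed to `drain` is where a call out to a dedicated compute resource would go.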