Open lukepereira opened 4 months ago
@lukepereira can you check in with Ethan on what will be needed to get this online? I think he's really close with his lambda implementation.
[I'll ask Ethan for his GH account to get him here as well]
https://github.com/declanlim/mwas
mwas repo (the readme is very outdated)
@finesden33 Can you update the README? Also, what's the status on getting the MWAS API online?
I added a prototype of the MWAS plot since I think it could be useful when combined with disease filters. It currently uses pre-computed MWAS results with virus families, and I found that the results can be fetched fairly quickly using s5cmd. You can view it by clicking 'Advanced' in the Virome section.
In the future, it would be nice to support running different user-defined MWAS jobs using Ethan's lambda. This would need to be clarified, but I think we can define the target set as the current query and the background set as all runs in matched bioprojects (?). We also likely want to re-run this workload using Ethan's updated code to resolve some bugs (1, 3, 5).
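To make the target/background idea concrete, here is a rough sketch of how the two sets could be derived. It assumes we have a run-to-bioproject mapping available as a plain dict; the function name `split_target_background` and the accession strings are made up for illustration and are not part of Ethan's lambda.

```python
from typing import Dict, Iterable, Set, Tuple


def split_target_background(
    query_runs: Iterable[str],
    run_to_bioproject: Dict[str, str],
) -> Tuple[Set[str], Set[str]]:
    """Hypothetical helper: target = runs in the current query,
    background = all runs in any bioproject that a target run belongs to."""
    target = set(query_runs)
    # Bioprojects that contain at least one run from the query.
    matched_projects = {
        run_to_bioproject[run] for run in target if run in run_to_bioproject
    }
    # Every run in those bioprojects, including the target runs themselves.
    background = {
        run for run, project in run_to_bioproject.items()
        if project in matched_projects
    }
    return target, background


# Illustrative data only (assumed accessions).
run_to_bioproject = {
    "SRR1": "PRJ_A",
    "SRR2": "PRJ_A",
    "SRR3": "PRJ_B",
    "SRR4": "PRJ_C",
}
target, background = split_target_background(["SRR1", "SRR3"], run_to_bioproject)
```

Whether the background should include the target runs is still an open question; subtracting `target` from `background` would give the exclusive version.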
Some limitations of the existing approach:
I think this feature would be pretty useful and would likely encourage users to return to the site.
If we're concerned about costs, we could restrict public calls to small-to-medium MWAS jobs and leave it unrestricted for us to run internally. It also seems possible that MWAS on the virome could look up values in Declan's pre-computed rfam.
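A minimal sketch of the cost gate, assuming job size is measured by the number of runs in the request. The threshold `MAX_PUBLIC_RUNS` and the `internal` flag are placeholders I made up; the real limit would depend on what the lambda actually costs per job.

```python
# Hypothetical cap on public job size; the real value is undecided.
MAX_PUBLIC_RUNS = 5000


def is_job_allowed(n_runs: int, internal: bool = False,
                   max_public_runs: int = MAX_PUBLIC_RUNS) -> bool:
    """Allow any job for internal callers; cap job size for public callers."""
    if n_runs <= 0:
        return False  # nothing to run
    if internal:
        return True
    return n_runs <= max_public_runs
```

For example, `is_job_allowed(200)` passes for a public caller, while a job above the cap only passes with `internal=True`.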
possible plots:
Another thing to explore would be combining the v-enrichment scores with pre-computed values on rfam, i.e. find the "important" sOTUs in a virome, map them to a list of rfam/biosamples, then look up the pre-computed p-values in S3.
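Roughly, the lookup could work like the sketch below. Everything here is an assumption for illustration: "important" is taken to mean the top-k sOTUs by enrichment score, the sOTU-to-biosample mapping is a dict, and an in-memory dict stands in for the pre-computed p-values that would actually live in S3.

```python
from typing import Dict, List, Tuple


def lookup_pvalues(
    enrichment: Dict[str, float],
    sotu_to_biosamples: Dict[str, List[str]],
    precomputed_pvalues: Dict[Tuple[str, str], float],
    top_k: int = 2,
) -> Dict[Tuple[str, str], float]:
    """Hypothetical pipeline: pick top-k sOTUs by enrichment score, map each
    to its biosamples, and collect any pre-computed p-values that exist."""
    # Step 1: rank sOTUs by score, highest first, and keep the top k.
    top_sotus = sorted(enrichment, key=enrichment.get, reverse=True)[:top_k]
    results: Dict[Tuple[str, str], float] = {}
    for sotu in top_sotus:
        # Step 2: map the sOTU to its biosamples (none if unmapped).
        for biosample in sotu_to_biosamples.get(sotu, []):
            # Step 3: look up the pre-computed p-value, skipping missing keys.
            key = (sotu, biosample)
            if key in precomputed_pvalues:
                results[key] = precomputed_pvalues[key]
    return results


# Illustrative data only (assumed identifiers and values).
enrichment = {"sotu_a": 0.9, "sotu_b": 0.1, "sotu_c": 0.5}
sotu_to_biosamples = {"sotu_a": ["BS1", "BS2"], "sotu_c": ["BS3"]}
precomputed_pvalues = {("sotu_a", "BS1"): 0.01, ("sotu_c", "BS3"): 0.2}
hits = lookup_pvalues(enrichment, sotu_to_biosamples, precomputed_pvalues)
```

In practice the last step would be a fetch from S3 rather than a dict lookup, and the key layout would need to match however the pre-computed results are stored.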