Is your feature request related to a problem? Please describe.
We want to be able to perform any analysis based on the studies/specimens already contained within MMEDS. For example, this could look like selecting multiple studies to have analysis run on them together, or doing an analysis on all specimens from humans with IBD.
Describe the solution you'd like
MMEDS should be able to take this in as a query and generate a new study based on the given parameters. If the query selects multiple studies, those studies' mapping files would be merged. If it selects all specimens from humans with IBD, the matching samples would be picked out of their various studies' mapping files and merged. Afterwards, analysis could be run as normal.
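To make the merge step concrete, here is a minimal sketch of what it could look like. It is not MMEDS code: the function `merge_mapping_files`, the `SourceStudy` column, and the column names `HostSpecies` and `Disease` are all hypothetical, and it assumes mapping files are tab-separated with a header row.

```python
import csv
import io


def merge_mapping_files(mapping_texts, predicate=None):
    """Merge several tab-separated mapping files into one table.

    mapping_texts: dict of study name -> TSV text (hypothetical input shape).
    predicate: optional function(row_dict) -> bool used to select samples.
    Returns (columns, rows); columns is the union of all headers.
    """
    columns = []
    rows = []
    for study, text in mapping_texts.items():
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        # Keep the first-seen order of columns across all studies.
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        for row in reader:
            if predicate is None or predicate(row):
                # Record where each sample came from (hypothetical column).
                row["SourceStudy"] = study
                rows.append(row)
    if "SourceStudy" not in columns:
        columns.append("SourceStudy")
    # Columns missing from a given study are filled with an empty string.
    rows = [{c: r.get(c, "") for c in columns} for r in rows]
    return columns, rows


# Two made-up studies; the second has an extra Age column.
study_a = "#SampleID\tHostSpecies\tDisease\nA1\thuman\tIBD\nA2\tmouse\tIBD\n"
study_b = (
    "#SampleID\tHostSpecies\tDisease\tAge\n"
    "B1\thuman\tIBD\t34\n"
    "B2\thuman\tnone\t51\n"
)

# "All specimens from humans with IBD" expressed as a row filter.
columns, rows = merge_mapping_files(
    {"StudyA": study_a, "StudyB": study_b},
    predicate=lambda r: r["HostSpecies"] == "human" and r["Disease"] == "IBD",
)
```

Selecting whole studies would be the same call with no predicate. A real implementation would also need to handle sample IDs that collide across studies, which this sketch does not.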
Alternatives we have considered
The solution above would mean demultiplexing every run again (in series, no less). That would become rather unwieldy if we had samples from, say, a dozen runs. Is there a way to bypass that?
Important to note
This should be done after the completion of #353, as pulling samples from studies that did not use the same kind of upload would probably be finicky, especially when it comes time to figure out demultiplexing.
Additional context
This was originally part of the scope of issue #337, which was closed in PR #397.