Open lina-kim opened 10 months ago
This is totally possible now with Collection[...] as an output. Is this something you would be interested in working on, @lina-kim?
Great to know, thanks @ebolyen! Yes, I would be more than happy to work on it. Is Collection[...] a semantic type found in q2-types / the base QIIME 2 installation? I'm not seeing much documentation for it at first glance.
Hey @lina-kim, I am actually working on some tutorial content that includes Collection right now. You can see the working draft here. Note that you'll only be able to access this tutorial page through this link, as it's built from a pull request (so you won't find this content if you navigate from https://develop.qiime2.org yet). This link will also break once the corresponding PR is merged.
You can also find the new API docs on Collection here.
Want to take a look at that and let us know if you have questions about how to use Collection?
Perfect, thanks for the resources @gregcaporaso! I'll check them out and get back to you with any questions.
Addition Description
It would be useful to bin reads by primer prior to primer removal. I'd like to separate a single FASTQ-based artifact (containing several different primers) into multiple output artifacts by primer; each output artifact would be characterized by a single primer. This would be helpful for meta-analyses in which sequences with multiple primers/variable regions may be found in a single QIIME artifact.
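To make the requested binning concrete, here is a minimal pure-Python sketch of the idea. It is not how Cutadapt or q2-cutadapt works: it assumes exact matching of the primer at the 5' end of each read (no mismatches, no IUPAC degenerate bases), and the function name, primer names, and sequences are all hypothetical.

```python
from collections import defaultdict


def bin_reads_by_primer(reads, primers):
    """Group read IDs by the primer found at the 5' end of each read.

    reads:   iterable of (read_id, sequence) tuples
    primers: dict mapping primer name -> primer sequence
    Reads that start with none of the primers go to the "unmatched" bin.
    Assumption: exact prefix matching only (no error tolerance).
    """
    bins = defaultdict(list)
    for read_id, seq in reads:
        for name, primer in primers.items():
            if seq.startswith(primer):
                bins[name].append(read_id)
                break
        else:
            # No primer matched this read.
            bins["unmatched"].append(read_id)
    return dict(bins)


# Hypothetical primers and reads, for illustration only.
primers = {"primerA": "ACGTAC", "primerB": "TTGACC"}
reads = [
    ("r1", "ACGTACGGGT"),
    ("r2", "TTGACCAAAT"),
    ("r3", "GGGGGGGGGG"),
    ("r4", "ACGTACTTTT"),
]
binned = bin_reads_by_primer(reads, primers)
```

Each key of the returned dictionary corresponds to one of the per-primer outputs described above, plus one bin for reads that matched no primer.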
This is possible with native Cutadapt (as of v4.5) using steps to demultiplex, but not in the QIIME 2 plugin, as its inputs are restricted to specific semantic types.
Current Behavior
qiime cutadapt demux demultiplexes (based on adapter sequence), but generates only a single output for demultiplexed sequences. It also requires an input artifact of type MultiplexedSingleEndBarcodeInSequence and does not accept SampleData[Single/PairedEndSequencesWithQuality].
qiime cutadapt trim could technically perform this by running the command once per primer (pair), but that is quite inefficient.
Proposed Behavior
q2-cutadapt would take as input 1) a FASTQ artifact of type SampleData[Single/PairedEndSequencesWithQuality], which contains N different primer sequences among its many reads, and 2) a tab-separated metadata file containing the N primer names and corresponding primer sequences.
It would produce N output artifacts of type SampleData[Single/PairedEndSequencesWithQuality]; each output artifact would contain reads of the same primer sequence. There would also be an output artifact (also SampleData[Single/PairedEndSequencesWithQuality]) of sequences that did not match any of the N primers.
Questions
References
qiime cutadapt trim-paired
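For comparison, the native Cutadapt demultiplexing mentioned in the description could be driven roughly as in the sketch below. It only assembles the command line for single-end reads and does not run it. The helper name, primer names, sequences, and file paths are hypothetical. It relies on Cutadapt's named adapters (-g NAME=SEQUENCE), the {name} placeholder in the output path, and --action=none so that primers are left in place; paired-end input would need additional options that are not shown.

```python
def build_cutadapt_demux_command(primers, input_fastq, output_dir):
    """Build a Cutadapt argument list that bins single-end reads by primer.

    primers: dict mapping primer name -> primer sequence (hypothetical values)
    The literal "{name}" in the output path is expanded by Cutadapt itself,
    once per named adapter; reads with no match go to the "unknown" file.
    """
    # --action=none: detect the primer but do not trim it from the read.
    args = ["cutadapt", "--action=none"]
    for name, seq in primers.items():
        # Named 5' adapter: -g NAME=SEQUENCE
        args += ["-g", f"{name}={seq}"]
    args += ["-o", f"{output_dir}/{{name}}.fastq.gz", input_fastq]
    return args


# Hypothetical primers and paths, for illustration only.
cmd = build_cutadapt_demux_command(
    {"primerA": "ACGTAC", "primerB": "TTGACC"},
    "reads.fastq.gz",
    "binned",
)
```

The resulting list could be passed to subprocess.run once Cutadapt is installed. A q2-cutadapt action would additionally have to export the input artifact and import each per-primer file back into an output artifact.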