genomic-medicine-sweden / gms_16S

A pipeline based on EMU, a taxonomic profiler optimized for long 16S rRNA reads.
GNU General Public License v3.0

Get reads assigned to a particular species #20

Open Danieljaen opened 3 weeks ago

Danieljaen commented 3 weeks ago

I was wondering if it is possible to get a file with only the reads that have been assigned to a particular species in the output. Or, at least, a list of the specific reads that have been assigned to it, so I can extract them myself from the original fastq files afterwards.

fwa93 commented 3 weeks ago

Hi, right now you can get a list where each read has a probability assignment for each species. Use the flag "--keep_read_assignments" to get this list. The EMU documentation describes it as: "output .tsv file with read assignment distributions: each row as an input read; each entry as the likelihood it is derived from that taxa (taxid is the column header); each row sums to 1" https://github.com/treangenlab/emu
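To get from that table to a list of read IDs for one species, something like the following might work. It is only a rough sketch under assumptions not confirmed in this thread: that the first column of the .tsv holds the read ID, that the remaining column headers are taxids, and that an empty cell means the read had no alignment to that taxon. The function name `reads_for_taxid`, the `min_prob` cutoff, and the example table are made up for illustration.

```python
import csv
import io


def reads_for_taxid(tsv_handle, taxid, min_prob=0.5):
    """Return IDs of reads whose likelihood for `taxid` is at least `min_prob`.

    Assumed layout (not verified against real pipeline output):
    column 0 = read ID, remaining column headers = taxids.
    """
    reader = csv.reader(tsv_handle, delimiter="\t")
    header = next(reader)
    try:
        col = header.index(str(taxid))
    except ValueError:
        raise KeyError(f"taxid {taxid} not found in header") from None

    selected = []
    for row in reader:
        if len(row) <= col:
            continue  # short row: no value for this taxid
        cell = row[col].strip()
        if not cell:
            continue  # assumed: empty cell means no assignment
        if float(cell) >= min_prob:
            selected.append(row[0])
    return selected


# Tiny made-up table, only to show the expected shape of the input.
example = (
    "read_id\t562\t1280\n"
    "read_a\t0.9\t0.1\n"
    "read_b\t0.2\t0.8\n"
    "read_c\t1.0\t\n"
)
print(reads_for_taxid(io.StringIO(example), 562))  # ['read_a', 'read_c']
```

With a real file you would pass `open(path, newline="")` instead of the `StringIO` object, write the returned IDs one per line to a text file, and then pull the matching records out of the original fastq with a tool of your choice (for example `seqkit grep -f ids.txt reads.fastq`). The 0.5 cutoff is arbitrary; since each row sums to 1, any value above 0.5 guarantees at most one species per read.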