caporaso-lab / mockrobiota

A public resource for microbiome bioinformatics benchmarking using artificially constructed (i.e., mock) communities.
http://mockrobiota.caporasolab.us
BSD 3-Clause "New" or "Revised" License

Potential issue demultiplexing mock-8 #59

Closed wasade closed 7 years ago

wasade commented 7 years ago

I pulled down the forward read and barcode data for mock-8 and attempted to demultiplex it using QIIME2. To load the data into QIIME2, I'm saving the forward reads as sequences.fastq.gz and specifying a semantic type of EMPSingleEndSequences. The resulting .qza file is approximately 5GB in size.

If I do not reverse complement the sample barcodes (i.e., default use of q2-demux), the resulting .qza archive is approximately 6KB in size. If I reverse complement the barcodes in the sample metadata file, the resulting .qza is approximately 17MB in size. In the RC case, the samples come out with the following number of sequences:

The readme notes that reverse complementing the mapping file barcodes is necessary, and it does seem to yield more sequences, but the output is substantially smaller than I'd expect given the size of the raw data. I couldn't find expected numbers of sequences in the manuscript cited in the readme. Are the above numbers in line with what is expected?
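
For anyone reproducing this, here is a minimal sketch of reverse complementing the barcode column of a tab-separated sample metadata file before demultiplexing. The column name `BarcodeSequence`, the sample ID, and the barcode in the usage note are assumptions for illustration; they are not taken from the mock-8 files, so adjust them to match the actual metadata.

```python
import csv
import io

# Complement table for uppercase DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(barcode):
    """Return the reverse complement of a DNA barcode (case is normalized to upper)."""
    return barcode.upper().translate(_COMPLEMENT)[::-1]


def rc_barcode_column(tsv_text, column="BarcodeSequence"):
    """Return the TSV text with every value in `column` reverse complemented.

    `column` is a hypothetical default; use whatever header the metadata
    file really has for its barcodes.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = reverse_complement(row[column])
        writer.writerow(row)
    return out.getvalue()
```

As a usage example, passing the text `"#SampleID\tBarcodeSequence\ns1\tAAGC\n"` (made-up values) returns the same table with `AAGC` replaced by `GCTT`.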

nbokulich commented 7 years ago

Yep, that matches what I have — thanks for checking!

The raw data contain many other non-mock samples for which barcode data are not provided, hence the great size reduction after demux.
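
A rough way to check this independently is to count how many barcode reads match the barcodes listed for the mock samples. The sketch below assumes a plain four-line FASTQ layout and exact matching only (no error correction), so its counts may differ slightly from what q2-demux reports; the function name and the barcodes in the usage note are hypothetical.

```python
from collections import Counter
from itertools import islice


def count_barcode_matches(fastq_lines, sample_barcodes):
    """Count barcode reads that exactly match one of `sample_barcodes`.

    `fastq_lines` is any iterable of lines from a barcode FASTQ file.
    Returns (total_reads, Counter mapping each matched barcode to its count).
    """
    wanted = set(sample_barcodes)
    counts = Counter()
    total = 0
    # In a four-line FASTQ record the sequence is the second line,
    # so take every fourth line starting at index 1.
    for line in islice(fastq_lines, 1, None, 4):
        total += 1
        seq = line.strip()
        if seq in wanted:
            counts[seq] += 1
    return total, counts
```

To run it on a gzipped barcode file, open the file with `gzip.open(path, "rt")` and pass the handle as `fastq_lines`, together with the (reverse complemented) barcodes from the sample metadata. A small value of `sum(counts.values()) / total` would be consistent with most reads belonging to samples whose barcodes are not provided.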

wasade commented 7 years ago

Okay, thanks!

