Closed embatty closed 6 years ago
The main purpose of the extractor is to take raw data and demultiplex by sample barcodes if present and by primer pairs. In your case, the sequences have already been demultiplexed into a pair of file per sample and your interest is to simply split based on the primer pair. But if you have MIDs, you should include them in the barcodes.txt after the primers with a header line starting with id and barcode: id barcode MID01 ACGAGTGCGT MID02 ACGCTCGACA MID03 AGACGCACTC MID04 AGCACTGTAG If there are no MIDs, please follow this tutorial http://baileylab.umassmed.edu/SeekDeep/tutorial_PairedEnd_noMIDs.html
Hi @embatty, Sorry for the delay, I'm currently finishing up Med school and haven't been able to respond right away. I can't see any obvious reason for why the command would have failed. I would need to see what managed to be created in that output directory to get a better idea of what was happening, or if possible if you could send me the sequence files where the command fails, I could likely see right away where things are going wrong. Feel free to share an example file or something over dropbox for me to take a look.
Best, Nick
Actually @embatty, I figured it out, I almost never use SeekDeep extractorPairedEnd
without using the flag --sampleName
to give a name to the files and when I had changed something in the last version I never checked to see if extractor worked without having the --sampleName
flag on. I've just released a new version where I have fixed this issue. Let me know if you have any other problems.
Best, Nick
I am using SeekDeep to extract the reads from some amplicon sequencing. The sequences have already been demultiplexed into a pair of files per sample, and I want to split these into files for each of my primer pairs. I'm running the following command:
SeekDeep extractorPairedEnd --id barcodes.txt --fastq1gz sample.1.fastq.gz --fastq2gz sample.2.fastq.gz --overlapStatusFnp overlapStatus.txt --dout sample
with the barcodes.txt looking like this:
and overlapStatus.txt:
I get the following error:
I think it's correctly checking that I have no MIDs and trying to assign one called "all" but then can't find the file it needs. Do you have any idea where I'm going wrong?