Closed nick-youngblut closed 5 years ago
Hi,
You are correct that this requirement is not currently in the README and this will be added.
The longranger basic software, at least in its latest versions, does output interleaved fastqs in barcode-sorted order (that is, all reads assigned a particular barcode are in a single contiguous block).
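For illustration, here is a minimal sketch of how one might check that an interleaved fastq already satisfies this "contiguous block" property. It assumes the barcode is carried in the read header as a `BX:Z:<barcode>` tag; adjust the pattern if your headers differ. The function name `check_barcode_blocks` is hypothetical and not part of athena_meta or longranger.

```shell
# Hypothetical helper: reads an uncompressed fastq on stdin and returns
# non-zero if any barcode appears in more than one contiguous block.
# Assumption: headers carry the barcode as "BX:Z:<barcode>".
check_barcode_blocks() {
    awk 'NR % 4 == 1 {
        # Extract the barcode from the header line (empty if no BX tag).
        bc = ""
        if (match($0, /BX:Z:[^ \t]+/)) bc = substr($0, RSTART + 5, RLENGTH - 5)
        # A new block starts at the first record or when the barcode changes.
        if (NR == 1 || bc != prev) {
            if (bc in seen) {
                print "barcode \"" bc "\" reappears at line " NR > "/dev/stderr"
                bad = 1
                exit
            }
            seen[bc] = 1
            prev = bc
        }
    }
    END { exit bad + 0 }'
}
```

Usage would look like `gunzip -c barcoded.fastq.gz | check_barcode_blocks && echo OK`. Reads without a `BX:Z:` tag are treated as sharing one empty barcode, so they too must form a single block to pass.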
Are you generating your input reads through some means other than the 10X genomics pipeline?
Thanks! alex
Yes, we are generating our libraries through a custom pipeline. 10X genomics takes too much time, money, etc.
Meaning the reads are not generated through their 10x machines either? Sounds great!
I don't have a script at the moment, but could look into adding something like that once I take care of all other outstanding issues. Have you produced one in the meantime? In the past I used a solution like the one posted here https://www.biostars.org/p/15011/, which uses the unix paste+sort, but that was with a temporary hack to prepend the barcodes to the query names so that they ended up in barcode-sorted order as well.
Best, alex
Updated the README in commit be4923364853 to specify input fastq must be in barcode-sorted order.
Sorry for the slow reply. To sort the reads by barcode, I moved the barcode to the front of the sequence header, sorted with fastq-sort, and then flipped the barcode back to the end. An example:
cat read1.fq read2.fq | gunzip -c | perl -pe "s/\@(.+) (.+)/\@\$2 \$1/" > TMP.fq && fastq-sort --id TMP.fq | perl -pe "s/\@(.+) (.+)/\@\$2 \$1-1/" > sorted.fq && rm -f TMP.fq
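An alternative that avoids the temporary file and keeps read pairs together is the paste+sort approach mentioned earlier in the thread. The following is only a sketch under stated assumptions: the input is an uncompressed interleaved fastq (read 1 followed by read 2, four lines each), headers contain no tab characters, and the barcode is stored as a `BX:Z:<barcode>` tag in the read 1 header. The function name `sort_interleaved_by_barcode` is hypothetical.

```shell
# Hypothetical helper: sort an interleaved fastq (stdin) by barcode (stdout).
# Assumptions: 8 lines per read pair, no tabs in headers, and the barcode
# is present in the read 1 header as "BX:Z:<barcode>".
sort_interleaved_by_barcode() {
    # Collapse each read pair (8 lines) into one tab-separated record.
    paste - - - - - - - - |
    # Prepend the barcode as a sort key (empty if the header has no BX tag).
    awk -F'\t' -v OFS='\t' '{
        bc = ""
        if (match($1, /BX:Z:[^ \t]+/)) bc = substr($1, RSTART + 5, RLENGTH - 5)
        print bc, $0
    }' |
    # Stable sort on the key column only, so pairs keep their input order
    # within each barcode block.
    LC_ALL=C sort -s -t "$(printf '\t')" -k1,1 |
    # Drop the key and expand each record back into 8 fastq lines.
    cut -f2- | tr '\t' '\n'
}
```

It could be used as `gunzip -c interleaved.fastq.gz | sort_interleaved_by_barcode > sorted.fq`. Pairs without a barcode tag sort first as one block; whether athena_meta accepts such reads is not something this sketch addresses.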
Do you have the exact command line working with an interleaved fastq file output from longranger basic?
@davidvilanova can you clarify what you mean? You should be able to provide a config.json file as described in the README if your outputs are produced from longranger. What issues are you having?
I'm running athena_meta 1.1, and I got the following error:
The README.rst states that the reads have to be interleaved, but I don't see anything in the documentation about having to sort reads by barcodes. Given that this is a requirement for athena_meta, do you have a helper script for sorting the reads by barcode?