Open sivico26 opened 2 years ago
I tried doing the exact same thing with ONT reads just now. A minimap2
based amplicon read-splitter would be lightning fast.
minimap2 -t 64 -x map-ont primers.fasta ONT.fastq.gz
Unfortunately, I also get no hits using 6 primer sequences of about 21bp as a reference to map to.
Some guidance on mapping reads to a reference made up of especially short primer sequences would be awesome @lh3 !
Oh, I missed this question. You might try
minimap2 -cxsr -t16 -k11 -w5 ONT.fastq.gz primers.fasta
Keep 5'- and 3'-end primer sequences interleaved in primers.fasta
Increase -k
if the command line above is too slow.
Still no hits with:
minimap2 -c -x map-ont -t 60 -k 11 -w 5 primers.fasta barcode91_ITS2.fastq.gz
I uploaded the input files in case you wish to take a look @lh3: https://e.pcloud.link/publink/show?code=kZKsBhZOmbzkJGHex4QEOxJyXhHdBw5BO2k
I can visually see the primer regions in the reads... minimap2
is just not picking them up (marked in red in picture):
@Koen-vdl , were you able to find a resolution to this issue, as I am facing the same issue in some of my analysis workflows. Thanks!
Hi @JDavidson2019,
Unfortunately, I did not find a fix. As a (slow) workaround, I'm currently using the following seqkit
approach:
seqkit fish --threads $threads -q 20 -F ATGCTTAAATTTAGGGGGTAGTC input.fastq.gz &> ITS2_matches.txt
cut -f 1 ITS2_matches.txt | sort -u > ITS2_readIDs.txt
seqkit grep -f ITS2_readIDs.txt input.fastq.gz -o filtered_ITS2.fastq.gz
@Koen-vdl ,
Thank you so much for your response. I tinkered around with it a bit more, and was not able to find a solution either. I got an idea from this paper (https://doi.org/10.1101/2020.08.07.20161737) to use a utility called vsearch
and was able to get alignments for your primers using the following command and your test data:
vsearch -maxaccepts 0 --maxrejects 0 --id 0.75 --strand both --wordlength 5 --minwordmatches 2 --usearch_global primers.fasta --db barcode91_ITS2.fastq.gz --alnout test.aln
The command was able to execute quite quickly, so hopefully this workaround will be useful for you.
Thanks again for your help!
Hi,
I have a particular case and wanted to know If I can address this with
minimap2
. I have some Illumina reads that come from several amplicons and I also have a list of primers that generated those amplicons. In the end, I want to separate the reads by primer.My reasoning is to map each primer to all the reads (something like an all vs all) and then parse the
.paf
to see which primer does each read better match.Nevertheless, I am trying some configurations and get no matches at all. I think the fact that the primers are so short is the main issue, but I do not how to properly configure
minimap2
to handle this.So, do you think this is feasible with
minimap2
? If yes, how should I configure it? If no, would you mind suggesting to me an alternative?Thank you in advance for your thoughts. Cheers