rrwick / Badread

a long read simulator that can imitate many types of read problems
GNU General Public License v3.0

How to handle PacBio simulated reads before mapping #31

Open RDorney opened 3 months ago

RDorney commented 3 months ago

Hello @rrwick, I'm seeing poor mapping performance with minimap2 for reads simulated using the pacbio2016 models (changing only depth and mean sequence identity; all other settings left at default). It gets progressively worse as I lower the mean sequence identity (go figure), but reads simulated with the nanopore2020 models don't have this problem.

I've been using this command to align my simulated PacBio reads:

minimap2 -ax splice:hq -uf 

I'm just wondering whether I was meant to run my pacbio2016 reads through the Iso-Seq pipeline before mapping with minimap2?

rrwick commented 3 months ago

Hi Ryley,

While the read identity will affect minimap2 alignments, I wouldn't expect the error model (e.g. pacbio2016 vs nanopore2020) to make too much of a difference.

What exactly do you mean by 'poor mapping performance'? Poor computational performance, i.e. it's slow to complete? Or are you getting lots of unaligned reads?
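One way to answer the "lots of unaligned reads" half of that question is to tally the records in the SAM output by their FLAG bits. The following is only a minimal sketch, not part of Badread or minimap2: the function names `summarize_sam` and `unmapped_fraction` are made up for illustration, and it assumes uncompressed SAM text (as produced by `minimap2 -a`) whose path is passed as the first argument.

```python
import sys
from collections import Counter


def summarize_sam(lines):
    """Tally SAM records as mapped, unmapped, secondary or supplementary."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x100:      # secondary alignment
            counts["secondary"] += 1
        elif flag & 0x800:    # supplementary alignment
            counts["supplementary"] += 1
        elif flag & 0x4:      # read is unmapped
            counts["unmapped"] += 1
        else:                 # primary alignment of a mapped read
            counts["mapped"] += 1
    return counts


def unmapped_fraction(counts):
    """Fraction of reads left unmapped (secondary/supplementary records ignored)."""
    total = counts["mapped"] + counts["unmapped"]
    return counts["unmapped"] / total if total else 0.0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python sam_summary.py alignments.sam
    with open(sys.argv[1]) as handle:
        result = summarize_sam(handle)
    for key in ("mapped", "unmapped", "secondary", "supplementary"):
        print(f"{key}\t{result[key]}")
    print(f"unmapped_fraction\t{unmapped_fraction(result):.3f}")
```

Running this on the alignments from the pacbio2016 and nanopore2020 read sets, at each mean identity, would show whether the difference is in the share of unmapped reads. It says nothing about runtime, which would need to be timed separately.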

And sorry, but I'm not familiar with IsoSeq, so I'm not sure where that should fit into your process 🤷

Ryan