philres / ngmlr

NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore (standard and ultra-long) reads to a reference genome, with a focus on reads that span structural variations.
MIT License

PacBio: ccs/hifi-reads or filtered subreads? #88

Closed MrTomRod closed 3 years ago

MrTomRod commented 3 years ago

I have PacBio RS II raw data: .bax.h5 files. I created two kinds of .fastq files from them:

Which one should I use for ngmlr?

(In case it's relevant: Afterwards, I want to use Sniffles to call structural variants.)

fritzsedlazeck commented 3 years ago

So you sequenced HiFi reads? If that is the case, you need the collapsed fastq. We typically use the ccs program directly.

If it's CLR PacBio reads, then use the subreads.fastq.

Hope this clears it up. Thanks, Fritz

MrTomRod commented 3 years ago

The command "pls2fasta in.bax.h5 out.fasta -trimByRegion -ccs" gave me an error ("ERROR, could not initialize file"), but without -ccs it works. I suppose that means these are not HiFi reads.

I'm sorry for bothering you, it's the first time I'm working with PacBio data and I've inherited data from an old experiment (2016) that was prepared by somebody who has now left the group. :sweat_smile:

fritzsedlazeck commented 3 years ago

No worries at all! You can also map the raw reads with NGMLR. They don't have to be HiFi reads. Thanks, Fritz
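
For anyone landing here later, the advice in this thread could be sketched roughly as the dry run below. It only prints the commands rather than running them. All file names (reference.fasta, ccs.fastq, subreads.fastq, mapped.sam, calls.vcf) and the thread count are placeholders, not taken from the thread. The Sniffles line assumes the Sniffles 1.x options (-m/-v); Sniffles2 uses --input/--vcf instead, so check the help output of your installed version.

```shell
#!/bin/sh
# Dry-run sketch: print (not run) the commands for mapping PacBio reads
# with NGMLR and calling SVs with Sniffles. File names are placeholders.

# Pick the FASTQ that matches the library type, as advised in this thread.
pick_reads() {
    case "$1" in
        hifi) echo "ccs.fastq" ;;       # collapsed (CCS/HiFi) reads
        clr)  echo "subreads.fastq" ;;  # filtered subreads from CLR data
        *)    echo "unknown read type: $1" >&2; return 1 ;;
    esac
}

# Print the mapping and SV-calling steps for the given read type.
print_pipeline() {
    reads=$(pick_reads "$1") || return 1
    echo "ngmlr -t 4 -x pacbio -r reference.fasta -q $reads -o mapped.sam"
    echo "samtools sort -o mapped.sorted.bam mapped.sam"
    echo "samtools index mapped.sorted.bam"
    # Assumption: Sniffles 1.x option names.
    echo "sniffles -m mapped.sorted.bam -v calls.vcf"
}

# RS II data from 2016 is CLR, so the subreads are used here.
print_pipeline clr
```

Once the printed commands look right for your files, they can be copied and run one at a time.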