isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/graphmap2
MIT License

pacbio parameter set #9

Closed lennythomas closed 8 years ago

lennythomas commented 8 years ago

The README says there are parameter sets for PacBio, but when I try -x pacbio, I am told that presets exist only for illumina and oxford. Is there a recommended set of parameters for PacBio?

isovic commented 8 years ago

Hello, the default parameters should work well for PacBio too. I would also recommend trying the option -a anchor: it fixes the more reliable anchors in place and aligns around them instead of using semiglobal alignment, at the cost of being a bit less sensitive.

The only thing that -x illumina does differently is to use Gotoh alignment (instead of the faster bit-vector alignment). With bit-vector, all scores/penalties are equal to 1 (match score = 1, mismatch = -1, gap_extend = -1, gap_open is not used), while Gotoh allows custom penalties as well as a gap open penalty. Using -x illumina is therefore equivalent to passing these parameters: -a gotoh -M 5 -X 4 -G 8 -E 6. However, finding the (rough) alignment position does not depend on these parameters; they mostly just affect the exact base positions within the alignment.
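To make the difference between the two scoring schemes concrete, here is a minimal Python sketch that scores the same gap content under both. It is not GraphMap code. The function name, the (op, length) input format, and in particular the gap convention (a gap of length k costs gap_open + k * gap_extend) are assumptions made for illustration; GraphMap's internal convention may differ.

```python
def score_alignment(ops, match, mismatch, gap_open, gap_extend):
    """Score an alignment given as (op, length) pairs.

    op is '=' (match), 'X' (mismatch), or 'I'/'D' (gap).
    ASSUMPTION: a gap of length k costs gap_open + k * gap_extend.
    Penalties are passed as positive numbers and subtracted.
    """
    total = 0
    for op, length in ops:
        if op == "=":
            total += match * length
        elif op == "X":
            total -= mismatch * length
        elif op in ("I", "D"):
            total -= gap_open + gap_extend * length
        else:
            raise ValueError(f"unknown alignment op: {op!r}")
    return total


# Unit scoring as described for the bit-vector aligner (gap open not used).
UNIT = dict(match=1, mismatch=1, gap_open=0, gap_extend=1)
# Scores equivalent to -M 5 -X 4 -G 8 -E 6.
GOTOH_ILLUMINA = dict(match=5, mismatch=4, gap_open=8, gap_extend=6)

# Two hypothetical alignments: 20 matched bases and 4 deleted bases each.
one_long_gap = [("=", 10), ("D", 4), ("=", 10)]
four_short_gaps = [("=", 5), ("D", 1)] * 4

for name, scheme in (("unit", UNIT), ("gotoh", GOTOH_ILLUMINA)):
    print(name,
          score_alignment(one_long_gap, **scheme),
          score_alignment(four_short_gaps, **scheme))
```

With unit scores the two alignments are indistinguishable, whereas with a gap open penalty the single long gap scores higher than four scattered ones. That is the kind of base-level difference these parameters control.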

In short, you can try these possibilities:

graphmap -a anchor -r reference.fa -d reads.fasta -o out.sam
graphmap -a gotoh -M 5 -X 4 -G 8 -E 6 -r reference.fa -d reads.fasta -o out.sam
graphmap -a anchorgotoh -M 5 -X 4 -G 8 -E 6 -r reference.fa -d reads.fasta -o out.sam

Alternatively, you can use a set of parameters that people often use for aligning nanopore reads (BWA-MEM now uses it for PacBio as well):

-M 1 -X 1 -G 1 -E 1

Determining the 'right' parameters for the best alignment requires experimentation. If you find a set of parameters that works better than those suggested above, please let me know and I can implement it as a preset similar to -x illumina.

P.S. Could you please point me to where -x pacbio is mentioned? I'd like to correct it, as it is not used in the current version (it was present in an older version).

Best regards, Ivan.