isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/graphmap2

Failed to replicate the lambda alignment from the GraphMap publication #94

Open Shengpei-Luke-Chen opened 5 years ago

Shengpei-Luke-Chen commented 5 years ago

Hi GraphMap team,

I am currently exploring options for ONT read mapping for my company. I am very impressed by GraphMap's performance, and the article is very helpful, so thank you for all the amazing work.

Where I am stuck right now is replicating the lambda alignment from GraphMap's publication. The number of mapped bases (coverage) I get for the lambda reads is very different from the article.

Dataset: Downloaded directly from ref-12, both the 1d and 2d data. That amounts to about 4,500X coverage.
Lambda reference: Same as the article, NC_001416 from NCBI.
GraphMap version: v0.5.2
Alignment command line: As the article states, all defaults.
graphmap align -r NC_001416.fasta -d 1d.fastq -o 1d.graphmap.sam
graphmap align -r NC_001416.fasta -d 2d.fastq -o 2d.graphmap.sam
Mapped base/coverage calculation: Since the article does not state how the mapped bases or coverage were calculated, I simply used samtools stats without any filter.

Symptom: My coverage is far lower than the coverage reported in the article.

Please let me know if you need me to provide more information. Thank you in advance. Luke

1d.graphmap.status.txt

raw total sequences:    29458
filtered sequences: 0
sequences:  29458
is sorted:  0
1st fragments:  29458
last fragments: 0
reads mapped:   11843
reads mapped and paired:    0   # paired-end technology bit set + both mates mapped
reads unmapped: 17615
reads properly paired:  0   # proper-pair bit set
reads paired:   0   # paired-end technology bit set
reads duplicated:   0   # PCR or optical duplicate bit set
reads MQ0:  0   # mapped and MQ=0
reads QC failed:    0
non-primary alignments: 0
total length:   155370698   # ignores clipping
total first fragment length:    155370698   # ignores clipping
total last fragment length: 0   # ignores clipping
bases mapped:   65689800    # ignores clipping
bases mapped (cigar):   65139719    # more accurate
bases trimmed:  0
bases duplicated:   0
mismatches: 32663756    # from NM fields
error rate: 5.014415e-01    # mismatches / bases mapped (cigar)
average length: 5274
average first fragment length:  5274
average last fragment length:   0
maximum length: 110057
maximum first fragment length:  0
maximum last fragment length:   0
average quality:    4.1
insert size average:    0.0
insert size standard deviation: 0.0
inward oriented pairs:  0
outward oriented pairs: 0
pairs with other orientation:   0
pairs on different chromosomes: 0
percentage of properly paired reads (%):    0.0

2d.graphmap.status.txt

raw total sequences:    11094
filtered sequences: 0
sequences:  11094
is sorted:  0
1st fragments:  11094
last fragments: 0
reads mapped:   2227
reads mapped and paired:    0   # paired-end technology bit set + both mates mapped
reads unmapped: 8867
reads properly paired:  0   # proper-pair bit set
reads paired:   0   # paired-end technology bit set
reads duplicated:   0   # PCR or optical duplicate bit set
reads MQ0:  0   # mapped and MQ=0
reads QC failed:    0
non-primary alignments: 0
total length:   55854289    # ignores clipping
total first fragment length:    55854289    # ignores clipping
total last fragment length: 0   # ignores clipping
bases mapped:   12783088    # ignores clipping
bases mapped (cigar):   12017411    # more accurate
bases trimmed:  0
bases duplicated:   0
mismatches: 5832782 # from NM fields
error rate: 4.853610e-01    # mismatches / bases mapped (cigar)
average length: 5034
average first fragment length:  5035
average last fragment length:   0
maximum length: 21158
maximum first fragment length:  0
maximum last fragment length:   0
average quality:    7.1
insert size average:    0.0
insert size standard deviation: 0.0
inward oriented pairs:  0
outward oriented pairs: 0
pairs with other orientation:   0
pairs on different chromosomes: 0
percentage of properly paired reads (%):    0.0
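
To make the coverage comparison concrete, here is a minimal sketch of how the summary numbers above could be turned into a mapped fraction and a mean depth. It is not the method used in the article, which does not describe its calculation. The helper names `parse_stats` and `summarize` are hypothetical, and the reference length of 48,502 bp for NC_001416 is an assumption that should be checked against the FASTA file actually used.

```python
# Assumed length of the lambda reference NC_001416 in bp; verify against your FASTA.
LAMBDA_LENGTH = 48502


def parse_stats(text):
    """Parse 'key: value  # comment' summary lines into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]           # drop trailing comments
        key, sep, value = line.partition(":")  # split on the first colon
        if not sep or not value.strip():
            continue                           # skip lines without a value
        try:
            stats[key.strip()] = float(value.split()[0])
        except ValueError:
            continue                           # skip non-numeric values
    return stats


def summarize(stats, ref_len=LAMBDA_LENGTH):
    """Derive mapped fractions and mean depths from the parsed summary."""
    mapped = stats["bases mapped (cigar)"]
    return {
        "mapped_read_fraction": stats["reads mapped"] / stats["sequences"],
        "mapped_base_fraction": mapped / stats["total length"],
        "mean_depth": mapped / ref_len,                   # aligned bases per reference base
        "input_depth": stats["total length"] / ref_len,   # depth if every base had mapped
    }


if __name__ == "__main__":
    # A few lines copied from the 1d summary above.
    sample_1d = """
    sequences:  29458
    reads mapped:   11843
    total length:   155370698   # ignores clipping
    bases mapped (cigar):   65139719    # more accurate
    """
    for name, value in summarize(parse_stats(sample_1d)).items():
        print(f"{name}: {value:.4f}")
```

Under that length assumption, the `bases mapped (cigar)` values above correspond to a mean depth of roughly 1,343X for the 1d data and roughly 248X for the 2d data, compared with the roughly 4,500X expected from the full dataset.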