GraphMap - A highly sensitive and accurate mapper for long, error-prone reads http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
Hi GraphMap team,
I am currently exploring options for mapping ONT reads for my company.
I am very impressed by GraphMap's performance, and the article is very helpful.
Thank you for all the amazing work.
What I am stuck on right now is replicating the lambda alignment from GraphMap's publication. The mapped bases (coverage) I get for the lambda reads are very different from those in the article.
Dataset: Downloaded directly from ref-12, both the 1d and 2d data. It should be about 4,500X coverage.
Lambda reference: Same as in the article, NC_001416 from NCBI.
GraphMap version: v0.5.2
Alignment command line: As the article states, all defaults.
graphmap align -r NC_001416.fasta -d 1d.fastq -o 1d.graphmap.sam
graphmap align -r NC_001416.fasta -d 2d.fastq -o 2d.graphmap.sam
Mapped base/coverage calculation: Since the article does not state how the mapped bases or coverage were calculated, I simply used samtools stats without any filter.
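For anyone who wants to pull the same numbers out of the attached stats files programmatically, here is a minimal sketch. It assumes the summary lines look like the attachments (`key: value # comment`) and also tolerates the `SN` tab prefix that raw `samtools stats` output carries. The helper name `parse_summary_numbers` is hypothetical and not part of samtools or GraphMap.

```python
def parse_summary_numbers(text):
    """Parse 'key: value  # comment' summary lines into a dict of floats.

    Assumption: the input is the summary-numbers section of samtools stats,
    either raw ('SN<TAB>key:<TAB>value') or with the prefix already removed.
    Lines without a numeric value (file names, other sections) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("SN\t"):
            line = line[3:].strip()           # drop the raw 'SN' prefix
        if ":" not in line:
            continue
        key, _, value = line.rpartition(":")
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            continue                          # not a numeric summary line
    return stats


# A few lines copied from the attached 1d stats file.
sample = """raw total sequences: 29458
reads mapped: 11843
total length: 155370698 # ignores clipping
bases mapped: 65689800 # ignores clipping
bases mapped (cigar): 65139719 # more accurate
error rate: 5.014415e-01 # mismatches / bases mapped (cigar)
"""

stats = parse_summary_numbers(sample)
fraction_mapped = stats["bases mapped"] / stats["total length"]
print(f"fraction of bases mapped: {fraction_mapped:.4f}")
```

The resulting dictionary keeps both `bases mapped` and `bases mapped (cigar)`, so either one can be used as the numerator when comparing against published figures.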
Symptom: My coverage is much lower than in the article.
Article (specifically, Figure 3b and Supplementary Table-s3):
% bases mapped = 68.1%
Avg. Coverage = 2552.8
My results (detailed samtools stats attached):
1d reads:
% bases mapped = 65,689,800/155,370,698 = 42.28%
Avg. Coverage = 65,689,800/48,502 = 1354.37
2d reads:
% bases mapped = 12,783,088/55,854,289 = 22.89%
Avg. Coverage = 12,783,088/48,502 = 263.56
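The arithmetic above can be reproduced with a short sketch. It assumes exactly the formulas used in this post (bases mapped divided by total read bases, and bases mapped divided by the 48,502 bp reference length). It is not necessarily how the article computed its figures, since the article does not say. The function name `mapping_summary` is made up for illustration.

```python
# Reference length used in the calculations above (NC_001416).
LAMBDA_GENOME_LENGTH = 48_502


def mapping_summary(bases_mapped, total_bases, reference_length=LAMBDA_GENOME_LENGTH):
    """Return (% bases mapped, average coverage) from samtools stats counts."""
    percent_mapped = 100.0 * bases_mapped / total_bases
    average_coverage = bases_mapped / reference_length
    return percent_mapped, average_coverage


# (bases mapped, total length) taken from the attached samtools stats files.
runs = {
    "1d": (65_689_800, 155_370_698),
    "2d": (12_783_088, 55_854_289),
}

for name, (mapped, total) in runs.items():
    pct, cov = mapping_summary(mapped, total)
    print(f"{name}: {pct:.2f}% bases mapped, {cov:.2f}X average coverage")
# 1d: 42.28% bases mapped, 1354.37X average coverage
# 2d: 22.89% bases mapped, 263.56X average coverage
```

Passing the `bases mapped (cigar)` count instead of `bases mapped` gives the variant that samtools itself labels as more accurate.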
Please let me know if you need more information.
Thank you in advance.
Luke
1d.graphmap.status.txt
raw total sequences: 29458
filtered sequences: 0
sequences: 29458
is sorted: 0
1st fragments: 29458
last fragments: 0
reads mapped: 11843
reads mapped and paired: 0 # paired-end technology bit set + both mates mapped
reads unmapped: 17615
reads properly paired: 0 # proper-pair bit set
reads paired: 0 # paired-end technology bit set
reads duplicated: 0 # PCR or optical duplicate bit set
reads MQ0: 0 # mapped and MQ=0
reads QC failed: 0
non-primary alignments: 0
total length: 155370698 # ignores clipping
total first fragment length: 155370698 # ignores clipping
total last fragment length: 0 # ignores clipping
bases mapped: 65689800 # ignores clipping
bases mapped (cigar): 65139719 # more accurate
bases trimmed: 0
bases duplicated: 0
mismatches: 32663756 # from NM fields
error rate: 5.014415e-01 # mismatches / bases mapped (cigar)
average length: 5274
average first fragment length: 5274
average last fragment length: 0
maximum length: 110057
maximum first fragment length: 0
maximum last fragment length: 0
average quality: 4.1
insert size average: 0.0
insert size standard deviation: 0.0
inward oriented pairs: 0
outward oriented pairs: 0
pairs with other orientation: 0
pairs on different chromosomes: 0
percentage of properly paired reads (%): 0.0
2d.graphmap.status.txt
raw total sequences: 11094
filtered sequences: 0
sequences: 11094
is sorted: 0
1st fragments: 11094
last fragments: 0
reads mapped: 2227
reads mapped and paired: 0 # paired-end technology bit set + both mates mapped
reads unmapped: 8867
reads properly paired: 0 # proper-pair bit set
reads paired: 0 # paired-end technology bit set
reads duplicated: 0 # PCR or optical duplicate bit set
reads MQ0: 0 # mapped and MQ=0
reads QC failed: 0
non-primary alignments: 0
total length: 55854289 # ignores clipping
total first fragment length: 55854289 # ignores clipping
total last fragment length: 0 # ignores clipping
bases mapped: 12783088 # ignores clipping
bases mapped (cigar): 12017411 # more accurate
bases trimmed: 0
bases duplicated: 0
mismatches: 5832782 # from NM fields
error rate: 4.853610e-01 # mismatches / bases mapped (cigar)
average length: 5034
average first fragment length: 5035
average last fragment length: 0
maximum length: 21158
maximum first fragment length: 0
maximum last fragment length: 0
average quality: 7.1
insert size average: 0.0
insert size standard deviation: 0.0
inward oriented pairs: 0
outward oriented pairs: 0
pairs with other orientation: 0
pairs on different chromosomes: 0
percentage of properly paired reads (%): 0.0