isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads
http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html
Note: This is the original repository, which is no longer officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/graphmap2
MIT License
178 stars, 44 forks

Sam header when mapping to transcriptome #51

Closed: wdecoster closed this issue 7 years ago

wdecoster commented 7 years ago

I tried mapping to the transcriptome, and it seems to have worked. However, while the alignments are reported against the individual chromosomes, the SAM header still lists the transcript IDs as reference sequences. Example:

@HD     VN:1.0  SO:unknown
@SQ     SN:ENST00000456328_1    LN:1657
@SQ     SN:ENST00000515242_1    LN:1653
@SQ     SN:ENST00000518655_1    LN:1483
@SQ     SN:ENST00000450305_1    LN:632
@SQ     SN:ENST00000438504_1    LN:1783

Since the reads have the expected "chromosome" identifiers in the SAM fields, and those identifiers are missing from the header, converting the SAM file to BAM or calling samtools flagstat prints many warnings like the one below:

[W::sam_parse1] urecognized reference name; treated as unmapped
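
For anyone who wants to confirm this kind of mismatch before running samtools, here is a minimal sketch in plain Python that compares the reference names used in the alignment records (column 3, RNAME) with the names declared on the `@SQ` header lines. The function name `undeclared_references` and the toy record (`read1`, `chr1`) are made up for illustration; they are not part of GraphMap or samtools.

```python
def undeclared_references(sam_lines):
    """Return RNAME values used by alignments but missing from the @SQ header."""
    declared = set()  # names from @SQ SN: tags
    used = set()      # names from column 3 of alignment records
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            fields = line.split("\t")
            if fields[0] == "@SQ":
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        declared.add(field[3:])
            continue
        cols = line.split("\t")
        # "*" in RNAME means the read is unmapped, so it needs no header entry
        if len(cols) > 2 and cols[2] != "*":
            used.add(cols[2])
    return sorted(used - declared)


# Toy input (hypothetical read and chromosome names)
example = [
    "@HD\tVN:1.0\tSO:unknown",
    "@SQ\tSN:ENST00000456328_1\tLN:1657",
    "read1\t0\tchr1\t100\t40\t10M\t*\t0\t0\tACGTACGTAC\t*",
]
print(undeclared_references(example))
```

With a real file, the same function can be called on an open file handle, e.g. `undeclared_references(open("out.sam"))`, where `out.sam` is a placeholder path. A non-empty result means the header and the records disagree, which is the situation that triggers the warning above.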

isovic commented 7 years ago

Hi,

Ugh, I missed that one. I'll fix it soon and let you know. Thank you for the report!

Best regards, Ivan.

isovic commented 7 years ago

Sorry it took a bit... Could you try pulling the latest commit and giving it a spin?

Best regards, Ivan.

wdecoster commented 7 years ago

SAM/BAM files are okay now! The alignment looks fine. I'm going to rerun with some trimming at the ends of the reads: I suspect that for many reads the end of the read is soft-clipped because of lower quality or adapter sequence at the read ends.
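
One way to check the soft-clipping suspicion before trimming is to read the clip lengths directly from the CIGAR strings (column 6 of each SAM record). The following is a rough sketch under that assumption; `end_soft_clips` is a hypothetical helper name, not something provided by GraphMap or samtools.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def end_soft_clips(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside the soft clips, so drop them first
    ops = [(n, op) for n, op in ops if op != "H"]
    if not ops:
        return (0, 0)  # e.g. "*" for an unmapped read
    leading = ops[0][0] if ops[0][1] == "S" else 0
    trailing = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (leading, trailing)


print(end_soft_clips("5S90M12S"))
```

Note that CIGAR strings are written in reference orientation, so for reads aligned to the reverse strand the "trailing" clip corresponds to the start of the original read. Summarising these counts over all records gives a rough picture of how much sequence is being clipped at each end.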