zyndagj / BSMAPz

Updated and optimized fork of BSMAP

methratio.py 'S not a valid CIGAR character' #15

Open mdraminski opened 4 years ago

mdraminski commented 4 years ago

When I run this script on my BAM file, I get a ValueError and the script ends with the message: 'S not a valid CIGAR character'. The script only handles the M, I, and D characters and crashes on any other; however, S, H, and X are also allowed by the SAM file format: https://drive5.com/usearch/manual/cigar.html

I added some prints inside the script, and my seq and cigar look like this, for example:

seq: ATCCCAACAACACTCCAACCTCAACATAAACCAACCCCAACATAAACCAACCCCAACATAAACCTACCTCAACAT
cigar: 43M32S
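For readers who hit the same error, here is a minimal sketch of a CIGAR tokenizer that accepts the full operation set from the SAM specification (M, I, D, N, S, H, P, =, X) instead of only M, I, and D. It is not the code in methratio.py; the function names `parse_cigar`, `query_length`, and `reference_length` are hypothetical, and it assumes Python 3 (for `re.fullmatch`).

```python
import re

# Per the SAM specification: operations that consume read (query) bases
# and operations that consume reference bases.
CONSUMES_QUERY = set("MIS=X")
CONSUMES_REF = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string such as '43M32S' into (length, op) tuples."""
    # Reject anything that is not a sequence of <number><operation> pairs.
    if not re.fullmatch(r"(?:\d+[MIDNSHP=X])+", cigar):
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def query_length(ops):
    """Number of read bases described by the CIGAR (soft clips included)."""
    return sum(n for n, op in ops if op in CONSUMES_QUERY)


def reference_length(ops):
    """Number of reference bases spanned by the alignment."""
    return sum(n for n, op in ops if op in CONSUMES_REF)


ops = parse_cigar("43M32S")
print(ops)                    # [(43, 'M'), (32, 'S')]
print(query_length(ops))      # 75
print(reference_length(ops))  # 43
```

For the read above, the query length of 75 equals the length of seq, while only the first 43 bases are aligned to the reference; the 32 soft-clipped bases are present in the read but take up no reference positions.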

zyndagj commented 4 years ago

You are correct, my methratio.py parser should be handling those characters.

https://github.com/zyndagj/BSMAPz/blob/master/methratio.py#L431

I'll work on modifying that script to correctly support those additional cases after the new year.

mdraminski commented 4 years ago

I can implement the fix; however, I am not sure how to treat them. So far, for my own purposes, I ignore them, but I do not think that is the best solution. What do you think about these unmatched readings? Sorry for the mess with reopening.

zyndagj commented 4 years ago

The original version ignored them and assumed they only existed at the end of a read. I don't currently have time to implement a fix, but am happy to accept a pull request if you want to give it a try.
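One way to read "ignore them" is to drop clipped bases before walking the alignment, without assuming they occur only at the end of a read. The sketch below does that for both ends. It is an illustration under stated assumptions, not the original BSMAP behavior or a proposed patch: `strip_clipping` is a hypothetical helper, and it assumes the CIGAR string is already well formed.

```python
import re


def strip_clipping(seq, cigar):
    """Return (aligned_seq, core_cigar) with S/H clips removed from both ends.

    Soft-clipped (S) bases are present in seq and are sliced off.
    Hard-clipped (H) bases are absent from seq, so only the operation is dropped.
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    start, end = 0, len(seq)

    # Leading clips: advance the start of the slice past soft-clipped bases.
    while ops and ops[0][1] in "SH":
        n, op = ops.pop(0)
        if op == "S":
            start += n

    # Trailing clips: pull the end of the slice back past soft-clipped bases.
    while ops and ops[-1][1] in "SH":
        n, op = ops.pop()
        if op == "S":
            end -= n

    core_cigar = "".join("%d%s" % (n, op) for n, op in ops)
    return seq[start:end], core_cigar


seq = "ATCCCAACAACACTCCAACCTCAACATAAACCAACCCCAACATAAACCAACCCCAACATAAACCTACCTCAACAT"
aligned, core = strip_clipping(seq, "43M32S")
print(len(aligned), core)  # 43 43M
```

The remaining CIGAR contains only the operations between the clips, so a parser that understands M, I, and D can process it unchanged. Whether discarding clipped bases is the right choice for methylation calling is the open question in this thread.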

ChemaMD commented 4 years ago

Hi, sorry to bother you with this issue, but I'm running into the same problem. Unfortunately, I do not have the knowledge to implement the fix. Is there any plan to update the script in the near future? I'm aware this is probably not the best moment given the COVID-19 crisis, but I'm just asking. Apologies, and thank you very much for the work to update BSMAP and for your help.

magnusdv commented 2 years ago

Hi, any updates on this issue? I'm running into the same problem. Thanks for keeping BSMAP alive!