dincarnato / RNAFramework

RNA structure probing and post-transcriptional modifications mapping high-throughput data analysis
http://www.rnaframework.com
GNU General Public License v3.0
31 stars 11 forks source link

rf-wiggle outputs ratios greater than 1. #50

Closed mohamedbahru closed 8 months ago

mohamedbahru commented 8 months ago

Hi, I used rf-wiggle with a flag -r to get the mutation ratio from .rc file.

I found a lot of mutation ratios greater than 1. Shouldn't the mutation ratio values should be between 0 and 1?

In the IGV viewer, I looked at the alignments. I found that those sites that are having very high ratios (>500) are having 0 read coverage in the IGV viewer.

In my case, I have long reads from oxford nanopore sequencing, aligned to the genome (alignment with splicing enabled) using minimap2 aligner. I did the mutation counting using rf-count-genome tool.

Have I done something wrong?

Could you please help me?

Thanks, Mohamed

dincarnato commented 8 months ago

Mohamed,

I doubt the issue is with rf-wiggle. Something is going on with minimap + rf-count. Can you please share with me a sample BAM file that is giving the issue, plus the reference fasta file? I will see what's going on.

Also, keep in mind that IGV doesn't show all the reads. Some reads are not displayed due to certain filters.

Best, Danny

mohamedbahru commented 8 months ago

Hi Danny,

Thanks for your reply. Could you please let me know, how I should send the bam file and the reference fasta file?

Thanks and regards, Mohamed

dincarnato commented 8 months ago

Just send it to me at dincarnato@rnaframework.com

dincarnato commented 8 months ago

Not sure if you sent it but I didn't receive anything yet. You can also use drive.

mohamedbahru commented 8 months ago

Sorry for the late reply. I have sent to your email. Have you received the data?

dincarnato commented 8 months ago

Can you give me the exact command line you are using with rf-count-genome so that I can reproduce the issue?

mohamedbahru commented 8 months ago

rf-count-genome -p 6 -o q_0_eq_0 -ow -f mm39.fa -m -nd -ni -q 0 -eq 0 filtered.bam:u > error.log

dincarnato commented 8 months ago

Ok thanks. Will investigate, but taking a quick look at your BAM file I suspect it might be linked to very large insertions. Can you try to map your reads to the sole transcript and run rf-count instead and see if the issue occurs there as well?

dincarnato commented 8 months ago

Mohamed, can I also get the wiggle file you generated? Thanks, Danny

dincarnato commented 8 months ago

Hi Moahamed,

thanks for your patience. The bug was caused by the multiple gaps in the reads (N cigar operations) that were not properly handled for reads harboring more than one. It should have been fixed now. Can you please try and let me know?

Thanks a lot for reporting this bug! It would have been almost impossible to catch without such a complex example.

All the best, Danny

mohamedbahru commented 8 months ago

Thanks Danny. I will try the updated tool and let you know if there are any issues.

mohamedbahru commented 8 months ago

Thanks Danny for your assistance. The tool works fine now.

dincarnato commented 8 months ago

That's great. Thanks for reporting the bug.

Best, Danny