zhongzhd / ont_m6a_detection

Question about Slide_Variants.py in Epinano #4

Closed kwonej0617 closed 1 year ago

kwonej0617 commented 1 year ago

Hi, I have run your Epinano script on my data. Running slide_variants.py takes a very long time. Did you experience the same issue? If so, could you give me some advice on how you solved it? The Epinano author seems to recommend splitting the BAM file into multiple BAM files to reduce processing time for this step. That would be much easier if I could split the BAM file by chromosome. However, because I used a transcriptome reference with minimap2, I am not sure how to split the BAM file by transcript ID. I look forward to hearing from you. Thank you!

zhongzhd commented 1 year ago

Yes, I experienced the same issue; slide_variants.py took me a very long time too. You can try splitting the BAM file by transcript ID. I don't think it is hard: convert the BAM file to a SAM file with Samtools, then use Awk to separate the records by transcript ID.
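The splitting step described above can be sketched roughly as follows. This is only an illustration in Python rather than Awk, not the script used in this repository. It assumes the input is SAM text (for example, the output of `samtools view -h`), where header lines start with `@` and column 3 holds the reference name, which is the transcript ID when reads were aligned to a transcriptome. The function name `split_sam_by_reference` and the example records are hypothetical.

```python
from collections import defaultdict


def split_sam_by_reference(sam_lines):
    """Group SAM lines by reference name (column 3, RNAME).

    Returns a dict mapping each reference name (here: transcript ID)
    to a list of lines: the full original header followed by the
    alignment records for that reference. Unmapped reads (RNAME "*")
    are skipped.
    """
    header = []
    records = defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line: keep a copy for every output group.
            header.append(line)
            continue
        rname = line.split("\t")[2]
        if rname == "*":
            continue
        records[rname].append(line)
    return {rname: header + recs for rname, recs in records.items()}


# Hypothetical toy input: two transcripts and one unmapped read.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:tx1\tLN:1000",
    "@SQ\tSN:tx2\tLN:2000",
    "read1\t0\ttx1\t1\t60\t4M\t*\t0\t0\tACGT\t!!!!",
    "read2\t0\ttx2\t5\t60\t4M\t*\t0\t0\tACGT\t!!!!",
    "read3\t0\ttx1\t9\t60\t4M\t*\t0\t0\tACGT\t!!!!",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!",
]
groups = split_sam_by_reference(example)
```

Each group could then be written to its own SAM file and converted back to BAM. With many transcripts, it may be more practical to batch several transcript IDs per output file than to create one file per transcript.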

kwonej0617 commented 1 year ago

Thank you for your reply. I also have a question. It appears you used a transcriptome reference to find m6A sites at the transcript level and then converted them to the genomic level for evaluation against antibody + NGS-based methods. I might be wrong, but I wonder whether you had specific reasons for not detecting m6A sites at the genomic level using the reference genome. Thank you!

zhongzhd commented 1 year ago

You can find the answer in our published paper: some tools do not support detecting m6A at the genomic level with a reference genome, and we wanted to treat all tools equally.

kwonej0617 commented 1 year ago

Got it! Thank you for your reply!