GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License

m6anet, eventalign, galaxy #155

Open mocherry opened 4 months ago

mocherry commented 4 months ago

Dear m6anet-team,

I have a question regarding m6anet, which I am trying to use in Galaxy to reproduce an RNA-sequencing study in order to get familiar with this topic. In your manual you write for data-prep that one "will need: reads.fastq: fastq file generated from basecalling the raw .fast5 files; reads.sorted.bam: sorted bam file obtained from aligning reads.fastq to the reference transcriptome file; transcript.fa: reference transcriptome file".

I was told that when Dorado basecalls fast5/pod5 files, information about modified bases is not preserved in the fastq files; only the bam files produced by Dorado contain this information, as a tag.

So, what is the exact pipeline to produce the fastq files? Will eventalign in Galaxy get the modification from the supplied fast5 files, i.e. is it sufficient to do basecalling (Dorado) with fastq as the output option? How do you suggest producing the sorted bam file? What would you use for alignment? Minimap?

Please excuse if these questions sound somewhat naive, but I have had a hard time so far getting m6A-info from the sequencing data I have available and am not at all familiar with Python, which is why I want to do the analysis in galaxy.

Thanks for your help and consideration, Matthias

yuukiiwa commented 4 months ago

Hi Matthias (tagging you here @mocherry),

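  1. You can pass the --emit-fastq flag to dorado basecaller to emit a fastq file. This is sufficient for running nanopolish and m6anet downstream: m6anet works from the raw signal that nanopolish eventalign reads from the fast5 files, so the modification tags in Dorado's bam output are not needed.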

  2. You can use minimap2 and samtools to get a sorted.bam file:

    minimap2 -ax map-ont -uf -t 3 --secondary=no <MMI> <PATH/TO/FASTQ.GZ> > <PATH/TO/SAM> 2>> <PATH/TO/SAM_LOG>
    samtools view -Sb <PATH/TO/SAM> > <PATH/TO/BAM>
    samtools sort <PATH/TO/BAM> -o <PATH/TO/SORTED.BAM> 
    samtools index <PATH/TO/SORTED.BAM>
  3. You can then run nanopolish index on the fastq file and the fast5 files (if you have pod5 files, first convert them to fast5 with pod5 convert to_fast5).

  4. Then, you can run nanopolish eventalign with the fast5, fastq, and sorted.bam, which will give you an eventalign.txt file to input to m6anet dataprep.
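
For orientation, the four steps above can be strung together roughly as follows. This is only a sketch under assumptions, not a tested pipeline: every file name, the `MODEL` placeholder, the `RAW_DIR`/`FAST5_DIR` variables, and the `run`/`run_to` helpers are made up for illustration, and the exact options may differ between tool versions. The nanopolish eventalign and m6anet options follow the m6anet documentation as far as I know. By default the script only prints the commands (dry run); set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch: raw signal -> fastq -> sorted BAM -> eventalign.txt -> m6anet.
# All names below are placeholders (assumptions), not values from this thread.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands only, 0 = execute
MODEL="${MODEL:-<DORADO_RNA_MODEL>}"  # hypothetical placeholder for a Dorado RNA model
RAW_DIR="${RAW_DIR:-raw}"             # raw signal files passed to dorado
FAST5_DIR="${FAST5_DIR:-fast5}"       # fast5 files for nanopolish (convert pod5 first
                                      # with "pod5 convert to_fast5" if needed)
REF="${REF:-transcript.fa}"           # reference transcriptome
THREADS="${THREADS:-3}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Like run, but redirect the command's stdout to the file named first.
run_to() {
  local out="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then echo "+ $* > $out"; else "$@" > "$out"; fi
}

pipeline() {
  # 1. Basecall to fastq.
  run_to reads.fastq dorado basecaller "$MODEL" "$RAW_DIR" --emit-fastq

  # 2. Align to the transcriptome, then sort and index the SORTED bam.
  run minimap2 -d ref.mmi "$REF"
  run_to reads.sam minimap2 -ax map-ont -uf -t "$THREADS" --secondary=no ref.mmi reads.fastq
  run samtools sort -o reads.sorted.bam reads.sam
  run samtools index reads.sorted.bam

  # 3. Link the basecalled reads to their raw signal.
  run nanopolish index -d "$FAST5_DIR" reads.fastq

  # 4. Align signal events to the reference.
  run_to eventalign.txt nanopolish eventalign --reads reads.fastq \
    --bam reads.sorted.bam --genome "$REF" \
    --scale-events --signal-index --threads "$THREADS"

  # 5. Hand eventalign.txt to m6anet.
  run m6anet dataprep --eventalign eventalign.txt --out_dir m6anet_dataprep --n_processes "$THREADS"
  run m6anet inference --input_dir m6anet_dataprep --out_dir m6anet_result --n_processes "$THREADS"
}

pipeline
```

Running it unchanged just lists the commands in order, which can help to check that each step's output file is the next step's input before committing to a long run.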

Not sure whether you are open to using the command line, but you can check out nf-core/nanoseq, which does all the steps for you.

Thanks!

Best wishes, Yuk Kei

mocherry commented 4 months ago

Hi Yuk Kei,

thanks a lot. I will give it a try. I am not too familiar with command line stuff, so I will look into nf-core/nanoseq and hope that I understand what I have to do there. Maybe I can come back with more questions once I have tried and gotten stuck. Best, Matthias

mocherry commented 4 months ago

Hi Yuk Kei,

As you may have seen from my recent posts, I still have problems with m6anet. The analysis in Galaxy has now been running for 3 or 4 days with no result, though also with no error. I tried to follow your previous suggestion, but got stuck again, even when using the sample data. I guess it has to do with me working in a Windows environment, but alas I do not have enough coding skills to solve my problems. What is strange, though, is that the Galaxy implementation does not work. Now I am quite desperate and at a loss as to how to analyze our data. Whatever Nanopore offers for Windows is not working either. Can you help?

Thanks and best, Matthias