hasindu2008 / f5c

Ultra-fast methylation calling and event alignment tool for nanopore sequencing data (supports CUDA acceleration)
https://hasindu2008.github.io/f5c/docs/overview
MIT License

prepare for dRNAseq data preprocessing #184

Seongmin-Jang-1165 closed 2 weeks ago

Seongmin-Jang-1165 commented 3 weeks ago

Dear developer,

Hello, I want to preprocess dRNAseq data for m6A analysis using m6Anet.

I need to run f5c eventalign, so I tried to create the [reads.sorted.bam] file with the command below, following the options for noisy direct RNA-seq in the minimap2 manual (https://github.com/lh3/minimap2):

/home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/minimap2-2.28_x64-linux/minimap2 -t 4 -ax splice -uf -k14 /home/rnagenomics/sm/Nanopore/minimap2/GRCh38.primary_assembly.transcriptome.mmi /home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/POD5/Real/basecalling/DORADO_barcode1.bam > /home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/minimpa2/Real_Barcoded/for_dRNAseq/minimap2_barcode1.sam

But it does not proceed any further, and I assume the minimap2 options are the reason.

What are the best minimap2 options for dRNAseq data to run f5c eventalign?

hasindu2008 commented 3 weeks ago

Hello,

Minimap2 does not support UBAM files as the input [/home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/POD5/Real/basecalling/DORADO_barcode1.bam]. You will need to first convert the UBAM file to FASTQ using samtools fastq and then provide that FASTQ to minimap2.
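A minimal sketch of that conversion and the steps that follow it, under stated assumptions: the file names are hypothetical placeholders, the minimap2 options are simply the ones from the question (not a recommendation), and the `run` helper with its `DRY_RUN` switch is illustrative only. It prints the commands by default and executes them when `DRY_RUN=0`. To keep every step a plain command, it uses `samtools fastq -0` and `minimap2 -o` instead of shell redirection.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file names; substitute your own paths.
UBAM="DORADO_barcode1.bam"                          # unaligned BAM from the basecaller
REF="GRCh38.primary_assembly.transcriptome.mmi"     # minimap2 index of the reference
FASTQ="barcode1.fastq"
SAM="barcode1.sam"
BAM="barcode1.sorted.bam"

# Illustrative helper: print each command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. uBAM -> FASTQ. -0 collects reads that carry neither the READ1 nor the
#    READ2 flag, which is the case for single-end nanopore reads.
run samtools fastq -0 "$FASTQ" "$UBAM"

# 2. Align the FASTQ (not the uBAM) with the options used in the question.
run minimap2 -t 4 -ax splice -uf -k14 -o "$SAM" "$REF" "$FASTQ"

# 3. Sort and index the alignments so f5c eventalign can read them.
run samtools sort -o "$BAM" "$SAM"
run samtools index "$BAM"
```

Run it once as-is to inspect the printed commands, then rerun with `DRY_RUN=0` once the paths are correct.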

Seongmin-Jang-1165 commented 3 weeks ago

Thanks for the advice! I must have overlooked that by mistake. It works very well now.

I have one more question.

I ran f5c eventalign on my data for the RNA004 chemistry, but there is an error. Could you explain this problem?

[code] f5c eventalign --rna -b /home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/minimpa2/Real_Barcoded/for_dRNAseq/samtools_barcode2_sort.bam -r /home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/POD5/Real/basecalling/fastq/DORADO_barcode2.fastq -g /lustre/home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/Reference/Reference/gencode.v43.transcripts.fa -o Barcode2_eventalign_mapq3.tsv --kmer-model /home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/f5c/k_mer_model_RNA004/rna004.nucleotide.5mer.model --slow5 /home/rnagenomics/sm/Nanopore/20240923_Histone_Direct_RNA_seq/analysis/bluecrab/barcode2.blow5 --signal-index --scale-events --min-mapq 3

[image attached in the original issue]

hasindu2008 commented 3 weeks ago
  1. Are you using FAST5 data? If so, did you convert the POD5 to FAST5? I remember another user having a similar issue caused by a problem in nanopore's POD5->FAST5 conversion. Can you use https://github.com/Psy-Fer/blue-crab to convert POD5->BLOW5 directly? Then you can run f5c index and f5c eventalign on this BLOW5 file by passing --slow5 /path/to/file.blow5 to both commands.

  2. Can you send the full log messages as a text file? I want to see if there are any warnings that may help debugging this issue.
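The workflow in point 1 can be sketched as follows. This is only an outline under assumptions: the file names are hypothetical placeholders, the eventalign options are copied from the command earlier in this thread, and the `run` helper with its `DRY_RUN` switch is illustrative (it prints the commands by default and executes them when `DRY_RUN=0`).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file names; substitute your own paths.
POD5="barcode2.pod5"
BLOW5="barcode2.blow5"
FASTQ="DORADO_barcode2.fastq"
BAM="samtools_barcode2_sort.bam"
REF="gencode.v43.transcripts.fa"
MODEL="rna004.nucleotide.5mer.model"
OUT="Barcode2_eventalign.tsv"

# Illustrative helper: print each command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. Convert POD5 directly to BLOW5 with blue-crab (no FAST5 intermediate).
run blue-crab p2s "$POD5" -o "$BLOW5"

# 2. Index the FASTQ against the BLOW5 file. The index must be built from
#    the same FASTQ and BLOW5 that are later passed to eventalign.
run f5c index --slow5 "$BLOW5" "$FASTQ"

# 3. Event alignment, pointing --slow5 at the same BLOW5 file.
run f5c eventalign --rna -b "$BAM" -r "$FASTQ" -g "$REF" \
  --slow5 "$BLOW5" --kmer-model "$MODEL" \
  --signal-index --scale-events --min-mapq 3 -o "$OUT"
```

Re-running step 2 whenever the FASTQ or BLOW5 file changes keeps the index consistent with the inputs given to eventalign.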

Seongmin-Jang-1165 commented 3 weeks ago

@hasindu2008

  1. I converted the POD5 to BLOW5 using blue-crab.
  2. This is the log from my job. I ran multiple f5c eventalign commands. slurm-16666.txt

Thanks for your reply!

Seongmin-Jang-1165 commented 3 weeks ago

@hasindu2008

Hi, I figured out that the index file had been created incorrectly. m6Anet is working well now. Sorry to bother you when you're busy.

hasindu2008 commented 3 weeks ago

No problem. Happy to help. Great you found the issue.