GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License
103 stars, 19 forks

Error in m6anet-run_inference step #11

Closed Derryxu closed 2 years ago

Derryxu commented 3 years ago

Hi, thanks for releasing the new model for m6anet. I ran the nanopolish and m6anet-dataprep steps successfully, but I am encountering an error at the m6anet-run_inference step. Here are the command I used and the error I got. Could you give me some suggestions about this issue? Thank you very much!

image

These are the files in the m6anet-dataprep output directory:

data.index
data.json
data.log
data.readcount
eventalign.index

A few lines from each file follow.

data.index
transcript_id,transcript_position,start,end
cc6m_2244_T7_ecorv,1965,0,130
cc6m_2244_T7_ecorv,1983,130,236
cc6m_2244_T7_ecorv,2030,236,352
cc6m_2459_T7_ecorv,333,352,470
cc6m_2459_T7_ecorv,338,470,578
cc6m_2459_T7_ecorv,419,578,695
cc6m_2459_T7_ecorv,431,695,824
cc6m_2459_T7_ecorv,496,824,942
cc6m_2459_T7_ecorv,565,942,1048
cc6m_2459_T7_ecorv,589,1048,1168
cc6m_2459_T7_ecorv,641,1168,1284
cc6m_2459_T7_ecorv,647,1284,1406

data.json
{"cc6m_2244_T7_ecorv":{"1965":{"AGGACTT":[[0.0059211111,3.8680915033,119.6,0.00266,3.98,121.4,0.0066841026,1.6050512821,85.8]]}}}
{"cc6m_2244_T7_ecorv":{"1983":{"TGAACCG":[[0.00299,3.341,117.7,0.00465,7.216,90.4,0.00498,4.315,82.8]]}}}
{"cc6m_2244_T7_ecorv":{"2030":{"ATAACCA":[[0.003588125,1.84678125,84.1,0.00299,2.173,90.9,0.003386,2.1706,83.0]]}}}
{"cc6m_2459_T7_ecorv":{"333":{"GGGACTT":[[0.0078971429,1.9767142857,118.4,0.00531,5.515,119.4,0.00498,2.502,82.8]]}}}
{"cc6m_2459_T7_ecorv":{"338":{"TTAACAA":[[0.00664,3.418,89.2,0.00299,1.14,85.6,0.0056425,1.290125,83.1]]}}}
{"cc6m_2459_T7_ecorv":{"419":{"AAAACAT":[[0.01494,5.865,102.8,0.01627,4.07,101.1,0.0044143478,1.0262173913,87.8]]}}}
{"cc6m_2459_T7_ecorv":{"431":{"CTAACTT":[[0.0080698246,1.8048596491,91.4,0.0136877778,3.0567777778,102.1,0.00232,1.771,91.2]]}}}
{"cc6m_2459_T7_ecorv":{"496":{"CGGACCC":[[0.00232,1.516,117.8,0.0059936111,6.8690555556,110.8,0.00332,1.729,71.5]]}}}
{"cc6m_2459_T7_ecorv":{"565":{"TTGACAT":[[0.00664,1.587,103.3,0.01461,4.278,108.9,0.00266,2.246,81.8]]}}}
{"cc6m_2459_T7_ecorv":{"589":{"ATAACAA":[[0.0029082353,1.64,83.4,0.00232,1.157,93.0,0.0087265347,2.4270792079,86.7]]}}}
{"cc6m_2459_T7_ecorv":{"641":{"ATAACTC":[[0.0064280769,1.6935096154,85.6,0.00232,1.491,88.8,0.00299,1.384,85.5]]}}}
{"cc6m_2459_T7_ecorv":{"647":{"CAAACTT":[[0.0093738182,2.9810545455,104.1,0.00365,3.917,102.8,0.0049552,0.87816,91.3]]}}}

data.log
cc6m_2244_T7_ecorv: Data preparation ... Done.
cc6m_2459_T7_ecorv: Data preparation ... Done.

data.readcount
transcript_id,transcript_position,n_reads
cc6m_2244_T7_ecorv,1965,1
cc6m_2244_T7_ecorv,1983,1
cc6m_2244_T7_ecorv,2030,1
cc6m_2459_T7_ecorv,333,1
cc6m_2459_T7_ecorv,338,1
cc6m_2459_T7_ecorv,419,1
cc6m_2459_T7_ecorv,431,1
cc6m_2459_T7_ecorv,496,1
cc6m_2459_T7_ecorv,565,1
cc6m_2459_T7_ecorv,589,1
cc6m_2459_T7_ecorv,641,1
cc6m_2459_T7_ecorv,647,1

eventalign.index
transcript_id,read_index,pos_start,pos_end
cc6m_2244_T7_ecorv,0,172,84090
cc6m_2459_T7_ecorv,1,84090,190941
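
For anyone inspecting these files by hand: in the excerpt above, the start and end columns of data.index look like byte offsets into data.json (the first record is 129 characters plus a newline, which matches the range 0 to 130). The following is a minimal Python sketch under that assumption, not m6anet's own code; the helper names read_site and iter_sites are hypothetical.

```python
import csv
import json


def read_site(json_path, start, end):
    """Return the JSON record stored between two byte offsets (assumed layout)."""
    with open(json_path, "rb") as handle:
        handle.seek(start)
        # json.loads accepts bytes and ignores the trailing newline.
        return json.loads(handle.read(end - start))


def iter_sites(index_path, json_path):
    """Yield (transcript_id, position, record) for every row of data.index."""
    with open(index_path, newline="") as handle:
        for row in csv.DictReader(handle):
            record = read_site(json_path, int(row["start"]), int(row["end"]))
            yield row["transcript_id"], int(row["transcript_position"]), record
```

Calling iter_sites("data.index", "data.json") and printing each record is a quick way to confirm that the index and the JSON file are consistent with each other before running inference.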

Derryxu commented 3 years ago

Sorry to bother you again. I have found that my error stemmed from improper use of Minimap2. However, I have another problem, shown in the figure below, which seems to be a warning rather than an error. Will this affect the predictions? I look forward to your reply. Thank you very much!

image

chrishendra93 commented 3 years ago

Hi @Derryxu, sorry for the late reply.

This will not affect the predictions, so do not worry. I will try to incorporate some updates later that prevent this warning from being shown.

Regards

Christopher Hendra

Derryxu commented 3 years ago

Thanks for your reply! Sorry to bother you again. I have another problem, this time with the m6anet-dataprep step, and I do not know how to solve it. The issue is shown in the figure below. Could you give me some suggestions? Looking forward to your reply, thank you very much!

image

chrishendra93 commented 3 years ago

Hi @Derryxu, can you provide the first few lines of your eventalign.txt here?

Derryxu commented 3 years ago

Thank you very much for your reply! The first few lines of my eventalign.txt are as follows.

head eventalign.txt

contig  position        reference_kmer  read_index      strand  event_index     event_level_mean        event_stdv      event_length    model_kmer      model_mean      model_stdv      standardized_level      start_idx       end_idx
chr1    14407   TGCTC   43      t       3516    109.45  2.219   0.01428 GAGCA   107.01  3.02    0.69    7758    7801
chr1    14407   TGCTC   43      t       3515    107.23  1.042   0.00232 GAGCA   107.01  3.02    0.06    7801    7808
chr1    14407   TGCTC   43      t       3514    110.29  2.260   0.01959 GAGCA   107.01  3.02    0.92    7808    7867
chr1    14407   TGCTC   43      t       3513    109.21  1.325   0.00365 GAGCA   107.01  3.02    0.62    7867    7878
chr1    14407   TGCTC   43      t       3512    110.34  1.954   0.01328 GAGCA   107.01  3.02    0.94    7878    7918
chr1    14407   TGCTC   43      t       3511    107.30  2.207   0.00730 GAGCA   107.01  3.02    0.08    7918    7940
chr1    14407   TGCTC   43      t       3510    106.30  1.843   0.00232 GAGCA   107.01  3.02    -0.20   7940    7947
chr1    14407   TGCTC   43      t       3509    107.99  1.709   0.00498 GAGCA   107.01  3.02    0.28    7947    7962
chr1    14407   TGCTC   43      t       3508    107.34  2.370   0.02623 GAGCA   107.01  3.02    0.09    7962    8041

chrishendra93 commented 3 years ago

Hi @Derryxu, sorry for the late reply.

I noticed that your reference_kmer column differs from your model_kmer column. Is this the case throughout your file? If so, the preprocessing will likely end up with no rows after the filter, because m6anet only selects those reads that have model_kmer = reference_kmer.
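
The filter described here can be checked before running m6anet-dataprep. In the rows pasted above, the model_kmer GAGCA is the reverse complement of the reference_kmer TGCTC. The following is a minimal Python sketch, not m6anet's own code: the helper names are hypothetical, and it assumes a tab-separated eventalign.txt with the header shown above.

```python
import csv
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer."""
    return kmer.translate(_COMPLEMENT)[::-1]


def kmer_agreement(lines):
    """Count eventalign rows by how model_kmer relates to reference_kmer."""
    counts = Counter()
    for row in csv.DictReader(lines, delimiter="\t"):
        ref, model = row["reference_kmer"], row["model_kmer"]
        if model == ref:
            counts["match"] += 1
        elif model == reverse_complement(ref):
            counts["reverse_complement"] += 1
        else:
            counts["other"] += 1  # anything else
    return counts
```

Passing an open file handle, as in kmer_agreement(open("eventalign.txt")), returns the three counts. If "match" is zero or close to it, the equality filter described above would leave few or no rows.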

yefei521 commented 2 years ago

Hi @Derryxu, I encountered the same problem at the m6anet-run_inference step. Here are the commands I used and the error I got. Could you tell me how you solved this issue? Thank you very much! (The same commands worked without problems on another dataset.)

minimap2 -ax map-ont -uf -t 3 --secondary=no Genome_assembly.fasta nanopore_guppy_base.fastq
nanopolish eventalign --reads nanopore_guppy_base.fastq --bam guppy_minimap.bam --genome Genome_assembly.fasta --signal-index --scale-events --summary summary.txt --threads 24 > eventalign.txt
m6anet-dataprep --eventalign eventalign.txt --out_dir ./ --n_processes 24
m6anet-run_inference --input_dir ./ --out_dir ./ --n_processes 24

image