weir12 / DENA

Deep learning model used to detect RNA m6a with read level based on the Nanopore direct RNA data.
MIT License

LSTM_extract.py predict 0 reads in bamfile #15

Status: Closed (closed by MaestSi 1 year ago)

MaestSi commented 1 year ago

Dear developers,

I'm running DENA on a dataset and I am not able to get any results. In particular, I noticed that this command:

python3 /DENA/step4_predict/LSTM_extract.py predict --fast5 FAST5_dir --corr_grp RawGenomeCorrected_000 --bam transcriptome.bam --sites candidate_predict_pos.txt --label "dena_label" --windows 2 2 --processes 1 --debug

produces only empty *_tmp files. From the standard error, the tool is able to find reads in the fast5 files but not in the BAM file, e.g.:

transcript1 pos1-pos2found x reads in fast5
transcript1 pos1-pos2found 0 reads in bamfile

Do you know what may have caused the issue? Thanks in advance, Simone

weir12 commented 1 year ago

Hi, I can't figure out exactly what is going on without the actual data. In my opinion, the anomaly is more likely a data-level problem than a software bug.

Would you mind doing the following checks?
1. Inspect the mapping results in the BAM file, especially the coverage of the candidate sites.
2. Check whether the FASTA file used to produce the BAM file is the same one used for tombo resquiggle.
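The two checks above can be sketched in plain Python. This is only an illustration and not part of DENA: it assumes the alignments have been exported as SAM text (for example with `samtools view -h transcriptome.bam`) and that the reference FASTA has been read as text. The function names (`fasta_names`, `sam_header_names`, `reference_span`, `reads_covering`) and the sample records are hypothetical. A read is counted as covering a site when its aligned interval spans that position, including deletions and skipped regions.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")


def fasta_names(fasta_text):
    """Reference names: the first whitespace-delimited token of each '>' line."""
    return {line[1:].split()[0] for line in fasta_text.splitlines()
            if line.startswith(">") and line[1:].split()}


def sam_header_names(sam_text):
    """Reference names taken from the SN: field of @SQ header lines."""
    names = set()
    for line in sam_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def reference_span(pos, cigar):
    """1-based inclusive (start, end) reference interval of an alignment."""
    length = sum(int(n) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
                 if op in REF_CONSUMING)
    return pos, pos + length - 1


def reads_covering(sam_text, ref_name, site):
    """Count mapped reads whose alignment spans the 1-based position `site`."""
    count = 0
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        cols = line.split("\t")
        flag, rname, pos, cigar = int(cols[1]), cols[2], int(cols[3]), cols[5]
        if flag & 0x4 or rname != ref_name or cigar == "*":
            continue  # unmapped read, other reference, or no alignment
        start, end = reference_span(pos, cigar)
        if start <= site <= end:
            count += 1
    return count


# Made-up sample data, for illustration only.
fasta = ">transcript1 some description\nACGTACGTAC\n>transcript2\nGGGGCCCC\n"
sam = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:transcript1\tLN:10",
    "read_a\t0\ttranscript1\t2\t60\t5M\t*\t0\t0\tCGTAC\t*",
    "read_b\t0\ttranscript1\t1\t60\t2M2D3M\t*\t0\t0\tACCGT\t*",
    "read_c\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
])

# Check 2: names present in the FASTA but absent from the BAM/SAM header.
print("missing from header:", sorted(fasta_names(fasta) - sam_header_names(sam)))
# Check 1: how many reads span a candidate position.
print("reads spanning transcript1:4 =", reads_covering(sam, "transcript1", 4))
```

If names differ between the FASTA and the header, or a candidate site has no spanning reads, that would point to a data-level mismatch rather than a bug. Matching names alone do not prove the two FASTA files are identical.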

best wishes

MaestSi commented 1 year ago

Hi, I checked that the FASTA file used for the alignment is the same one used for tombo resquiggling, and the alignment seems OK. By the way, I was able to obtain results for 1 dataset out of 3 (from 3 different organisms) using the exact same code, only with different data and a different reference file. Best, Simone

weir12 commented 1 year ago

Hi,

If one of the three biological replicates is processed normally and its output is as expected, while the other two are abnormal, my guess is that the problem lies in the datasets themselves. If you can provide some information about the data (after anonymization), maybe I can help.

best wishes