weir12 / DENA

Deep learning model for detecting RNA m6A at read level from Nanopore direct RNA sequencing data.
MIT License

DENA Freeze during LSTM_extract.py predict #25

Closed BrendanBeahan closed 1 week ago

BrendanBeahan commented 1 month ago

Hello,

After running tombo re-squiggle and LSTM_extract.py get_pos, I'm now attempting to run LSTM_extract.py predict as follows:

python3 /DENA/step4_predict/LSTM_extract.py predict --fast5 /rhea/scratch/brussel/vo/000/bvo00030/vsc11010/results_mouse_debugging/WT --corr_grp RawGenomeCorrected_000 --bam minimap.filt.sort.1.bam --sites /rhea/scratch/brussel/vo/000/bvo00030/vsc11010/results_mouse_debugging/WT/dena/candidate_predict_pos.txt --label "dena_label" --windows 2 2 --processes 25

However, when I do so, the processes stop advancing after reaching a certain point. I do get partially completed tmp files, but usually less than 20%.

Additionally, I request 64 GB of memory on my school's HPC for this job, and the log also reports that the BRI module could not be loaded:

INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
Warning: The BRI module could not be loaded
[12:47:03] Parsing Tombo index file(s).
[12:47:04] Parsing Tombo index file(s).
[12:47:04] Parsing Tombo index file(s).
[12:47:04] Parsing Tombo index file(s).

I'm not quite sure how to proceed. If you have any ideas, I would greatly appreciate them.

Alternatively, would it be possible to split the BAM file into subsections, re-index each one, and then run the predict step in chunks while pointing to the same FAST5 directory?
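For reference, a minimal sketch of what that splitting could look like, assuming samtools is on the PATH and the input BAM is coordinate-sorted and indexed. The helper name build_split_commands, the output directory bam_chunks, and the contig names chr1/chr2 are placeholders of mine, not part of DENA, and I have not verified that DENA gives the same results on per-contig BAMs.

```python
from pathlib import Path


def build_split_commands(bam, contigs, out_dir):
    """Plan one samtools view + samtools index pair per contig.

    Returns a list of (chunk_path, [argv, argv]) tuples. Nothing is executed
    here, so the plan can be inspected before it is run.
    """
    out_dir = Path(out_dir)
    plan = []
    for contig in contigs:
        # e.g. bam_chunks/minimap.filt.sort.1.chr1.bam
        chunk = str(out_dir / f"{Path(bam).stem}.{contig}.bam")
        plan.append((chunk, [
            # Extract reads overlapping this contig (needs an indexed input BAM).
            ["samtools", "view", "-b", "-o", chunk, str(bam), contig],
            # Index the chunk so downstream tools can open it.
            ["samtools", "index", chunk],
        ]))
    return plan


if __name__ == "__main__":
    # Placeholder contig names; real ones can be listed with `samtools idxstats`.
    for chunk, commands in build_split_commands(
        "minimap.filt.sort.1.bam", ["chr1", "chr2"], "bam_chunks"
    ):
        for argv in commands:
            # Dry run: print only. To execute, create bam_chunks first and
            # pass each argv to subprocess.run(argv, check=True).
            print(" ".join(argv))
```

Each chunk could then be passed to LSTM_extract.py predict via --bam, one run at a time, with --fast5 and --sites unchanged.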

Best, Brendan

BrendanBeahan commented 1 month ago

Increasing the memory request from 64 GB to 120 GB and reducing --processes from 25 to 6 resolved the issue.
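For anyone hitting the same stall, a rough sketch of how one might size the process count against a memory allocation. It assumes memory use grows about linearly with the number of worker processes, which is a guess based on the repeated "Parsing Tombo index file(s)" lines and not something confirmed from the DENA code. The function name pick_process_count and all the numbers below are hypothetical; the per-process footprint would have to be measured on your own data, for example from the scheduler's job accounting after a single-process run.

```python
import math


def pick_process_count(total_gb, per_worker_gb, reserve_gb=4.0, cpu_limit=None):
    """Largest worker count that fits in the memory budget (never below 1).

    total_gb      -- memory granted to the job
    per_worker_gb -- measured peak memory of ONE worker (assumption: linear scaling)
    reserve_gb    -- headroom left for the parent process and the OS
    cpu_limit     -- optional cap, e.g. the number of cores allocated
    """
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    usable_gb = total_gb - reserve_gb
    workers = max(1, math.floor(usable_gb / per_worker_gb))
    if cpu_limit is not None:
        workers = min(workers, cpu_limit)
    return workers


if __name__ == "__main__":
    # Illustrative numbers only: 18 GB per worker is NOT a measured DENA figure.
    print(pick_process_count(total_gb=120, per_worker_gb=18, cpu_limit=25))
```

The result would be the value to pass to --processes.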