PengNi / plant_5mC_analysis

Very long calculations without results #1

Closed MuradOmarov closed 3 years ago

MuradOmarov commented 3 years ago

Dear colleagues, I'm impressed by the results described in your article, and I would really like to use your pipeline on our data from Arabidopsis thaliana DNA ONT sequencing. I'm trying to run deepsignal-plant on fast5 files and fastq files base-called with guppy. I am running this on our server with 192 CPUs (no GPU) and 500 GB of RAM. I launched the pipeline with the fast5 files converted to single-fast5 as recommended, using the following commands:

tombo resquiggle Fast5s-flies/ REF_GENOMES/Genome_arabidopsis.fna --processes 10 --corrected-group RawGenomeCorrected_000 --basecall-group Basecall_1D_000 --overwrite

deepsignal_plant call_mods --input_path Fast5s-flies/ --model_path ./model.dp2.CG.arabnrice2-1_R9.4plus_tem.bn13_sn16.balance.both_bilstm.b13_s16_epoch6.ckpt --result_file fast5s.CG.call_mods.tsv --corrected_group RawGenomeCorrected_000 --reference_path REF_GENOMES/Genome_arabidopsis.fna --motifs CG --nproc 100

The process ran for several hours and then stalled. The last message in the nohup log was: "call_mods process-230634 ending, proceed 60 fast5s (16 batches)". I then waited 2 more days, but there were still no results. What am I doing wrong?

Thanks in advance!

PengNi commented 3 years ago

Hi @MuradOmarov , thanks very much for your interest in our work. Currently, deepsignal-plant and the pre-trained models only work on plant DNA ONT data. RNA ONT data are not supported by deepsignal-plant yet.

Best, Peng

MuradOmarov commented 3 years ago

I'm very sorry for my mistake. These were DNA reads, not RNA. I have edited the issue.

PengNi commented 3 years ago

It may be a subprocess communication issue. I will do more tests to see if I can reproduce it. Also, could you print the complete nohup log so I can get more information?

MuradOmarov commented 3 years ago

Ok! Thank you!

# ===============================================
## parameters: 
input_path:
    Fast5s-flies/
model_path:
    model.dp2.CG.arabnrice2-1_R9.4plus_tem.bn13_sn16.balance.both_bilstm.b13_s16_epoch6.ckpt
model_type:
    both_bilstm
seq_len:
    13
signal_len:
    16
layernum1:
    3
layernum2:
    1
class_num:
    2
dropout_rate:
    0
n_vocab:
    16
n_embed:
    4
is_base:
    yes
is_signallen:
    yes
batch_size:
    512
hid_rnn:
    256
result_file:
    fast5s.CG.call_mods.tsv
recursively:
    yes
corrected_group:
    RawGenomeCorrected_000
basecall_subgroup:
    BaseCalled_template
reference_path:
    /home/ikirov/REF_GENOMES/GCF_000001735.4_TAIR10.1_genomic.fna
is_dna:
    yes
normalize_method:
    mad
methy_label:
    1
motifs:
    CG
mod_loc:
    0
f5_batch_size:
    20
positions:
    None
nproc:
    120
nproc_gpu:
    2
# ===============================================
[main]call_mods starts..
15777 fast5 files in total..
parse the motifs string..
read genome reference file..
read position file if it is not None..
call_mods process-235970 starts
call_mods process-235970 ending, proceed 0 fast5s (0 batches)
call_mods process-235086 starts
call_mods process-235086 ending, proceed 0 fast5s (0 batches)
call_mods process-232198 starts
call_mods process-232198 ending, proceed 0 fast5s (0 batches)
call_mods process-233639 starts
call_mods process-233639 ending, proceed 0 fast5s (0 batches)
call_mods process-233695 starts
call_mods process-233695 ending, proceed 0 fast5s (0 batches)
call_mods process-233569 starts
call_mods process-233569 ending, proceed 0 fast5s (0 batches)
call_mods process-234525 starts
call_mods process-234525 ending, proceed 0 fast5s (0 batches)
call_mods process-232921 starts
call_mods process-232921 ending, proceed 0 fast5s (0 batches)
call_mods process-233570 starts
call_mods process-233570 ending, proceed 0 fast5s (0 batches)
call_mods process-234589 starts
call_mods process-234589 ending, proceed 0 fast5s (0 batches)
call_mods process-235150 starts
call_mods process-235150 ending, proceed 0 fast5s (0 batches)
call_mods process-234518 starts
call_mods process-234518 ending, proceed 0 fast5s (0 batches)
call_mods process-234346 starts
call_mods process-234346 ending, proceed 0 fast5s (0 batches)
call_mods process-233737 starts
call_mods process-233737 ending, proceed 0 fast5s (0 batches)
call_mods process-231987 starts
call_mods process-231987 ending, proceed 0 fast5s (0 batches)
call_mods process-232937 starts
call_mods process-232937 ending, proceed 0 fast5s (0 batches)
call_mods process-232195 starts
call_mods process-232195 ending, proceed 0 fast5s (0 batches)
call_mods process-234230 starts
call_mods process-234230 ending, proceed 0 fast5s (0 batches)
call_mods process-232718 starts
call_mods process-232718 ending, proceed 0 fast5s (0 batches)
call_mods process-233804 starts
call_mods process-233804 ending, proceed 0 fast5s (0 batches)
call_mods process-235702 starts
call_mods process-235702 ending, proceed 0 fast5s (0 batches)
call_mods process-235834 starts
call_mods process-235834 ending, proceed 0 fast5s (0 batches)
call_mods process-235972 starts
call_mods process-235972 ending, proceed 0 fast5s (0 batches)
call_mods process-231215 starts
call_mods process-231215 ending, proceed 0 fast5s (0 batches)
call_mods process-233640 starts
call_mods process-233640 ending, proceed 0 fast5s (0 batches)
call_mods process-230630 starts
call_mods process-230630 ending, proceed 0 fast5s (0 batches)
call_mods process-234049 starts
call_mods process-234049 ending, proceed 0 fast5s (0 batches)
call_mods process-235837 starts
call_mods process-235837 ending, proceed 0 fast5s (0 batches)
call_mods process-235045 starts
call_mods process-235045 ending, proceed 0 fast5s (0 batches)
call_mods process-232648 starts
call_mods process-232648 ending, proceed 0 fast5s (0 batches)
call_mods process-233847 starts
call_mods process-233847 ending, proceed 0 fast5s (0 batches)
call_mods process-231985 starts
call_mods process-231985 ending, proceed 0 fast5s (0 batches)
call_mods process-235440 starts
call_mods process-235440 ending, proceed 0 fast5s (0 batches)
call_mods process-235042 starts
call_mods process-235042 ending, proceed 0 fast5s (0 batches)
call_mods process-238707 starts
call_mods process-238707 ending, proceed 0 fast5s (0 batches)
call_mods process-236220 starts
call_mods process-236220 ending, proceed 0 fast5s (0 batches)
call_mods process-233663 starts
call_mods process-233663 ending, proceed 0 fast5s (0 batches)
call_mods process-232068 starts
call_mods process-232068 ending, proceed 0 fast5s (0 batches)
call_mods process-238041 starts
call_mods process-238041 ending, proceed 0 fast5s (0 batches)
call_mods process-236752 starts
call_mods process-236752 ending, proceed 0 fast5s (0 batches)
call_mods process-238443 starts
call_mods process-238443 ending, proceed 0 fast5s (0 batches)
call_mods process-239209 starts
call_mods process-239209 ending, proceed 0 fast5s (0 batches)
call_mods process-235990 starts
call_mods process-235990 ending, proceed 0 fast5s (0 batches)
call_mods process-236605 starts
call_mods process-236605 ending, proceed 0 fast5s (0 batches)
call_mods process-237462 starts
call_mods process-237462 ending, proceed 0 fast5s (0 batches)
call_mods process-238621 starts
call_mods process-238621 ending, proceed 0 fast5s (0 batches)
call_mods process-230830 starts
call_mods process-230830 ending, proceed 20 fast5s (2 batches)
call_mods process-230705 starts
call_mods process-230705 ending, proceed 20 fast5s (3 batches)
call_mods process-234041 starts
call_mods process-234041 ending, proceed 20 fast5s (4 batches)
call_mods process-237895 starts
call_mods process-237895 ending, proceed 0 fast5s (0 batches)
call_mods process-232197 starts
call_mods process-232197 ending, proceed 20 fast5s (4 batches)
call_mods process-230633 starts
call_mods process-230633 ending, proceed 20 fast5s (5 batches)
call_mods process-231191 starts
call_mods process-231191 ending, proceed 20 fast5s (5 batches)
call_mods process-230632 starts
call_mods process-230632 ending, proceed 20 fast5s (5 batches)
call_mods process-233318 starts
call_mods process-233318 ending, proceed 0 fast5s (0 batches)
call_mods process-239145 starts
call_mods process-239145 ending, proceed 20 fast5s (5 batches)
call_mods process-236561 starts
call_mods process-236561 ending, proceed 20 fast5s (4 batches)
call_mods process-238302 starts
call_mods process-238302 ending, proceed 0 fast5s (0 batches)
call_mods process-232715 starts
call_mods process-232715 ending, proceed 0 fast5s (0 batches)
call_mods process-237912 starts
call_mods process-237912 ending, proceed 0 fast5s (0 batches)
call_mods process-239082 starts
call_mods process-239082 ending, proceed 0 fast5s (0 batches)
call_mods process-237237 starts
call_mods process-237237 ending, proceed 0 fast5s (0 batches)
call_mods process-239359 starts
call_mods process-239359 ending, proceed 0 fast5s (0 batches)
call_mods process-230637 starts
call_mods process-230637 ending, proceed 20 fast5s (3 batches)
call_mods process-237272 starts
call_mods process-237272 ending, proceed 0 fast5s (0 batches)
call_mods process-233636 starts
call_mods process-233636 ending, proceed 20 fast5s (4 batches)
call_mods process-237718 starts
call_mods process-237718 ending, proceed 0 fast5s (0 batches)
call_mods process-236383 starts
call_mods process-236383 ending, proceed 20 fast5s (2 batches)
call_mods process-235846 starts
call_mods process-235846 ending, proceed 0 fast5s (0 batches)
call_mods process-230631 starts
call_mods process-230631 ending, proceed 20 fast5s (5 batches)
call_mods process-234317 starts
call_mods process-234317 ending, proceed 20 fast5s (5 batches)
call_mods process-231153 starts
call_mods process-231153 ending, proceed 20 fast5s (5 batches)
call_mods process-239416 starts
call_mods process-239416 ending, proceed 0 fast5s (0 batches)
call_mods process-238888 starts
call_mods process-238888 ending, proceed 0 fast5s (0 batches)
call_mods process-231980 starts
call_mods process-231980 ending, proceed 20 fast5s (6 batches)
call_mods process-231152 starts
call_mods process-231152 ending, proceed 20 fast5s (6 batches)
call_mods process-230831 starts
call_mods process-230831 ending, proceed 20 fast5s (6 batches)
call_mods process-235769 starts
call_mods process-235769 ending, proceed 20 fast5s (6 batches)
call_mods process-231995 starts
call_mods process-231995 ending, proceed 20 fast5s (6 batches)
call_mods process-232646 starts
call_mods process-232646 ending, proceed 20 fast5s (6 batches)
call_mods process-232631 starts
call_mods process-232631 ending, proceed 20 fast5s (6 batches)
call_mods process-234410 starts
call_mods process-234410 ending, proceed 20 fast5s (6 batches)
call_mods process-232193 starts
call_mods process-232193 ending, proceed 20 fast5s (5 batches)
call_mods process-236785 starts
call_mods process-236785 ending, proceed 0 fast5s (0 batches)
call_mods process-235841 starts
call_mods process-235841 ending, proceed 20 fast5s (4 batches)
call_mods process-231665 starts
call_mods process-231665 ending, proceed 20 fast5s (5 batches)
call_mods process-231407 starts
call_mods process-231407 ending, proceed 0 fast5s (0 batches)
call_mods process-237843 starts
call_mods process-237843 ending, proceed 0 fast5s (0 batches)
call_mods process-231589 starts
call_mods process-231589 ending, proceed 20 fast5s (6 batches)
call_mods process-231853 starts
call_mods process-231853 ending, proceed 20 fast5s (7 batches)
call_mods process-235578 starts
call_mods process-235578 ending, proceed 20 fast5s (7 batches)
call_mods process-234779 starts
call_mods process-234779 ending, proceed 20 fast5s (7 batches)
call_mods process-230627 starts
call_mods process-230627 ending, proceed 20 fast5s (7 batches)
call_mods process-234911 starts
call_mods process-234911 ending, proceed 20 fast5s (7 batches)
call_mods process-233806 starts
call_mods process-233806 ending, proceed 20 fast5s (7 batches)
call_mods process-232325 starts
call_mods process-232325 ending, proceed 20 fast5s (7 batches)
call_mods process-230899 starts
call_mods process-230899 ending, proceed 20 fast5s (8 batches)
call_mods process-232927 starts
call_mods process-232927 ending, proceed 20 fast5s (8 batches)
call_mods process-230829 starts
call_mods process-230829 ending, proceed 20 fast5s (8 batches)
call_mods process-231991 starts
call_mods process-231991 ending, proceed 20 fast5s (8 batches)
call_mods process-238953 starts
call_mods process-238953 ending, proceed 20 fast5s (9 batches)
call_mods process-231406 starts
call_mods process-231406 ending, proceed 20 fast5s (9 batches)
call_mods process-235838 starts
call_mods process-235838 ending, proceed 20 fast5s (9 batches)
call_mods process-230626 starts
call_mods process-230626 ending, proceed 20 fast5s (8 batches)
call_mods process-231957 starts
call_mods process-231957 ending, proceed 20 fast5s (9 batches)
call_mods process-232609 starts
call_mods process-232609 ending, proceed 20 fast5s (8 batches)
call_mods process-230628 starts
call_mods process-230628 ending, proceed 20 fast5s (9 batches)
call_mods process-230635 starts
call_mods process-230635 ending, proceed 20 fast5s (11 batches)
call_mods process-232358 starts
call_mods process-232358 ending, proceed 40 fast5s (13 batches)
call_mods process-230801 starts
call_mods process-230801 ending, proceed 40 fast5s (13 batches)
call_mods process-230629 starts
call_mods process-230629 ending, proceed 40 fast5s (15 batches)
call_mods process-237003 starts
call_mods process-237003 ending, proceed 40 fast5s (15 batches)
call_mods process-232585 starts
call_mods process-232585 ending, proceed 40 fast5s (16 batches)
call_mods process-230636 starts
call_mods process-230636 ending, proceed 40 fast5s (18 batches)
call_mods process-231023 starts
call_mods process-231023 ending, proceed 40 fast5s (20 batches)
call_mods process-232584 starts
call_mods process-232584 ending, proceed 40 fast5s (19 batches)
call_mods process-234345 starts
call_mods process-234345 ending, proceed 40 fast5s (19 batches)
call_mods process-230634 starts
call_mods process-230634 ending, proceed 60 fast5s (16 batches)

MuradOmarov commented 3 years ago

After the last line appeared in the nohup log, I have seen only 'deepsignal_plant' running in 'top' for three days.

PengNi commented 3 years ago

@MuradOmarov , sorry that I couldn't reproduce the issue. deepsignal-plant works well on my server with or without a GPU (it is much slower without one). What is odd in your log is that the subprocesses in deepsignal-plant do not seem to have started in parallel.
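
One way to inspect a log like the one above is to tally how many worker processes ended and how many fast5 files they reported. The sketch below is only an illustration and is not part of deepsignal-plant: `summarize_log` and `SAMPLE_LOG` are hypothetical names, and the regular expressions assume the exact line wording seen in the log posted in this thread.

```python
import re

# Patterns assumed from the log lines posted in this thread (not an official format).
TOTAL_RE = re.compile(r"(\d+) fast5 files in total")
END_RE = re.compile(
    r"call_mods process-(\d+) ending, proceed (\d+) fast5s \((\d+) batches\)"
)

# A few lines copied from the log above, used only as sample input.
SAMPLE_LOG = """\
15777 fast5 files in total..
call_mods process-235970 starts
call_mods process-235970 ending, proceed 0 fast5s (0 batches)
call_mods process-230830 starts
call_mods process-230830 ending, proceed 20 fast5s (2 batches)
call_mods process-230634 starts
call_mods process-230634 ending, proceed 60 fast5s (16 batches)
"""


def summarize_log(text):
    """Tally worker 'ending' lines and compare them with the announced total."""
    total = None
    per_worker = {}  # process id -> number of fast5 files it reported
    for line in text.splitlines():
        total_match = TOTAL_RE.search(line)
        if total_match:
            total = int(total_match.group(1))
            continue
        end_match = END_RE.search(line)
        if end_match:
            per_worker[int(end_match.group(1))] = int(end_match.group(2))
    return {
        "total_fast5s": total,
        "workers_ended": len(per_worker),
        "idle_workers": sum(1 for n in per_worker.values() if n == 0),
        "fast5s_processed": sum(per_worker.values()),
    }


if __name__ == "__main__":
    print(summarize_log(SAMPLE_LOG))
```

If the number of fast5 files reported by the workers is far below the announced total while the main process keeps running, that matches the stalled behaviour described in this thread.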

PengNi commented 3 years ago

@MuradOmarov , very sorry: the issue was caused by the version of deepsignal-plant on GitHub (and PyPI) being out of sync with the one on our server. I've updated it on GitHub. Please uninstall the current version and then install the latest version of deepsignal-plant in your virtual environment:

pip uninstall deepsignal-plant
git clone https://github.com/PengNi/deepsignal-plant.git
cd deepsignal-plant
python setup.py install

Then I think there will be no problem.

Best, Peng

MuradOmarov commented 3 years ago

@PengNi , thank you very much! I've just restarted the process. I'll report the result soon.

PengNi commented 3 years ago

@MuradOmarov , you can check the latest version of deepsignal-plant (commit 373398). In my test without a GPU, the latest version is much (5x) faster than the previous version.

Best, Peng

MuradOmarov commented 3 years ago

Cool! It seems to be finished. And I have:

call_mods process-244925 starts
call_mods process-244925 ending, proceed 2812 batches
call_mods process-244926 starts
call_mods process-244926 ending, proceed 2500 batches
write_process-244931 starts
write_process-244931 finished
5545 of 15777 fast5 files failed..
[main]call_mods costs 3221.09 seconds..

Is this normal? Is fast5s.CG.call_mods.tsv the only file I need? Thank you!

PengNi commented 3 years ago

Hi @MuradOmarov , it looks normal, except that there are somewhat more failed reads than expected, possibly due to low read quality.
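
To put a number on the failed reads, the summary line from the run above can be turned into a failure fraction. This is a minimal sketch, not something deepsignal-plant provides: `failure_stats` is a hypothetical helper, and the pattern assumes the wording of the summary line as it appears in this thread.

```python
import re

# Wording assumed from the summary line in this thread:
# "5545 of 15777 fast5 files failed.."
FAILED_RE = re.compile(r"(\d+) of (\d+) fast5 files failed")


def failure_stats(log_text):
    """Return failed/succeeded counts and the failed fraction, or None if absent."""
    match = FAILED_RE.search(log_text)
    if match is None:
        return None
    failed, total = int(match.group(1)), int(match.group(2))
    return {
        "failed": failed,
        "succeeded": total - failed,
        "failed_fraction": failed / total,
    }


if __name__ == "__main__":
    stats = failure_stats("5545 of 15777 fast5 files failed..")
    print(stats["succeeded"])                 # 10232
    print(f"{stats['failed_fraction']:.1%}")  # 35.1%
```

For the run reported here, roughly a third of the fast5 files did not yield calls.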

And yes, fast5s.CG.call_mods.tsv is the only file needed to call the methylation frequency of cytosines. In our tests, at least 20x read coverage is needed to get a stable result.
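
As an illustration of what a per-site methylation frequency with a coverage cutoff means, the sketch below aggregates per-read calls by site and keeps only sites with enough reads. It is not the tool's own frequency-calling code. The column layout is an assumption (tab-separated: chrom, pos, strand, pos_in_strand, readname, read_strand, prob_0, prob_1, called_label, k_mer), and `site_frequencies` is a hypothetical name; check the deepsignal-plant documentation for the actual format before relying on it.

```python
import csv
import io
from collections import defaultdict

MIN_COVERAGE = 20  # the 20x coverage mentioned in this thread


def site_frequencies(tsv_file, min_coverage=MIN_COVERAGE):
    """Map (chrom, pos, strand) to the fraction of reads called methylated.

    Assumes column 0 = chrom, 1 = pos, 2 = strand and 8 = called_label (0 or 1).
    Sites covered by fewer than min_coverage reads are dropped.
    """
    counts = defaultdict(lambda: [0, 0])  # site -> [methylated calls, total calls]
    for row in csv.reader(tsv_file, delimiter="\t"):
        if not row:
            continue
        site = (row[0], int(row[1]), row[2])
        counts[site][0] += int(row[8])
        counts[site][1] += 1
    return {
        site: methylated / total
        for site, (methylated, total) in counts.items()
        if total >= min_coverage
    }


if __name__ == "__main__":
    # Tiny made-up example: three reads at one site, one read at another.
    rows = [
        ["chr1", "100", "+", "100", "read1", "t", "0.1", "0.9", "1", "AACGT"],
        ["chr1", "100", "+", "100", "read2", "t", "0.2", "0.8", "1", "AACGT"],
        ["chr1", "100", "+", "100", "read3", "t", "0.7", "0.3", "0", "AACGT"],
        ["chr1", "200", "-", "50", "read1", "t", "0.6", "0.4", "0", "TTCGA"],
    ]
    text = "\n".join("\t".join(r) for r in rows)
    print(site_frequencies(io.StringIO(text), min_coverage=2))
```

With the real file, one would pass an open handle to fast5s.CG.call_mods.tsv and leave `min_coverage` at 20.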

Best, Peng

MuradOmarov commented 3 years ago

Ok, I got it! Many thanks for the help! And thanks for your cool tool! Best wishes