bioinfomaticsCSU / deepsignal

Detecting methylation using signal-level features from Nanopore sequencing reads

[Bug] DataLoss error while performing methylation calling using GATC model #70

Status: Closed (atger closed this issue 3 years ago)

atger commented 3 years ago

DataLossError (see above for traceback): Unable to open table file /mnt/storage3/anand/SO_9407/methylation_calling/models/model.GATC.R9_2D.tem.puc19.bn17.sn360: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator? [[node save/RestoreV2 (defined at /mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py:216) ]]

2021-04-13 17:02:48.652716: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /mnt/storage3/anand/SO_9407/methylation_calling/models/model.GATC.R9_2D.tem.puc19.bn17.sn360: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator?
2021-04-13 17:02:48.658680: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /mnt/storage3/anand/SO_9407/methylation_calling/models/model.GATC.R9_2D.tem.puc19.bn17.sn360: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator?
2021-04-13 17:02:48.658806: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_tensor.cc:175 : Data loss: Unable to open table file /mnt/storage3/anand/SO_9407/methylation_calling/models/model.GATC.R9_2D.tem.puc19.bn17.sn360: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator?

Process Process-1:
Traceback (most recent call last):
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /mnt/storage3/anand/SO_9407/methylation_calling/models/model.GATC.R9_2D.tem.puc19.bn17.sn360: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator? [[{{node save/RestoreV2}}]]

PengNi commented 3 years ago

The model file path should be /path/to/model.GATC.R9_2D.tem.puc19.bn17.sn360/bn_17.sn_360.epoch_5.ckpt.
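As a rough sketch of why the directory path fails: TensorFlow restores from a checkpoint prefix, not from a folder. Assuming the model was saved in the V2 checkpoint format (prefix plus `.index` and `.data-*` files, which is an assumption about this particular model), a quick check before running could look like the following. The function name `looks_like_checkpoint_prefix` is hypothetical and not part of deepsignal.

```python
import glob
import os


def looks_like_checkpoint_prefix(model_path):
    """Return True if model_path looks like a TensorFlow V2 checkpoint prefix.

    Assumption: the checkpoint is stored as <prefix>.index plus one or more
    <prefix>.data-* files. A directory is never a valid prefix.
    """
    if os.path.isdir(model_path):
        return False
    has_index = os.path.isfile(model_path + ".index")
    # glob.escape protects any special characters in the path itself.
    has_data = bool(glob.glob(glob.escape(model_path) + ".data-*"))
    return has_index and has_data
```

Calling it with the model folder should return False, while calling it with the folder plus bn_17.sn_360.epoch_5.ckpt should return True if the files are laid out as assumed.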

Best, Peng

atger commented 3 years ago

I gave the path as suggested, but I am still getting an error:

Traceback (most recent call last):
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py", line 217, in _fast5s_q_to_pred_str_q
    saver.restore(sess, model_path)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1268, in restore

PengNi commented 3 years ago

I'm not sure whether the trailing dot was included in your path, but it should NOT be. The path should be /path/to/model.GATC.R9_2D.tem.puc19.bn17.sn360/bn_17.sn_360.epoch_5.ckpt. Also, --motifs should be set to GATC, and --mod_loc should be set to 1.

Best, Peng

atger commented 3 years ago

The run failed again with this error:

Traceback (most recent call last):
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py", line 217, in _fast5s_q_to_pred_str_q
    saver.restore(sess, model_path)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1292, in restore
    err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint [[node save/RestoreV2 (defined at /mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py:216) ]]

Caused by op 'save/RestoreV2', defined at:
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/bin/deepsignal", line 33, in <module>
    sys.exit(load_entry_point('deepsignal==0.1.8', 'console_scripts', 'deepsignal')())
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/deepsignal.py", line 423, in main
    args.func(args)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/deepsignal.py", line 87, in main_call_mods
    f5_args)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py", line 407, in call_mods
    is_rnn, is_base, is_cnn)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py", line 286, in _call_mods_from_fast5s_cpu
    p.start()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/popen_fork.py", line 73, in _launch
    code = process_obj._bootstrap()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py", line 216, in _fast5s_q_to_pred_str_q
    saver = tf.train.Saver()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
    restore_sequentially)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
    name=name)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint [[node save/RestoreV2 (defined at /mnt/storage3/anand/programs/miniconda3/envs/deepsignalenv/lib/python3.6/site-packages/deepsignal-0.1.8-py3.6.egg/deepsignal/call_modifications.py:216) ]]
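A "Key ... not found in checkpoint" error means the graph built by the installed code asks for a variable name that the checkpoint file does not contain. One way to see the mismatch is to list the names stored in the checkpoint and compare them with the names the graph expects. The sketch below is only illustrative: `missing_keys` and `checkpoint_keys` are hypothetical helper names, and `tf.train.list_variables` is assumed to be available in the installed TensorFlow 1.x release.

```python
def missing_keys(expected, available):
    """Return the variable names the graph expects but the checkpoint lacks."""
    return sorted(set(expected) - set(available))


def checkpoint_keys(model_path):
    """List variable names stored in a checkpoint.

    TensorFlow is imported lazily so the pure helper above can be used
    without it. Assumes tf.train.list_variables exists in this TF version.
    """
    import tensorflow as tf

    return [name for name, _shape in tf.train.list_variables(model_path)]


if __name__ == "__main__":
    # Name copied from the error message above; the "available" list is made up.
    expected = ["modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias"]
    print(missing_keys(expected, available=["some/other/variable"]))
```

If the expected name shows up in the output, the checkpoint was most likely produced by code that named its variables differently.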

PengNi commented 3 years ago

Sorry, I did not notice the version earlier. As the README says, the GATC model is only suited to deepsignal<=0.1.6, so reinstalling deepsignal at a lower version may work.
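A minimal sketch of that version rule, for anyone scripting the check: the helper names `parse_version` and `supports_gatc_model` are hypothetical, and the comparison assumes plain dotted numeric versions such as 0.1.6 or 0.1.8.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '0.1.6' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_gatc_model(deepsignal_version, max_supported="0.1.6"):
    """Return True if this deepsignal version can load the GATC model.

    The 0.1.6 limit comes from the README statement quoted in this thread.
    """
    return parse_version(deepsignal_version) <= parse_version(max_supported)


if __name__ == "__main__":
    for version in ("0.1.6", "0.1.8"):
        print(version, supports_gatc_model(version))
```

Comparing tuples of integers avoids the string-comparison trap where "0.1.10" would sort before "0.1.6".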

Best, Peng

atger commented 3 years ago

The older version fails with the following error:

Illegal instruction (core dumped)
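One frequent cause of "Illegal instruction" with prebuilt TensorFlow wheels (1.6 and later) is a CPU that lacks AVX instructions. Whether that applies here is not confirmed in this thread, so the following is only a diagnostic sketch for Linux x86 machines. The helper names `cpu_flags` and `has_avx` are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Extract the CPU feature flags from the text of /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line format: "flags<TAB><TAB>: fpu vme ... avx avx2 ..."
            return set(line.split(":", 1)[1].split())
    return set()


def has_avx(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the first CPU listed advertises AVX (Linux x86 only)."""
    with open(cpuinfo_path) as handle:
        return "avx" in cpu_flags(handle.read())
```

If `has_avx()` returns False, a TensorFlow build compiled without AVX would be worth trying before looking elsewhere.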

atger commented 3 years ago

Is it possible to provide a GATC model for the latest version?

PengNi commented 3 years ago

@atger, thanks very much for your interest in deepsignal. Unfortunately, we cannot re-train the GATC model, because we have lost the signalAlign data that was used for training.

Also, I could not reproduce the 'Illegal instruction (core dumped)' error in my own test. Here's the environment I used for deepsignal==0.1.6:

Package         Version   
--------------- ----------
absl-py         0.6.1     
astor           0.7.1     
bleach          1.5.0     
certifi         2018.11.29
cycler          0.10.0    
Cython          0.29      
deepsignal      0.1.6     
future          0.17.1    
gast            0.2.0     
grpcio          1.16.1    
h5py            2.8.0     
html5lib        0.9999999 
Jinja2          2.10      
kiwisolver      1.0.1     
mappy           2.14      
Markdown        3.0.1     
MarkupSafe      1.1.0     
matplotlib      3.0.2     
mkl-fft         1.0.6     
mkl-random      1.0.2     
numpy           1.15.4    
ont-tombo       1.5       
pandas          0.23.4    
patsy           0.5.1     
pip             18.1      
protobuf        3.6.1     
pyfaidx         0.5.5.2   
pyparsing       2.3.1     
python-dateutil 2.7.5     
pytz            2018.7    
rpy2            2.9.1     
scikit-learn    0.20.1    
scipy           1.1.0     
setuptools      40.6.2    
six             1.12.0    
statsmodels     0.9.0     
tensorboard     1.8.0     
tensorflow      1.8.0     
termcolor       1.1.0     
tqdm            4.7.2     
Werkzeug        0.14.1    
wheel           0.32.3 

And the command is:

deepsignal call_mods --input_path BJXWZ_SUP.07C123.albacore/workspace/pass/30 --model_path model.GATC.R9_2D.tem.puc19.bn17.sn360/bn_17.sn_360.epoch_5.ckpt --result_file test.GATC.call_mods.txt --reference_path /homeb/nipeng/data/genome/human/GRCh38.primary_assembly.genome.fa --motifs GATC --mod_loc 1 --nproc 40 --corrected_group RawGenomeCorrected_001

Hope that can help you!

Best, Peng

edwwlui commented 3 years ago

Hi Peng,

Thank you so much for the great work and support. Could you please provide a reference environment for running the plasmid model on a GPU? I tried different versions of TensorFlow and deepsignal but could not get the GPU to be used.

PengNi commented 3 years ago

@edwwlui, here's the environment (Python==3.6.7) for reference:

Package         Version   
--------------- ----------
deepsignal      0.1.6     
h5py            2.8.0     
numpy           1.15.4    
ont-tombo       1.5       
pandas          0.23.4    
pip             18.1      
scikit-learn    0.20.1    
scipy           1.1.0     
statsmodels     0.9.0     
tensorboard     1.8.0     
tensorflow      1.8.0     

To use a GPU, run pip install tensorflow-gpu==1.8.0 before installing deepsignal. The command to run deepsignal should then look like this:

CUDA_VISIBLE_DEVICES=0 deepsignal call_mods --input_path BJXWZ_SUP.07C123.albacore/workspace/pass/30 --model_path model.GATC.R9_2D.tem.puc19.bn17.sn360/bn_17.sn_360.epoch_5.ckpt --result_file test.GATC.call_mods.txt --reference_path /homeb/nipeng/data/genome/human/GRCh38.primary_assembly.genome.fa --motifs GATC --mod_loc 1 --nproc 40 --corrected_group RawGenomeCorrected_001 --is_gpu yes
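For anyone launching this from a script, the same command can be assembled in Python. This is only a sketch: the function names `build_call_mods_command` and `run_on_gpu` are hypothetical, while the flags and default values are copied from the command above.

```python
import os
import subprocess


def build_call_mods_command(input_path, model_path, result_file, reference_path,
                            motifs="GATC", mod_loc=1, nproc=40,
                            corrected_group="RawGenomeCorrected_001",
                            use_gpu=True):
    """Assemble the deepsignal call_mods argument list shown in this thread."""
    cmd = [
        "deepsignal", "call_mods",
        "--input_path", input_path,
        "--model_path", model_path,
        "--result_file", result_file,
        "--reference_path", reference_path,
        "--motifs", motifs,
        "--mod_loc", str(mod_loc),
        "--nproc", str(nproc),
        "--corrected_group", corrected_group,
    ]
    if use_gpu:
        cmd += ["--is_gpu", "yes"]
    return cmd


def run_on_gpu(cmd, device="0"):
    """Run the command with CUDA_VISIBLE_DEVICES restricted to one device."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=device)
    return subprocess.run(cmd, env=env, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.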

Hope this can help!

Best, Peng

edwwlui commented 3 years ago

Appreciate it, it works.