bioinfomaticsCSU / deepsignal

Detecting methylation using signal-level features from Nanopore sequencing reads
GNU General Public License v3.0

Error when using newly trained model #47

Closed pterzian closed 3 years ago

pterzian commented 3 years ago

Hi Peng,

I got the following error when calling methylation with my latest custom models:

2020-07-15 10:42:05.610543: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key BDGRU_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key BDGRU_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/deepsignal/call_modifications.py", line 168, in _call_mods_q
    saver.restore(sess, model_path)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1802, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key BDGRU_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op 'save/RestoreV2', defined at:
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/bin/deepsignal", line 11, in <module>
    sys.exit(main())
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/deepsignal/deepsignal.py", line 353, in main
    args.func(args)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/deepsignal/deepsignal.py", line 89, in main_call_mods
    batch_size, learning_rate, class_num, nproc, is_gpu, f5_args)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/deepsignal/call_modifications.py", line 409, in call_mods
    p.start()
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/popen_fork.py", line 74, in _launch
    code = process_obj._bootstrap()
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/tools/python/3.6.3/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/deepsignal/call_modifications.py", line 167, in _call_mods_q
    saver = tf.train.Saver()
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1338, in __init__
    self.build()
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1347, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1384, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 835, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 472, in _AddRestoreOps
    restore_sequentially)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 886, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/usr/local/bioinfo/src/DeepSignal/deepsignal-0.1.6/deepsignal-0.1.6_venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key BDGRU_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

The error is raised, but the jobs keep running in the background (apparently with no activity). This is the first time I have encountered this error.

Thanks for your support,

Paul

PengNi commented 3 years ago

Hi Paul,

Maybe the version of deepsignal you used for methylation calling is not the same as the version you used for training. Or the model parameters are not set appropriately.

Can you show me the commands you used for training and methylation calling? And the version of deepsignal in your virtual environment?

Best, Peng
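
One way to check for such a mismatch is to compare the variable names the calling graph expects with the names actually stored in the checkpoint. The sketch below is an illustration only and not part of deepsignal: `missing_keys` and `checkpoint_keys` are hypothetical helpers, the checkpoint-side name in the usage example is made up, and it assumes the installed TensorFlow provides `tf.train.list_variables`.

```python
def missing_keys(expected_names, checkpoint_names):
    """Return the names the graph expects that the checkpoint does not contain."""
    available = set(checkpoint_names)
    return sorted(name for name in expected_names if name not in available)


def checkpoint_keys(checkpoint_path):
    """List the variable names stored in a TensorFlow checkpoint.

    Assumption: TensorFlow is installed and provides tf.train.list_variables,
    which yields (name, shape) pairs.
    """
    import tensorflow as tf  # imported lazily so missing_keys works without TF

    return [name for name, _shape in tf.train.list_variables(checkpoint_path)]


if __name__ == "__main__":
    # Name taken from the error message in this thread.
    expected = ["BDGRU_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias"]
    # Hypothetical checkpoint content, made up for illustration only.
    stored = ["BDGRU_rnn/bw/multi_rnn_cell/cell_0/hypothetical_cell/bias"]
    print(missing_keys(expected, stored))
```

With a real model, `checkpoint_keys(model_path)` would supply the second argument; a non-empty result suggests that the checkpoint was written by a different model definition than the one trying to restore it.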

pterzian commented 3 years ago

Actually, I might be training on GPU with 0.1.7 and calling on CPU with 0.1.6. Very basic question, but is there a command to get the installed DeepSignal version?

Best, Paul

PengNi commented 3 years ago

There is no -v/--version option in deepsignal. However, you can use pip list or conda list to list the installed packages with their versions.
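
The same information can also be read from inside the environment's Python interpreter. This is a minimal sketch, assuming the package is registered under the distribution name `deepsignal`; `installed_version` is a hypothetical helper, not a deepsignal function. It uses the standard library's `importlib.metadata` (Python 3.8+) and falls back to setuptools' `pkg_resources` on older interpreters such as the Python 3.6 shown in the traceback.

```python
def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        from importlib import metadata  # available from Python 3.8
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters; requires setuptools to be installed.
    import pkg_resources

    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


if __name__ == "__main__":
    # "deepsignal" is assumed to be the name shown by pip list.
    print(installed_version("deepsignal"))
```

Running this in both the training and the calling environment and comparing the two outputs shows whether the versions differ.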

pterzian commented 3 years ago

Indeed, the issue was caused by differing DeepSignal versions. It works now. Best, Paul