I am trying to run deepsignal on our HPC GPU, but I get this error:
# ===============================================
## parameters:
input_path:
/home/ls760/nanopore/us/scripts/test_area/ls760/methylation_pipeline/VWD1047/tmp
model_path:
/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/methylation_models/deepsignal_human/model.CpG.R9.4_1D.human_hx1.bn17.sn360/bn_17.sn_360.epoch_7.ckpt
is_cnn:
yes
is_rnn:
yes
is_base:
yes
kmer_len:
17
cent_signals_len:
360
batch_size:
512
learning_rate:
0.001
class_num:
2
result_file:
Nanopore_methylationanalysis.tsv_call_mods.tsv
recursively:
yes
corrected_group:
RawGenomeCorrected_000
basecall_subgroup:
BaseCalled_template
reference_path:
/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/scripts/test_area/ls760/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
is_dna:
yes
normalize_method:
mad
methy_label:
1
motifs:
CG
mod_loc:
0
f5_batch_num:
100
positions:
None
nproc:
10
is_gpu:
yes
# ===============================================
898913 fast5 files in total..
parse the motifs string..
read genome reference file..
read position file if it is not None..
/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:522: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
write_process started..
2020-03-03 14:44:05.613202: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-03-03 14:44:24.483271: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
Process Process-9:
Traceback (most recent call last):
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/site-packages/deepsignal-0.1.7-py3.6.egg/deepsignal/call_modifications.py", line 171, in _call_mods_q
saver.restore(sess, model_path)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1802, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
Caused by op 'save/RestoreV2', defined at:
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/bin/deepsignal", line 11, in <module>
load_entry_point('deepsignal==0.1.7', 'console_scripts', 'deepsignal')()
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/site-packages/deepsignal-0.1.7-py3.6.egg/deepsignal/deepsignal.py", line 423, in main
args.func(args)
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/site-packages/deepsignal-0.1.7-py3.6.egg/deepsignal/deepsignal.py", line 87, in main_call_mods
f5_args)
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/site-packages/deepsignal-0.1.7-py3.6.egg/deepsignal/call_modifications.py", line 393, in call_mods
is_rnn, is_base, is_cnn)
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/site-packages/deepsignal-0.1.7-py3.6.egg/deepsignal/call_modifications.py", line 339, in _call_mods_from_fast5s_gpu
p_call_mods_gpu.start()
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/popen_fork.py", line 73, in _launch
code = process_obj._bootstrap()
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ls760/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/rds/project/who1000/rds-who1000-wgs10k/WGS10K/data/projects/nanopore/us/resources/envs/ont/deepsignalenv_gpu/lib/python3.6/site-packages/deepsignal-0.1.7-py3.6.egg/deepsignal/call_modifications.py", line 170, in _call_mods_q
saver = tf.train.Saver()
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1338, in __init__
self.build()
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1347, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1384, in _build
build_save=build_save, build_restore=build_restore)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 835, in _build_internal
restore_sequentially, reshape)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 472, in _AddRestoreOps
restore_sequentially)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 886, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/ls760/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
NotFoundError (see above for traceback): Key modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
Searching a bit on the internet, it looks like this is a model problem. Would you agree? Do you have any idea how to solve it?
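Thanks for your interest. It looks like you are using deepsignal v0.1.7, so the model model.CpG.R9.4_1D.human_hx1.bn17.sn360.v0.1.7+.tar.gz (Google Drive) should be used. The checkpoint in your model_path does not carry the v0.1.7+ suffix, so it was probably built for a different deepsignal version, which would explain the missing key.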
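For anyone hitting the same NotFoundError, one way to confirm a model/version mismatch is to list the variable names stored in the checkpoint and check whether the key from the error message is among them. This is only a minimal sketch, assuming TensorFlow 1.x (as in the traceback above); `checkpoint_keys` and `missing_keys` are hypothetical helper names and are not part of deepsignal.

```python
def missing_keys(expected, available):
    """Return the expected variable names that are absent from `available`."""
    available = set(available)
    return sorted(k for k in expected if k not in available)


def checkpoint_keys(ckpt_prefix):
    """List the variable names stored in a TensorFlow checkpoint.

    Assumes TensorFlow 1.x; under TensorFlow 2.x the reader lives at
    tf.compat.v1.train.NewCheckpointReader instead.
    TensorFlow is imported lazily so missing_keys() works without it.
    """
    import tensorflow as tf
    reader = tf.train.NewCheckpointReader(ckpt_prefix)
    return sorted(reader.get_variable_to_shape_map())


# Example usage (paths are placeholders, adjust to your setup):
# keys = checkpoint_keys("/path/to/bn_17.sn_360.epoch_7.ckpt")
# print(missing_keys(["modelem/bw/multi_rnn_cell/cell_0/lstm_cell/bias"], keys))
```

If the key from the error message shows up in the output of `missing_keys`, the graph that deepsignal builds and the checkpoint do not match, which points to the model file rather than the GPU setup.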
Hi PengNi,
Thanks, Luca