Closed by anupamjamatia 5 years ago
You are using CUDA 10.0, which means you are running a TensorFlow version != 1.2; tensorflow-gpu 1.2 is only compatible with CUDA 8.0.
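For what it's worth, the prebuilt `tensorflow-gpu` wheels each pin a specific CUDA version (to the best of my knowledge: 1.2 expects CUDA 8.0, 1.12 expects CUDA 9.0, and 1.13 was the first to support CUDA 10.0 — verify against the release notes for your exact version). A small sketch of that lookup, with the mapping itself being the assumption:

```python
# Approximate CUDA requirements of prebuilt tensorflow-gpu wheels.
# These pairings are assumptions from the release notes; check your
# exact release before relying on them.
TF_TO_CUDA = {
    "1.2": "8.0",
    "1.12": "9.0",
    "1.13": "10.0",
}

def required_cuda(tf_version: str) -> str:
    """Return the CUDA version a given tensorflow-gpu release expects."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return TF_TO_CUDA.get(major_minor, "unknown")

print(required_cuda("1.2.0"))   # tensorflow-gpu 1.2 -> CUDA 8.0
print(required_cuda("1.12.0"))  # tensorflow 1.12 -> CUDA 9.0
```

The practical point: a CUDA 10.0 toolkit, as shown by `nvcc --version` below, cannot serve tensorflow-gpu 1.2.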
So is it not possible to run the code on my system, which has the following configuration?
```
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
```
and
$ pip list
Package Version
absl-py 0.6.1
astor 0.7.1
backcall 0.1.0
backports.weakref 1.0rc1
bilm 0.1.post5
bleach 1.5.0
certifi 2018.11.29
cycler 0.10.0
decorator 4.3.0
entrypoints 0.2.3
enum34 1.1.6
gast 0.2.0
grpcio 1.17.1
h5py 2.9.0
html5lib 0.9999999
ipykernel 5.1.0
ipython 7.2.0
ipython-genutils 0.2.0
ipywidgets 7.4.2
jedi 0.13.2
Jinja2 2.10
jsonschema 2.6.0
jupyter 1.0.0
jupyter-client 5.2.4
jupyter-console 6.0.0
jupyter-core 4.4.0
Keras 2.2.4
Keras-Applications 1.0.7
keras-metrics 0.0.5
Keras-Preprocessing 1.0.5
kiwisolver 1.0.1
Markdown 2.2.0
MarkupSafe 1.1.0
matplotlib 3.0.2
mistune 0.8.4
mkl-fft 1.0.6
mkl-random 1.0.2
mock 2.0.0
nbconvert 5.3.1
nbformat 4.4.0
nltk 3.4
notebook 5.7.4
numpy 1.16.1
pandas 0.23.4
pandas-ml 0.5.0
pandocfilters 1.4.2
parso 0.3.1
pbr 5.1.1
pexpect 4.6.0
pickleshare 0.7.5
pip 19.0.1
prometheus-client 0.5.0
prompt-toolkit 2.0.7
protobuf 3.6.1
ptyprocess 0.6.0
pydot-ng 2.0.0
Pygments 2.3.1
pyparsing 2.3.1
Pyphen 0.9.5
python-dateutil 2.7.5
pytz 2018.7
PyYAML 3.13
pyzmq 17.1.2
qtconsole 4.4.3
scikit-learn 0.20.2
scipy 1.2.0
seaborn 0.9.0
Send2Trash 1.5.0
setuptools 40.6.3
singledispatch 3.4.0.3
six 1.12.0
sklearn 0.0
tensorboard 1.12.2
tensorflow 1.12.0
tensorflow-gpu 1.2.0
termcolor 1.1.0
terminado 0.8.1
testpath 0.4.2
textblob 0.15.2
Theano 1.0.3
tornado 5.1.1
traitlets 4.3.2
wcwidth 0.1.7
webencodings 0.5.1
Werkzeug 0.14.1
wheel 0.32.3
widgetsnbextension 3.4.2
You are using pip version 19.0.1, however version 19.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
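Note that the list above shows both `tensorflow 1.12.0` and `tensorflow-gpu 1.2.0` installed side by side; whichever distribution owns the `tensorflow` module on disk is the one `import tensorflow` actually loads. A hedged way to check which one wins, without importing TensorFlow itself:

```python
import importlib.util

# Locate the module that `import tensorflow` would load; the file path
# tells you which of the two installed distributions actually wins.
spec = importlib.util.find_spec("tensorflow")
if spec is None:
    print("tensorflow is not importable in this environment")
else:
    print("import tensorflow resolves to:", spec.origin)
```

Having both packages installed in one environment is itself worth fixing (uninstall one), since the bilm-tf README pins a specific TensorFlow version.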
Hi, I am facing a problem training ELMo on a new corpus. I ran the instructions given in https://github.com/allenai/bilm-tf on my system, which has a single GPU, and the process receives a kill signal. Any solution for this problem?
```
(ELMO) anupam@anupam-OMEN-HP:~/Desktop/ElMo/bilm$ python bin/train_elmo.py --train_prefix='/home/anupam/Desktop/ElMo/bilm/training_file/*' --vocab_file en-bn-hi_mixed_voc_file.txt --save_dir out/
Found 1 shards at /home/anupam/Desktop/ElMo/bilm/training_file/*
Loading data from: /home/anupam/Desktop/ElMo/bilm/training_file/Training_File.txt
Loaded 667648 sentences.
Finished loading
Found 1 shards at /home/anupam/Desktop/ElMo/bilm/training_file/*
Loading data from: /home/anupam/Desktop/ElMo/bilm/training_file/Training_File.txt
Loaded 667648 sentences.
Finished loading
USING SKIP CONNECTIONS
[['global_step:0', TensorShape([])],
 ['lm/CNN/W_cnn_0:0', TensorShape([Dimension(1), Dimension(1), Dimension(16), Dimension(32)])],
 ['lm/CNN/W_cnn_1:0', TensorShape([Dimension(1), Dimension(2), Dimension(16), Dimension(32)])],
 ['lm/CNN/W_cnn_2:0', TensorShape([Dimension(1), Dimension(3), Dimension(16), Dimension(64)])],
 ['lm/CNN/W_cnn_3:0', TensorShape([Dimension(1), Dimension(4), Dimension(16), Dimension(128)])],
 ['lm/CNN/W_cnn_4:0', TensorShape([Dimension(1), Dimension(5), Dimension(16), Dimension(256)])],
 ['lm/CNN/W_cnn_5:0', TensorShape([Dimension(1), Dimension(6), Dimension(16), Dimension(512)])],
 ['lm/CNN/W_cnn_6:0', TensorShape([Dimension(1), Dimension(7), Dimension(16), Dimension(1024)])],
 ['lm/CNN/b_cnn_0:0', TensorShape([Dimension(32)])],
 ['lm/CNN/b_cnn_1:0', TensorShape([Dimension(32)])],
 ['lm/CNN/b_cnn_2:0', TensorShape([Dimension(64)])],
 ['lm/CNN/b_cnn_3:0', TensorShape([Dimension(128)])],
 ['lm/CNN/b_cnn_4:0', TensorShape([Dimension(256)])],
 ['lm/CNN/b_cnn_5:0', TensorShape([Dimension(512)])],
 ['lm/CNN/b_cnn_6:0', TensorShape([Dimension(1024)])],
 ['lm/CNN_high_0/W_carry:0', TensorShape([Dimension(2048), Dimension(2048)])],
 ['lm/CNN_high_0/W_transform:0', TensorShape([Dimension(2048), Dimension(2048)])],
 ['lm/CNN_high_0/b_carry:0', TensorShape([Dimension(2048)])],
 ['lm/CNN_high_0/b_transform:0', TensorShape([Dimension(2048)])],
 ['lm/CNN_high_1/W_carry:0', TensorShape([Dimension(2048), Dimension(2048)])],
 ['lm/CNN_high_1/W_transform:0', TensorShape([Dimension(2048), Dimension(2048)])],
 ['lm/CNN_high_1/b_carry:0', TensorShape([Dimension(2048)])],
 ['lm/CNN_high_1/b_transform:0', TensorShape([Dimension(2048)])],
 ['lm/CNN_proj/W_proj:0', TensorShape([Dimension(2048), Dimension(512)])],
 ['lm/CNN_proj/b_proj:0', TensorShape([Dimension(512)])],
 ['lm/RNN_0/rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0', TensorShape([Dimension(16384)])],
 ['lm/RNN_0/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0', TensorShape([Dimension(1024), Dimension(16384)])],
 ['lm/RNN_0/rnn/multi_rnn_cell/cell_0/lstm_cell/projection/kernel:0', TensorShape([Dimension(4096), Dimension(512)])],
 ['lm/RNN_0/rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0', TensorShape([Dimension(16384)])],
 ['lm/RNN_0/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0', TensorShape([Dimension(1024), Dimension(16384)])],
 ['lm/RNN_0/rnn/multi_rnn_cell/cell_1/lstm_cell/projection/kernel:0', TensorShape([Dimension(4096), Dimension(512)])],
 ['lm/RNN_1/rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0', TensorShape([Dimension(16384)])],
 ['lm/RNN_1/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0', TensorShape([Dimension(1024), Dimension(16384)])],
 ['lm/RNN_1/rnn/multi_rnn_cell/cell_0/lstm_cell/projection/kernel:0', TensorShape([Dimension(4096), Dimension(512)])],
 ['lm/RNN_1/rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0', TensorShape([Dimension(16384)])],
 ['lm/RNN_1/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0', TensorShape([Dimension(1024), Dimension(16384)])],
 ['lm/RNN_1/rnn/multi_rnn_cell/cell_1/lstm_cell/projection/kernel:0', TensorShape([Dimension(4096), Dimension(512)])],
 ['lm/char_embed:0', TensorShape([Dimension(261), Dimension(16)])],
 ['lm/softmax/W:0', TensorShape([Dimension(1525043), Dimension(512)])],
 ['lm/softmax/b:0', TensorShape([Dimension(1525043)])],
 ['train_perplexity:0', TensorShape([])]]
WARNING:tensorflow:From /home/anupam/.local/lib/python3.5/site-packages/tensorflow/python/util/tf_should_use.py:170: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use tf.global_variables_initializer instead.
2019-02-08 17:07:46.692619: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2019-02-08 17:07:46.692636: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2019-02-08 17:07:46.692641: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2019-02-08 17:07:46.692645: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2019-02-08 17:07:46.692649: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Killed
(ELMO) anupam@anupam-OMEN-HP:~/Desktop/ElMo/bilm$
```
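A one-word `Killed` on Linux typically means the kernel's out-of-memory killer terminated the process because host RAM was exhausted, not a TensorFlow error. The shapes printed in the log suggest a likely culprit: the softmax layer over a 1,525,043-word vocabulary. A rough back-of-the-envelope estimate (my own arithmetic, not output from bilm-tf), assuming float32 parameters:

```python
# Shapes printed in the log above: lm/softmax/W is (1525043, 512),
# lm/softmax/b is (1525043,). Assume float32 (4 bytes per value).
vocab_size = 1_525_043
projection_dim = 512
bytes_per_float = 4

softmax_params = vocab_size * projection_dim + vocab_size
softmax_bytes = softmax_params * bytes_per_float
print(f"softmax parameters: {softmax_params:,}")
print(f"softmax memory (float32): {softmax_bytes / 2**30:.2f} GiB")
```

That is roughly 3 GiB for one copy of the softmax weights alone; training also holds gradients and optimizer state, multiplying the footprint several times. Pruning the vocabulary file to a few hundred thousand frequent words would shrink this dramatically.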
The GPU status can be viewed here:

```
anupam@anupam-OMEN-HP:~$ nvidia-smi
Fri Feb  8 18:24:06 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   55C    P3    23W /  N/A |    966MiB /  8117MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1665      G   /usr/lib/xorg/Xorg                           837MiB |
|    0      3429      G   compiz                                        50MiB |
|    0      4058      G   ...quest-channel-token=8843362536060243018    55MiB |
|    0      4726      G   ...quest-channel-token=6712252360503423866    13MiB |
|    0      7177      G   /usr/lib/thunderbird/thunderbird               2MiB |
|    0      9805      G   ...-token=82F8B3D416718D1A9486CF4518D0A1FF     4MiB |
+-----------------------------------------------------------------------------+
```