Closed istiakshihab closed 3 years ago
@pkouris facing the same problem here. Could you perhaps point us in the right direction? Are we doing the data processing right?
Ιt would be helpful to post the particular log of this error.
The problem is, there is no particularly interesting log file is produced ever. It's always 0KB (Empty). The problem begins when it starts processing the dataset. (Loading dictionary and training dataset...). I will follow up with a little more details tomorrow. But could it be happening because Bengali is pretty much different in every other way than English and maybe, it is having a harder time generalizing the whole dataset? @pkouris
I am really sorry for getting back so late, My server crashed for a while.
$ python train.py -model mydata
/home/mr/miniconda3/envs/aienv/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 88 from C header, got 96 from PyObject
return f(*args, **kwds)
/home/mr/miniconda3/envs/aienv/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/home/mr/miniconda3/envs/aienv/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
Logs
Logs File: mydata_20210412_1412_train_logs.txt
Mode: train
Model: mydata
starting: 2021-04-12 14:12
Parameters:
start_epoch_no = 1
epochs_num = 12
batch_size = 64
embedding_dim = 300
hidden_dim = 200
layers_num = 2
learning_rate = 0.001
beam_width = 4
keep_prob = 0.8
forward_only = False
using_word2vec_embeddings = False
word_embeddings = filename of word embendings
train_restore_saved_model = True
Loading dictionary and training dataset...
Batches per epoch: 2
vocabulary_size: 5307
Article and summary max_len: 889, 890
WARNING:tensorflow:From train.py:106: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2021-04-12 14:12:25.736949: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-04-12 14:12:25.772434: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3393175000 Hz
2021-04-12 14:12:25.774132: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f2c404da20 executing computations on platform Host. Devices:
2021-04-12 14:12:25.774227: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
Loading word2vec...
Word embeddings have been loaded.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:27: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:27: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:28: The name tf.layers.Dense is deprecated. Please use tf.compat.v1.layers.Dense instead.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:30: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:43: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:50: BasicLSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:From /home/mr/miniconda3/envs/aienv/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/rnn.py:239: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /home/mr/miniconda3/envs/aienv/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py:464: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /home/mr/miniconda3/envs/aienv/lib/python3.5/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /home/mr/miniconda3/envs/aienv/lib/python3.5/site-packages/tensorflow/python/ops/rnn_cell_impl.py:738: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f4037a99d68>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f4037a99d68>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:From /home/mr/miniconda3/envs/aienv/lib/python3.5/site-packages/tensorflow/python/ops/rnn.py:244: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ea90>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ea90>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847e128>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847e128>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ee48>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ee48>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddcaec6a0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddcaec6a0>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method AttentionWrapper.call of <tensorflow.contrib.seq2seq.python.ops.attention_wrapper.AttentionWrapper object at 0x7f3f1c640390>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method AttentionWrapper.call of <tensorflow.contrib.seq2seq.python.ops.attention_wrapper.AttentionWrapper object at 0x7f3f1c640390>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ef60>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ef60>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddf0e76a0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddf0e76a0>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddcaec048>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddcaec048>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f4037a99ac8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f4037a99ac8>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:115: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From /home/mr/Workspace/Team8993/AI/abtextsum-1/model.py:119: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3f7c676748>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3f7c676748>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847e0f0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847e0f0>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ee48>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3e3847ee48>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3ddf074cf8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3ddf074cf8>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddf05f390>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddf05f390>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method AttentionWrapper.call of <tensorflow.contrib.seq2seq.python.ops.attention_wrapper.AttentionWrapper object at 0x7f3ddcf8f780>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method AttentionWrapper.call of <tensorflow.contrib.seq2seq.python.ops.attention_wrapper.AttentionWrapper object at 0x7f3ddcf8f780>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3f7c676c18>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BasicLSTMCell.call of <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x7f3f7c676c18>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddccd82b0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddccd82b0>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddd5cedd8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ddd5cedd8>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ed8407b70>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f3ed8407b70>>: AttributeError: module 'gast' has no attribute 'Index'
WARNING:tensorflow:From /home/mr/miniconda3/envs/aienv/lib/python3.5/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py:985: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
2021-04-12 14:15:52.547331: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
WARNING:tensorflow:From train.py:128: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
Training of new model.
1 of 12 epoch
1 of 1 chunk (files: datasets/train/train_mydata_data/x_chunks/x_001.pickle & datasets/train/train_mydata_data/y_chunks/y_001.pickle)
x & y chunk shape: (70, 889), (70,)
Batches of chunk: 2
Killed
This is what the train terminal shows so far.
And below is the log produced meanwhile:
$ cat mydata_20210412_1412_train_logs.txt
Logs
Logs File: mydata_20210412_1412_train_logs.txt
Mode: train
Model: mydata
starting: 2021-04-12 14:12
Parameters:
start_epoch_no = 1
epochs_num = 12
batch_size = 64
embedding_dim = 300
hidden_dim = 200
layers_num = 2
learning_rate = 0.001
beam_width = 4
keep_prob = 0.8
forward_only = False
using_word2vec_embeddings = False
word_embeddings = filename of word embendings
train_restore_saved_model = True
Loading dictionary and training dataset...
Batches per epoch: 2
vocabulary_size: 5307
Article and summary max_len: 889, 890
Word embeddings have been loaded.
Training of new model.
@pkouris, Do you have any suggestion where am I going wrong.
Also, This dataset contains only 100 summary and articles. So, resources shouldn't be the problem.
This model quits in Sess.run. There could be a memory leak happening. But I can't find any how that can happen. @pkouris
I have also printed out the related variables.
Batch Size
50
Batch X Length Len
50
Batch X Length
[81, 38, 487, 211, 230, 358, 210, 109, 114, 61, 167, 117, 60, 354, 328, 142, 293, 327, 102, 69, 95, 155, 130, 189, 202, 60, 112, 99, 163, 231, 85, 143, 205, 110, 104, 93, 72, 144, 196, 167, 130, 73, 162, 99, 349, 203, 307, 172, 66, 89]
Batch Decoder Length Len
50
batch Decoder Length
[82, 39, 488, 212, 231, 359, 211, 110, 115, 62, 168, 118, 61, 355, 329, 143, 294, 328, 103, 70, 96, 156, 131, 190, 203, 61, 113, 100, 164, 232, 86, 144, 206, 111, 105, 94, 73, 145, 197, 168, 131, 74, 163, 100, 350, 204, 308, 173, 67, 90]
So, I tried to follow the procedure you described here for a Bengali Dataset: https://github.com/pkouris/abtextsum/issues/3
I have tried to follow this in 3 different machines (because I thought it's the limitation of that particular machine) with 30GB RAM and No GPU, 32GB RAM with 2 RTX 2080TI, and my own PC with 12GBs of RAM and no GPU. I have used a dataset having only 50k articles and yet, I always had the process quit on me. Do you have any suggestions for me or can I know what kind of machine you used for your training. Is the memory requirements really that high or am I doing anything wrong?