I used the reddit scripts to generate vali.num, train.num, and test.num files from 2011-05. However, when running with this data, I get a tcmalloc warning for an allocation of 32731955200 bytes (about 32 GB). The machine I am running this on (Google Colab) only has about 12 GB of RAM.
The toy dataset works fine.
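For scale, the requested allocation is about 30.5 GiB, roughly 2.7 times the available RAM (both numbers taken from the log below):

>>> round(32731955200 / 2**30, 2)   # requested bytes, in GiB
30.48
>>> round(32731955200 / 12e9, 2)    # ratio to the ~12 GB of Colab RAM
2.73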
Full log below:
@@@@@@@@@@@@@@@@@@@@
hostname: 28137a6dc590
data_path: data
out_path: out
@@@@@@@@@@@@@@@@@@@@
Using TensorFlow backend.
loss: --------------------
10.00 <function _sqrt_mse at 0x7f186799a730>
-10.00 <function _batch_spread at 0x7f1867a0bbf8>
-10.00 <function _batch_spread at 0x7f1867a0bbf8>
0.33 categorical_crossentropy
0.33 categorical_crossentropy
0.33 categorical_crossentropy
--------------------
out/reddit_width(128, 128, 0.0)_depth(2, 2)/mtask_interp_std0.10_ST10.00_SS10.00_TT10.00
already exists, do you want to delete the folder? (y/n)
y
fld deleted
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3239: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4409: The name tf.random_normal is deprecated. Please use tf.random.normal instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From /content/SpaceFusion/src/model.py:480: The name tf.squared_difference is deprecated. Please use tf.math.squared_difference instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3576: The name tf.log is deprecated. Please use tf.math.log instead.
out/reddit_width(128, 128, 0.0)_depth(2, 2)/mtask_interp_std0.10_ST10.00_SS10.00_TT10.00
***** Epoch 1/20, trained 0.00M *****
loading data, check_src = False...
tcmalloc: large alloc 32731955200 bytes == 0x7c5c000 @ 0x7f18bb2bf001 0x7f18b6970765 0x7f18b69d4dc0 0x7f18b69d6c5f 0x7f18b6a6d238 0x50ac25 0x50c5b9 0x508245 0x50a080 0x50aa7d 0x50c5b9 0x509d48 0x50aa7d 0x50c5b9 0x508245 0x50a080 0x50aa7d 0x50d390 0x509d48 0x50aa7d 0x50c5b9 0x508245 0x50b403 0x635222 0x6352d7 0x638a8f 0x639631 0x4b0f40 0x7f18baebab97 0x5b2fda
^C
How can I reduce this memory allocation?
Thank you.

Sorry for the late reply.
Could you please try a smaller batch size? e.g. python src/main.py mtask train --data_name=toy --batch_size=32
(you may need to git pull first)
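For the reddit data, the equivalent command would presumably be python src/main.py mtask train --data_name=reddit --batch_size=32 (assuming data_name=reddit is what selects the generated files; the out/reddit_... path in the log suggests so). Note that the tcmalloc warning fires while the data is being loaded, before any batch is built, so a smaller batch size may not be enough on its own; in that case, streaming the .num files instead of reading them whole could help. Below is a minimal sketch, not SpaceFusion's actual loader: the assumed file format (one whitespace-separated sequence of integer token ids per line) and the names stream_batches and train_on_batch are hypothetical.

# Hypothetical sketch: read a .num file lazily instead of materializing
# the whole dataset as one large in-memory array.
def stream_batches(path, batch_size=32):
    batch = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            # Assumed format: whitespace-separated integer token ids per line.
            batch.append([int(tok) for tok in line.split()])
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final partial batch

# Example usage (train_on_batch stands in for the real training step):
# for batch in stream_batches('data/train.num', batch_size=32):
#     train_on_batch(batch)

Since the generator holds at most one batch in memory at a time, peak usage is bounded by the batch size rather than the dataset size.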