When I train the RNN model with the default number of processes (40), the following error occurs:
/home/qingbol/.conda/envs/tf110cpu_py27/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Created the model!
[Batch 0][Epoch 0] cost: 2.773; accuracy: 0.027
Traceback (most recent call last):
File "train.py", line 283, in <module>
main()
File "train.py", line 279, in main
training(config_info)
File "train.py", line 218, in training
model.train()
File "train.py", line 153, in train
self._data, self._label, self._length, self._keep_prob)
File "train.py", line 33, in fill_feed_dict
data_batch = data_set.get_batch(batch_size=batch_size)
File "/scratch2/qingbol/EKLAVYA2/code/RNN/train/dataset.py", line 240, in get_batch
train_batch = self.get_batch_data(func_list_batch)
File "/scratch2/qingbol/EKLAVYA2/code/RNN/train/dataset.py", line 141, in get_batch_data
pool = Pool(self.thread_num)
File "/home/qingbol/.conda/envs/tf110cpu_py27/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild)
File "/home/qingbol/.conda/envs/tf110cpu_py27/lib/python2.7/multiprocessing/pool.py", line 161, in __init__
self._repopulate_pool()
File "/home/qingbol/.conda/envs/tf110cpu_py27/lib/python2.7/multiprocessing/pool.py", line 225, in _repopulate_pool
w.start()
File "/home/qingbol/.conda/envs/tf110cpu_py27/lib/python2.7/multiprocessing/process.py", line 130, in start
self._popen = Popen(self)
File "/home/qingbol/.conda/envs/tf110cpu_py27/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
So I decreased the number of processes to 16, and the above error disappeared. But another issue came up: the program was interrupted without giving any error message.
In both cases, it gives the same warning message:
UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Any clue or solution?
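For reference, the traceback suggests that get_batch_data in dataset.py calls Pool(self.thread_num) each time a batch is fetched, and that the failure happens in os.fork() while the workers are being started. Below is a minimal sketch, assuming that reading is right, of creating one pool up front with a capped worker count and reusing it for every batch. The names load_one, capped_workers and BatchLoader, and the cap of 4, are hypothetical and are not taken from the EKLAVYA code; this only illustrates the pattern and is not a confirmed fix.

```python
import multiprocessing
from multiprocessing import Pool


def load_one(item):
    # Hypothetical stand-in for the per-item preprocessing done in a worker.
    return item * 2


def capped_workers(requested, max_workers):
    # Never use more workers than the cap or the CPU count, and at least one.
    return max(1, min(requested, max_workers, multiprocessing.cpu_count()))


class BatchLoader(object):
    """Hypothetical loader that reuses a single worker pool across batches."""

    def __init__(self, requested_workers=40, max_workers=4):
        # Assumption: each worker is forked from the training process, so
        # starting many workers from a large parent may run out of memory.
        self.workers = capped_workers(requested_workers, max_workers)
        self.pool = Pool(self.workers)

    def get_batch_data(self, items):
        # Reuse the existing pool instead of building a new Pool per batch.
        return self.pool.map(load_one, items)

    def close(self):
        # Shut the workers down cleanly once training is finished.
        self.pool.close()
        self.pool.join()


if __name__ == "__main__":
    loader = BatchLoader()
    try:
        print(loader.get_batch_data([1, 2, 3]))
    finally:
        loader.close()
```

Whether this applies depends on how dataset.py actually manages its pool. The silent interruption with 16 processes may have a different cause, so checking the system log for an out-of-memory kill would be a separate step.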