I tried to run "train.py" (step 5, "train", in the "usage" section).
However, I get the following error: "ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape".
Should I reduce the batch size?
If so, how do I change the batch size?
2019-08-25 14:54:06.964697: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *x****x*****
2019-08-25 14:54:06.964735: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_ops.cc:446 : Resource exhausted: OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node fa_layer4/conv_2/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fa_layer4/conv_1/Relu, fa_layer4/conv_2/weights/read/_211)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad/_433}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4838_gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 469, in <module>
train()
File "train.py", line 437, in train
train_one_epoch(sess, ops, train_writer, stack_train)
File "train.py", line 243, in train_one_epoch
feed_dict=feed_dict,
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node fa_layer4/conv_2/Conv2D (defined at /home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/tf_util.py:186) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fa_layer4/conv_1/Relu, fa_layer4/conv_2/weights/read/_211)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad/_433}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4838_gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Caused by op 'fa_layer4/conv_2/Conv2D', defined at:
File "train.py", line 469, in <module>
train()
File "train.py", line 359, in train
bn_decay=bn_decay,
File "/home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/model.py", line 128, in get_model
scope="fa_layer4",
File "/home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/pointnet_util.py", line 323, in pointnet_fp_module
bn_decay=bn_decay,
File "/home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/tf_util.py", line 186, in conv2d
data_format=data_format,
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 957, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node fa_layer4/conv_2/Conv2D (defined at /home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/tf_util.py:186) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fa_layer4/conv_1/Relu, fa_layer4/conv_2/weights/read/_211)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad/_433}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4838_gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[my environment]
gpu: GeForce 1050 Ti
memory: 16 GiB
swap: 16 GiB
ubuntu 16.04
cuda 9.0
cudnn 7.5.0
[anaconda3]
python 3.6
tensorflow-gpu 1.12.0
scikit-learn 0.21.3
open3d-python 0.7.0.0
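To get a feel for the numbers, here is a rough estimate of how large the tensor named in the error is, and how it would shrink with a smaller batch. This is only a sketch: it assumes the first dimension of shape[16,128,8192,1] is the batch size (the op uses data_format="NCHW") and that "type float" means 4-byte float32. The helper names tensor_bytes and mib are made up for illustration and are not part of the repository.

```python
from functools import reduce
from operator import mul

BYTES_PER_FLOAT32 = 4  # assumption: "type float" in the log is 32-bit


def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Memory needed for one dense tensor of the given shape, in bytes."""
    return reduce(mul, shape, 1) * bytes_per_element


def mib(num_bytes):
    """Convert bytes to MiB."""
    return num_bytes / (1024 ** 2)


# Shape taken from the OOM message: [16, 128, 8192, 1]
failing_shape = (16, 128, 8192, 1)
print("failing tensor: %.1f MiB" % mib(tensor_bytes(failing_shape)))

# Assumption: dimension 0 is the batch size, so the tensor scales linearly with it.
for batch_size in (16, 8, 4, 2):
    shape = (batch_size,) + failing_shape[1:]
    print("batch %2d -> %.1f MiB" % (batch_size, mib(tensor_bytes(shape))))
```

This is just the single allocation that failed. The network also keeps many other activations and gradients on the GPU at the same time, so the total is much larger than this one tensor, but each of those should also scale roughly with the batch size.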