hellochick / ICNet-tensorflow

TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

RuntimeError: cannot join current thread #75

Yancy7 closed 6 years ago

Yancy7 commented 6 years ago

Hi! I trained the net on my own dataset. It works fine when the input image is fairly small, like 400x400. But as soon as I use a bigger image (more than 1000x1000), it raises this error: RuntimeError: cannot join current thread
`Caused by op 'ResizeBilinear_19', defined at:
  File "inference.py", line 186, in <module>
    main()
  File "inference.py", line 146, in main
    raw_output_up = tf.image.resize_bilinear(raw_output, size=n_shape, align_corners=True)
  File "/home/test/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_image_ops.py", line 2372, in resize_bilinear
    align_corners=align_corners, name=name)
  File "/home/test/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/test/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/test/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,2464,3296,150] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: ResizeBilinear_19 = ResizeBilinear[T=DT_FLOAT, align_corners=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv6_cls/BiasAdd-0-0-TransposeNCHWToNHWC-LayoutOptimizer, ResizeBilinear_19/size)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.`

Traceback (most recent call last):
  File "/home/test/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 885, in __del__
    self.close()
  File "/home/test/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1090, in close
    self._decr_instances(self)
  File "/home/test/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 454, in _decr_instances
    cls.monitor.exit()
  File "/home/test/anaconda3/lib/python3.6/site-packages/tqdm/_monitor.py", line 52, in exit
    self.join()
  File "/home/test/anaconda3/lib/python3.6/threading.py", line 1053, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

I'd like to know what's going wrong and would sincerely appreciate your help. Thanks so much!
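
For scale, here is a rough estimate of how much memory the tensor named in the OOM message needs. It is only a sketch: `tensor_bytes` is a hypothetical helper, the 400x400 shape is an assumed comparison case with the same 150 channels, and it counts a single tensor, not the rest of the graph. The `RuntimeError` in the second traceback is raised from tqdm's `__del__` cleanup, so the OOM looks like the primary failure.

```python
# Rough size of one dense tensor, assuming float32 (DT_FLOAT, 4 bytes per element).
def tensor_bytes(shape, bytes_per_element=4):
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape copied from the OOM message: [batch, height, width, channels].
oom_shape = (1, 2464, 3296, 150)
# Hypothetical comparison: the same 150 channels at 400x400.
small_shape = (1, 400, 400, 150)

print(f"{tensor_bytes(oom_shape) / 2**30:.2f} GiB")    # 4.54 GiB
print(f"{tensor_bytes(small_shape) / 2**20:.2f} MiB")  # 91.55 MiB
```

So the upsampled output alone would need roughly 4.5 GiB, on top of whatever the rest of the network already holds on the GPU.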

Yancy7 commented 6 years ago

Is this caused by the input image being too large? I thought the project would resize the input image for me. What should I do now? Resize every input image to be smaller? Thanks!
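
If shrinking the inputs is the route taken, one way to pick a target size is to scale the image so the upsampled float32 output fits a chosen memory budget. This is a minimal sketch, not something from the repository: `fit_size` and the 1 GiB budget are assumptions for illustration, and the budget covers only that one output tensor.

```python
import math


# Hypothetical helper: largest aspect-preserving size whose
# height * width * channels float32 tensor fits in budget_bytes.
def fit_size(height, width, channels, budget_bytes, bytes_per_element=4):
    needed = height * width * channels * bytes_per_element
    if needed <= budget_bytes:
        return height, width  # already fits, leave the image alone
    scale = math.sqrt(budget_bytes / needed)
    # Floor both sides so the result never exceeds the budget.
    return max(1, int(height * scale)), max(1, int(width * scale))


# 2464x3296 and 150 channels come from the OOM message; 1 GiB is an assumed budget.
new_h, new_w = fit_size(2464, 3296, 150, budget_bytes=2**30)
print(new_h, new_w)
```

The image would then be resized to `(new_h, new_w)` with whatever image library is already in use before it is fed to the network.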

BNAadministrator3 commented 6 years ago

Hi friend, have you figured it out?

Yancy7 commented 6 years ago

> Hi friend, have you figured it out?

I have no idea. I tried modifying the network and eventually it worked, but I still don't know what caused the problem.