TuSimple / TuSimple-DUC

Understanding Convolution for Semantic Segmentation
https://arxiv.org/abs/1702.08502
Apache License 2.0

simple_bind error during Cityscapes testing #10

Closed atward424 closed 6 years ago

atward424 commented 6 years ago

Hi,

I'm trying to run the Cityscapes classifier in test mode on an Amazon EC2 instance, but I'm running into this error:

$ python predict_full_image.py ../configs/test/test_full_image.cfg
[06:22:55] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[06:22:55] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/module/base_module.py:53: UserWarning: You created Module with Module(..., label_names=['softmax_label']) but input with name 'softmax_label' is not found in symbol.list_arguments(). Did you mean one of:
        data
        seg_loss_label
  warnings.warn(msg)
/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/module/base_module.py:65: UserWarning: Data provided by label_shapes don't match names specified by label_names ([] vs. ['softmax_label'])
  warnings.warn(msg)
[06:22:55] /home/ubuntu/TuSimple-DUC/mxnet/dmlc-core/include/dmlc/./logging.h:308: [06:22:55] src/storage/storage.cc:113: Compile with USE_CUDA=1 to enable GPU usage

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(+0x12a7b3a) [0x7f95e0047b3a]
[bt] (1) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet11StorageImpl5AllocEmNS_7ContextE+0x57) [0x7f95e0048317]
[bt] (2) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(+0x131869f) [0x7f95e00b869f]
[bt] (3) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec15ReshapeOrCreateERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKN4nnvm6TShapeEiRKNS_7ContextEPSt13unordered_mapIS6_NS_7NDArrayESt4hashIS6_ESt8equal_toIS6_ESaISt4pairIS7_SH_EEE+0xa4f) [0x7f95e00be58f]
[bt] (4) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor13InitArgumentsERKN4nnvm12IndexedGraphERKSt6vectorINS2_6TShapeESaIS7_EERKS6_IiSaIiEERKS6_INS_7ContextESaISG_EESK_SK_RKS6_INS_9OpReqTypeESaISL_EERKSt13unordered_setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4hashISW_ESt8equal_toISW_ESaISW_EEPKNS_8ExecutorEPSt13unordered_mapISW_NS_7NDArrayESY_S10_SaISt4pairIKSW_S19_EEEPS6_IS19_SaIS19_EES1I_S1I_+0xaa4) [0x7f95e00c1f24]
[bt] (5) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x88b) [0x7f95e00c9edb]
[bt] (6) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor10SimpleBindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_RKSt13unordered_mapISC_NS1_6TShapeESt4hashISC_ESt8equal_toISC_ESaISF_ISG_SS_EEERKSR_ISC_iSU_SW_SaISF_ISG_iEEERKSM_INS_9OpReqTypeESaIS17_EERKSt13unordered_setISC_SU_SW_SaISC_EEPSM_INS_7NDArrayESaIS1H_EES1K_S1K_PSR_ISC_S1H_SU_SW_SaISF_ISG_S1H_EEEPS0_+0x233) [0x7f95e00ca583]
[bt] (7) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(MXExecutorSimpleBind+0x2a67) [0x7f95e00888c7]
[bt] (8) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f9603afae40]
[bt] (9) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7f9603afa8ab]

Traceback (most recent call last):
  File "predict_full_image.py", line 100, in <module>
    tester = ImageListTester(config)
  File "predict_full_image.py", line 33, in __init__
    self.tester = Tester(self.config)
  File "/home/ubuntu/TuSimple-DUC/tusimple_duc/test/tester.py", line 42, in __init__
    predictor.bind(data_shapes=[('data', data_shape)], for_training=False)
  File "/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/module/module.py", line 417, in bind
    state_names=self._state_names)
  File "/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/module/executor_group.py", line 231, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/module/executor_group.py", line 327, in bind_exec
    shared_group))
  File "/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/module/executor_group.py", line 603, in _bind_ith_exec
    shared_buffer=shared_data_arrays, **input_shapes)
  File "/home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/symbol.py", line 1479, in simple_bind
    raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (1, 3, 1024, 2048)
[06:22:55] src/storage/storage.cc:113: Compile with USE_CUDA=1 to enable GPU usage

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(+0x12a7b3a) [0x7f95e0047b3a]
[bt] (1) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet11StorageImpl5AllocEmNS_7ContextE+0x57) [0x7f95e0048317]
[bt] (2) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(+0x131869f) [0x7f95e00b869f]
[bt] (3) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec15ReshapeOrCreateERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKN4nnvm6TShapeEiRKNS_7ContextEPSt13unordered_mapIS6_NS_7NDArrayESt4hashIS6_ESt8equal_toIS6_ESaISt4pairIS7_SH_EEE+0xa4f) [0x7f95e00be58f]
[bt] (4) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor13InitArgumentsERKN4nnvm12IndexedGraphERKSt6vectorINS2_6TShapeESaIS7_EERKS6_IiSaIiEERKS6_INS_7ContextESaISG_EESK_SK_RKS6_INS_9OpReqTypeESaISL_EERKSt13unordered_setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4hashISW_ESt8equal_toISW_ESaISW_EEPKNS_8ExecutorEPSt13unordered_mapISW_NS_7NDArrayESY_S10_SaISt4pairIKSW_S19_EEEPS6_IS19_SaIS19_EES1I_S1I_+0xaa4) [0x7f95e00c1f24]
[bt] (5) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSt13unordered_mapISD_NS2_6TShapeESt4hashISD_ESt8equal_toISD_ESaISG_ISH_ST_EEERKSS_ISD_iSV_SX_SaISG_ISH_iEEERKSN_INS_9OpReqTypeESaIS18_EERKSt13unordered_setISD_SV_SX_SaISD_EEPSN_INS_7NDArrayESaIS1I_EES1L_S1L_PSS_ISD_S1I_SV_SX_SaISG_ISH_S1I_EEEPNS_8ExecutorERKSS_INS2_9NodeEntryES1I_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS1S_S1I_EEE+0x88b) [0x7f95e00c9edb]
[bt] (6) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor10SimpleBindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorIS3_SaIS3_EESQ_SQ_RKSt13unordered_mapISC_NS1_6TShapeESt4hashISC_ESt8equal_toISC_ESaISF_ISG_SS_EEERKSR_ISC_iSU_SW_SaISF_ISG_iEEERKSM_INS_9OpReqTypeESaIS17_EERKSt13unordered_setISC_SU_SW_SaISC_EEPSM_INS_7NDArrayESaIS1H_EES1K_S1K_PSR_ISC_S1H_SU_SW_SaISF_ISG_S1H_EEEPS0_+0x233) [0x7f95e00ca583]
[bt] (7) /home/ubuntu/TuSimple-DUC/mxnet/python/mxnet/../../lib/libmxnet.so(MXExecutorSimpleBind+0x2a67) [0x7f95e00888c7]
[bt] (8) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7f9603afae40]
[bt] (9) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x2eb) [0x7f9603afa8ab]

Any idea what is causing it and how to fix it?

System: Ubuntu 16.04.3, mxnet 0.11.0, numpy 1.13.3, cv2 3.2.0, PIL 1.1.7, cython 0.27.1

Thanks!

chienyiwang commented 6 years ago

In the config.mk file, set "USE_CUDA=1" before compiling MXNet.

wpqmanu commented 6 years ago

Please refer to this message:

src/storage/storage.cc:113: Compile with USE_CUDA=1 to enable GPU usage

It means MXNet was compiled without CUDA support, so it cannot access the GPUs.

Edit mxnet/make/config.mk so that it contains the following settings, then recompile MXNet:

USE_CUDA = 1 
USE_CUDA_PATH = /usr/local/cuda 
USE_CUDNN = 1
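
Before rebuilding, it can help to confirm that the file really contains these values. The following is only a minimal sketch: the helper names `read_make_flags` and `cuda_enabled` are hypothetical, and the parser assumes plain `NAME = value` lines (it ignores `+=`, `:=`, and make conditionals). In practice you would pass it the contents of your own config.mk instead of the inline sample.

```python
import re

def read_make_flags(text, names=("USE_CUDA", "USE_CUDA_PATH", "USE_CUDNN")):
    """Return {flag: value} for simple `NAME = value` lines in config.mk-style text.

    Hypothetical helper: only plain `=` assignments are recognized, and a later
    assignment of the same flag overrides an earlier one.
    """
    flags = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^(\w+)\s*=\s*(.*)$", line)
        if match and match.group(1) in names:
            flags[match.group(1)] = match.group(2).strip()
    return flags

def cuda_enabled(flags):
    """True only if USE_CUDA is explicitly set to 1."""
    return flags.get("USE_CUDA") == "1"

# Sample text mirroring the settings quoted in this thread.
sample = """\
USE_CUDA = 1
USE_CUDA_PATH = /usr/local/cuda
USE_CUDNN = 1
"""

flags = read_make_flags(sample)
print(flags)
print("CUDA enabled:", cuda_enabled(flags))
```
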

README.md is also updated.
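
After recompiling, a quick way to check whether the build can reach a GPU is to allocate a tiny array on a GPU context. This is a sketch rather than an official check: the function name `mxnet_gpu_usable` is hypothetical, and it assumes that a build without CUDA reports the failure as `mx.base.MXNetError` from Python, which is how the error surfaced in the traceback above.

```python
def mxnet_gpu_usable(device_id=0):
    """Return True if a small NDArray can be created on the given GPU."""
    try:
        import mxnet as mx
    except ImportError:
        # MXNet is not installed in this environment at all.
        return False
    try:
        # asnumpy() blocks until the allocation has actually happened,
        # so any storage error is raised here rather than later.
        mx.nd.zeros((1,), ctx=mx.gpu(device_id)).asnumpy()
    except mx.base.MXNetError:
        return False
    return True

print("GPU usable:", mxnet_gpu_usable())
```

If this prints False on a machine that has a GPU, the library most likely still needs to be rebuilt with the settings above.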

atward424 commented 6 years ago

That did it, thanks!