msracver / FCIS

Fully Convolutional Instance-aware Semantic Segmentation
MIT License

demo - error #10

Open oneOfThePeople opened 7 years ago

oneOfThePeople commented 7 years ago

Hi, I ran python ./fcis/demo.py and got this error:

Traceback (most recent call last):
  File "./fcis/demo.py", line 147, in <module>
    main()
  File "./fcis/demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/boston_lea/AutoMap/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: 'module' object has no attribute 'PSROIPooling'

Any idea?

liyi14 commented 7 years ago

Hi, have you done the following?

Copy operators in ./fcis/operator_cxx to $(YOUR_MXNET_FOLDER)/src/operator/contrib and recompile MXNet.

oneOfThePeople commented 7 years ago

Yes, but then I understood that I need to copy the files and not the directory, something like this:

cp /fcis/operator_cxx/* $(YOUR_MXNET_FOLDER)/src/operator/contrib

Now I have a related question: while I run the training I get this message:

src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied

The training is working, so is this a problem? Thank you.

liyi14 commented 7 years ago

Hi @phexic, have you updated your mxnet? mxnet_op.h

phexic commented 7 years ago

Before copying the files from FCIS, mxnet ran successfully, but after recompiling mxnet with the new files copied from FCIS it no longer runs properly.

wangg12 commented 7 years ago

@phexic I followed the method of @oneOfThePeople and it worked smoothly.

phexic commented 7 years ago

@wangg12 Thanks for your help. However, I ran into a problem when recompiling mxnet and I am trying to find the cause.

wangg12 commented 7 years ago

@phexic Have you located your compile error? Did you run make clean before recompiling?

wangg12 commented 7 years ago

@phexic Perhaps you have to git clone the original mxnet again with

git clone mxnet --recursive

phexic commented 7 years ago

@wangg12 I really appreciate your suggestions and I will try git clone mxnet --recursive again.

wangg12 commented 7 years ago

@phexic Do you have a more detailed error message?

yelantf commented 7 years ago

Hi. I have the same error. I did copy those files (not the directory) to $(YOUR_MXNET_FOLDER)/src/operator/contrib, but the error still appears. Do I need to set some special options when I recompile? I ran this after I copied the files:

make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1

And this is what I get when I run python ./fcis/demo.py:

Traceback (most recent call last):
  File "./fcis/demo.py", line 147, in <module>
    main()
  File "./fcis/demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/yelantf/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: 'module' object has no attribute 'PSROIPooling'

phexic commented 7 years ago

@wangg12 The error occurs while reading the network parameters, and there is no more detailed information.

realwecan commented 7 years ago

I managed to resolve OP's problem by first copying over the operators, then recompiling mxnet as well as the python bindings.
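A quick way to confirm that the rebuilt MXNet actually exposes the new operators, before running demo.py, is to check the mx.contrib.sym namespace that the tracebacks in this thread refer to. This is only a minimal sketch: missing_operators and check_fcis_operators are hypothetical helper names, and the two operator names are simply the ones that appear in the errors reported here.

```python
import importlib


def missing_operators(namespace, names):
    """Return the names that are not attributes of `namespace`."""
    return [name for name in names if not hasattr(namespace, name)]


def check_fcis_operators():
    """Return the FCIS operators missing from the importable MXNet, or None if MXNet cannot be imported."""
    try:
        mx = importlib.import_module("mxnet")
    except ImportError:
        return None
    # mx.contrib.sym is the namespace used in resnet_v1_101_fcis.py according to the tracebacks.
    contrib = getattr(mx, "contrib", None)
    sym = getattr(contrib, "sym", None)
    # Operator names taken from the AttributeError messages in this thread.
    return missing_operators(sym, ["PSROIPooling", "ChannelOperator"])
```

Calling check_fcis_operators() should return an empty list if both operators were compiled in and the Python binding was rebuilt. A non-empty list suggests that Python is still loading a library built without the copied files.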

For the convolution not supported by cudnn problem, it may help to make sure you have the correct/appropriate version of cudnn installed. If not, reinstall and recompile mxnet.

mariolew commented 7 years ago

@realwecan I'm using CUDA 7.5 and cudnn 5.0, but I cannot use cudnn for convolution. I don't know which version of cudnn I should use.

dpengwen commented 7 years ago

Hi, I followed the method of @liyi14, but it still didn't work. I copied the files to $(YOUR_MXNET_FOLDER)/src/operator/contrib and recompiled mxnet successfully. I can also find the generated .so and .d files in "$(YOUR_MXNET_FOLDER)/build/src/operator/", but it still shows the error:

  File "./fcis/demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/yelantf/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: 'module' object has no attribute 'PSROIPooling'

Could anyone help me ?

Thanks.

lc8631058 commented 7 years ago

@oneOfThePeople @dpengwen @phexic @yelantingfeng @wangg12 Hi, I followed the mxnet installation guide and built mxnet from source. First I got this error: module 'mxnet' has no attribute 'mx.__file__'. Then I ran pip install mxnet and that error disappeared, but now I get the same error as you:

    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/carnd/Semantic_segmentation/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: module 'mxnet.contrib.symbol' has no attribute 'PSROIPooling'

I have also done this manually:

cp /fcis/operator_cxx/* $(YOUR_MXNET_FOLDER)/src/operator/contrib

I'm not sure how to recompile mxnet. What I did was run make clean under python3.5/site-packages/mxnet and then run this again:

make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1

but it still doesn't work.
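One possible cause in a setup like this is that Python imports a pip-installed mxnet instead of the recompiled source tree. The sketch below, assuming Python 3.5 or newer, reports where mxnet would be imported from and whether that location lies inside a given directory. module_origin and is_under are hypothetical helper names, and the source-tree path in the usage note is a placeholder.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_under(path, root):
    """Return True if `path` lies inside the directory tree rooted at `root`."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    return os.path.commonpath([path, root]) == root
```

For example, origin = module_origin("mxnet") followed by is_under(origin, "/path/to/your/mxnet") shows whether the import resolves to the tree that was just recompiled. If it instead resolves to a site-packages copy installed by pip, the copied operators will not be visible.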

lc8631058 commented 7 years ago

@realwecan Hi, I ran make clean and compiled mxnet again using make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1, then set up the Python binding again using sudo python setup.py install, but it still tells me: module 'mxnet.contrib.symbol' has no attribute 'PSROIPooling'. Do you know what's wrong?

lc8631058 commented 7 years ago

The problem above has been solved. New error: src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.

Michiemi commented 7 years ago

@lc8631058 Hi, how did you solve the problem? I tried all the suggested solutions here, but I still get this error: AttributeError: 'module' object has no attribute 'PSROIPooling'

Thanks in advance.

lc8631058 commented 7 years ago

@Michiemi Hi, I just followed this reply:

Make sure you have compiled MXNet yourself (not installed it from pip), and copied these files before compiling MXNet. For example:

1. Clone the FCIS and MXNet repos as described in the readme.
2. Run bash init.sh in the FCIS dir.
3. Copy the operators: cp ${YOUR_FCIS_ROOT}/fcis/operator_cxx/* ${YOUR_MXNET_ROOT}/src/operator/contrib/
4. Compile MXNet as described in their instructions.

and this guide from mxnet, which teaches you how to build from scratch and how to make the Python binding.
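The copy step can also be sketched in Python, which makes the "files, not the directory" point explicit: each regular file in operator_cxx is placed directly into the contrib folder, with no nested operator_cxx directory. copy_operator_files is a hypothetical helper name, and both directory arguments are whatever paths correspond to your FCIS and MXNet checkouts.

```python
import glob
import os
import shutil


def copy_operator_files(src_dir, dst_dir):
    """Copy every regular file in src_dir directly into dst_dir and return the copied file names."""
    copied = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*"))):
        # Skip subdirectories so that only the operator source files are copied.
        if os.path.isfile(path):
            shutil.copy2(path, dst_dir)
            copied.append(os.path.basename(path))
    return copied
```

The returned list gives a simple record of what was copied, which can be compared against the contents of src/operator/contrib before recompiling.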

xingbowei commented 7 years ago

If you have time, please help me with this question. Thank you!

Traceback (most recent call last):
  File "demo.py", line 151, in <module>
    main()
  File "demo.py", line 82, in main
    arg_params=arg_params, aux_params=aux_params)
  File "/home/xbw/FCIS/fcis/core/tester.py", line 30, in __init__
    self._mod.bind(provide_data, provide_label, for_training=False)
  File "/home/xbw/FCIS/fcis/core/module.py", line 840, in bind
    for_training, inputs_need_grad, force_rebind=False, shared_module=None)
  File "/home/xbw/FCIS/fcis/core/module.py", line 397, in bind
    state_names=self._state_names)
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 178, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 278, in bind_exec
    shared_group))
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 592, in _bind_ith_exec
    context, self.logger)
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 570, in _get_or_reshape
    arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
  File "/home/xbw/mxnet/python/mxnet/ndarray.py", line 1047, in zeros
    return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype, **kwargs)
  File "", line 15, in _zeros
  File "/home/xbw/mxnet/python/mxnet/_ctypes/ndarray.py", line 72, in _imperative_invoke
    c_array(ctypes.c_char_p, [c_str(str(val)) for val in vals])))
  File "/home/xbw/mxnet/python/mxnet/base.py", line 85, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [16:12:45] src/c_api/c_api_ndarray.cc:390: Operator _zeros is not implemented for GPU.

jiqirenno1 commented 7 years ago

@xingbowei Hi, have you solved the problem?


lnuchiyo commented 7 years ago

@oneOfThePeople @liyi14 @phexic @wangg12 @yelantingfeng Excuse me, when I run demo.py there is an error:

Traceback (most recent call last):
  File "demo.py", line 29, in <module>
    from core.tester import im_detect, Predictor
  File "/home/cs/FCIS/fcis/core/tester.py", line 18, in <module>
    from nms.nms import py_nms_wrapper
  File "/home/cs/FCIS/fcis/../lib/nms/nms.py", line 3, in <module>
    from cpu_nms import cpu_nms
ImportError: No module named cpu_nms

I tried changing /FCIS/lib/nms/nms.py to delete the cpu import, but then there is a new error:

Traceback (most recent call last):
  File "demo.py", line 29, in <module>
    from core.tester import im_detect, Predictor
  File "/home/cs/FCIS/fcis/core/tester.py", line 18, in <module>
    from nms.nms import py_nms_wrapper
  File "/home/cs/FCIS/fcis/../lib/nms/nms.py", line 3, in <module>
    from gpu_nms import gpu_nms
ImportError: No module named gpu_nms

I think it needs gpu_nms. How can I deal with this error? Thanks a lot for any suggestions.

pxiangwu commented 7 years ago

@Michiemi Hi, just a suggestion.

After recompiling mxnet, you should also rebuild the Python binding. Before you rebuild it, you should delete the previously built mxnet .egg file in the Python site-packages (or dist-packages) directory. Then rebuild the Python binding: sudo python setup.py install
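To locate leftover eggs before deleting them, something like the following sketch may help. It assumes the egg name starts with "mxnet" and ends in ".egg", which is how setup.py install typically names it but is not confirmed in this thread. find_mxnet_eggs is a hypothetical helper name.

```python
import glob
import os


def find_mxnet_eggs(package_dirs):
    """List mxnet*.egg entries (files or directories) found in the given package directories."""
    found = []
    for directory in package_dirs:
        # Assumed naming pattern for eggs produced by `python setup.py install`.
        found.extend(sorted(glob.glob(os.path.join(directory, "mxnet*.egg"))))
    return found
```

The directories to pass in are the site-packages or dist-packages folders of the Python you run demo.py with. On many installations site.getsitepackages() from the standard library returns them. The function only lists candidates and leaves the deletion to you.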

yangyu12 commented 7 years ago

@lc8631058 Hello, did you solve this error? src/operator/convolution.cu:119: This convolution is not supported by cudnn, MXNET convolution is applied. I have the same problem. Although this message is printed, the demo works and outputs the results successfully. I would like to know the reason and how to fix it.

wangg12 commented 7 years ago

@yangyu12 It is because dilated convolution is not supported by your current cudnn, so mxnet uses its own implementation.

You can ignore this message, or you can silence the warning by commenting out the line at src/operator/convolution.cu:119 and recompiling mxnet.

yangyu12 commented 7 years ago

@wangg12 thx for help

wenyaole commented 6 years ago

I am running demo.py and have a problem:

Traceback (most recent call last):
  File "demo.py", line 151, in <module>
    main()
  File "demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/wenyaole/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 816, in get_symbol
    psroipool_cls = mx.contrib.sym.ChannelOperator(name='psroipool_cls', data=psroipool_cls_seg, group=num_classes, op_type='Group_Max')
AttributeError: 'module' object has no attribute 'ChannelOperator'

What should I do? Please help me, thank you very much!

cinly0 commented 6 years ago

I am running demo.py and have a problem:

{'BINARY_THRESH': 0.4, 'CLASS_AGNOSTIC': True, 'MASK_SIZE': 21, 'MXNET_VERSION': 'mxnet', 'SCALES': [(600, 1000)], 'TEST': {'BATCH_IMAGES': 1, 'CXX_PROPOSAL': False, 'HAS_RPN': True, 'ITER': 2, 'MASK_MERGE_THRESH': 0.5, 'MIN_DROP_SIZE': 2, 'NMS': 0.3, 'PROPOSAL_MIN_SIZE': 2, 'PROPOSAL_NMS_THRESH': 0.7, 'PROPOSAL_POST_NMS_TOP_N': 2000, 'PROPOSAL_PRE_NMS_TOP_N': 20000, 'RPN_MIN_SIZE': 2, 'RPN_NMS_THRESH': 0.7, 'RPN_POST_NMS_TOP_N': 300, 'RPN_PRE_NMS_TOP_N': 6000, 'USE_GPU_MASK_MERGE': True, 'USE_MASK_MERGE': True, 'test_epoch': 8}, 'TRAIN': {'ASPECT_GROUPING': True, 'BATCH_IMAGES': 1, 'BATCH_ROIS': -1, 'BATCH_ROIS_OHEM': 128, 'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0], 'BBOX_NORMALIZATION_PRECOMPUTED': True, 'BBOX_REGRESSION_THRESH': 0.5, 'BBOX_STDS': [0.2, 0.2, 0.5, 0.5], 'BBOX_WEIGHTS': array([1., 1., 1., 1.]), 'BG_THRESH_HI': 0.5, 'BG_THRESH_LO': 0, 'BINARY_THRESH': 0.4, 'CONVNEW3': True, 'CXX_PROPOSAL': False, 'ENABLE_OHEM': True, 'END2END': True, 'FG_FRACTION': 0.25, 'FG_THRESH': 0.5, 'FLIP': True, 'GAP_SELECT_FROM_ALL': False, 'IGNORE_GAP': False, 'LOSS_WEIGHT': [1.0, 10.0, 1.0], 'RESUME': False, 'RPN_ALLOWED_BORDER': 0, 'RPN_BATCH_SIZE': 256, 'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'RPN_CLOBBER_POSITIVES': False, 'RPN_FG_FRACTION': 0.5, 'RPN_MIN_SIZE': 2, 'RPN_NEGATIVE_OVERLAP': 0.3, 'RPN_NMS_THRESH': 0.7, 'RPN_POSITIVE_OVERLAP': 0.7, 'RPN_POSITIVE_WEIGHT': -1.0, 'RPN_POST_NMS_TOP_N': 300, 'RPN_PRE_NMS_TOP_N': 6000, 'SHUFFLE': True, 'begin_epoch': 0, 'end_epoch': 8, 'lr': 0.0005, 'lr_step': '5.33', 'model_prefix': 'e2e', 'momentum': 0.9, 'warmup': True, 'warmup_lr': 5e-05, 'warmup_step': 250, 'wd': 0.0005}, 'dataset': {'NUM_CLASSES': 81, 'dataset': 'coco', 'dataset_path': './data/coco', 'image_set': 'train2014+valminusminival2014', 'proposal': 'rpn', 'root_path': './data', 'test_image_set': 'test-dev2015'}, 'default': {'frequent': 20, 'kvstore': 'device'}, 'gpus': '0', 'network': {'ANCHOR_RATIOS': [0.5, 1, 2], 'ANCHOR_SCALES': [4, 8, 16, 32], 'FIXED_PARAMS': ['conv1', 'bn_conv1', 'res2', 'bn2', 'gamma', 'beta'], 'FIXED_PARAMS_SHARED': ['conv1', 'bn_conv1', 'res2', 'bn2', 'res3', 'bn3', 'res4', 'bn4', 'gamma', 'beta'], 'IMAGE_STRIDE': 0, 'NUM_ANCHORS': 12, 'PIXEL_MEANS': array([103.06, 115.9 , 123.15]), 'RCNN_FEAT_STRIDE': 16, 'RPN_FEAT_STRIDE': 16, 'pretrained': './model/pretrained_model/resnet_v1_101', 'pretrained_epoch': 0}, 'output_path': '../output/fcis', 'symbol': 'resnet_v1_101_fcis'}

Traceback (most recent call last):
  File "demo.py", line 149, in <module>
    main()
  File "demo.py", line 41, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/lxl/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 817, in get_symbol
    psroipool_cls = mx.contrib.sym.ChannelOperator(name='psroipool_cls', data=psroipool_cls_seg, group=num_classes, op_type='Group_Max')
AttributeError: 'module' object has no attribute 'ChannelOperator'

What can I do? Please help me, thanks very much.