oneOfThePeople opened this issue 7 years ago (status: Open)
Hi, have you done this step from the instructions?
Copy operators in ./fcis/operator_cxx to $(YOUR_MXNET_FOLDER)/src/operator/contrib and recompile MXNet.
Yes, but then I understood that I need to copy the files, not the directory, with something like this:
cp /fcis/operator_cxx/* $(YOUR_MXNET_FOLDER)/src/operator/contrib
Now I get a message that seems related to this when I run training:
src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied
Training works, so is this a problem? Thank you.
Hi @phexic, have you updated your MXNet? mxnet_op.h
Before I copied the files from fcis, MXNet ran successfully, but MXNet does not run properly after I recompile it with the new files copied from fcis.
@phexic I followed the method of @oneOfThePeople and it worked smoothly.
@wangg12 Thanks for your help. However, I ran into a problem when recompiling MXNet and I am trying to find the cause.
@phexic Have you located your compile error? Did you run make clean before recompiling?
@phexic Perhaps you need to clone the original MXNet recursively, so that its submodules are fetched too:
git clone mxnet --recursive
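A quick way to see whether a checkout was cloned without --recursive is to look for submodule directories that are missing or empty. The following is only a sketch under that assumption: the helper name missing_submodules is hypothetical, and it simply reads the path entries from the .gitmodules file at the root of the checkout.

```python
import re
from pathlib import Path


def missing_submodules(repo_root):
    """Return submodule paths from .gitmodules whose directories are missing or empty.

    Hypothetical helper for illustration; it does not call git at all.
    """
    root = Path(repo_root)
    gitmodules = root / ".gitmodules"
    if not gitmodules.is_file():
        # No .gitmodules means the checkout declares no submodules.
        return []
    # Each submodule section contains a line such as "path = some/dir".
    paths = re.findall(r"^[ \t]*path[ \t]*=[ \t]*(\S+)[ \t]*$",
                       gitmodules.read_text(), flags=re.MULTILINE)
    missing = []
    for rel in paths:
        sub = root / rel
        if not sub.is_dir() or not any(sub.iterdir()):
            missing.append(rel)
    return missing
```

Calling missing_submodules with the path of your MXNet checkout and getting a non-empty list back would suggest that the submodules were not fetched, in which case a fresh recursive clone is the simplest fix.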
@wangg12 I really appreciate your suggestions and I will try git clone mxnet --recursive again.
@phexic Do you have a more detailed error message?
Hi. I have the same error. I did copy those files (not the directory) to $(YOUR_MXNET_FOLDER)/src/operator/contrib, but the error still appears.
Do I need to set some special options when I recompile? I ran this after copying the files:
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
And this is what I get when I run python ./fcis/demo.py:
Traceback (most recent call last):
  File "./fcis/demo.py", line 147, in <module>
    main()
  File "./fcis/demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/yelantf/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: 'module' object has no attribute 'PSROIPooling'
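Before running the demo, it can help to check directly whether the Python package that actually gets imported exposes the copied operator. This is a minimal sketch: missing_operators and check_fcis_operators are hypothetical helper names, and the only MXNet names used are mx.contrib.sym and PSROIPooling, both taken from the traceback above.

```python
def missing_operators(namespace, names):
    """Return the names from `names` that are not attributes of `namespace`."""
    return [name for name in names if not hasattr(namespace, name)]


def check_fcis_operators():
    """Check the installed MXNet; returns None if MXNet cannot be imported."""
    try:
        import mxnet as mx
    except ImportError:
        return None
    # Shows which installation is in use (source build vs. another copy).
    print("mxnet imported from:", mx.__file__)
    return missing_operators(mx.contrib.sym, ["PSROIPooling"])
```

If check_fcis_operators() returns ['PSROIPooling'], the imported package was presumably built without the copied operator files, or a different MXNet installation is shadowing the recompiled one. The printed path helps tell those two cases apart.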
@wangg12 The error occurs while reading the network parameters, and there is no more detailed information.
I managed to resolve OP's problem by first copying over the operators, then recompiling mxnet as well as the python bindings.
For the "convolution is not supported by cudnn" message, it may help to make sure you have an appropriate version of cudnn installed. If not, reinstall it and recompile MXNet.
@realwecan I'm using CUDA 7.5 and cudnn 5.0, but I cannot use cudnn for convolution. I don't know which version of cudnn I should use.
Hi, I followed the method of @liyi14, but it still didn't work. I copied the files to $(YOUR_MXNET_FOLDER)/src/operator/contrib and recompiled MXNet successfully. I can also find the generated .so and .d files in "$(YOUR_MXNET_FOLDER)/build/src/operator/", but it still shows this error:
  File "./fcis/demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/yelantf/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: 'module' object has no attribute 'PSROIPooling'
Could anyone help me ?
Thanks.
@oneOfThePeople @dpengwen @phexic @yelantingfeng @wangg12 Hi, I followed the MXNet installation guide and built MXNet from source. First I got this error: module 'mxnet' has no attribute 'mx.__file__'
Then I ran pip install mxnet and that error disappeared, but I got the same error as you:
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/carnd/Semantic_segmentation/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: module 'mxnet.contrib.symbol' has no attribute 'PSROIPooling'
I've also done this manually: cp /fcis/operator_cxx/* $(YOUR_MXNET_FOLDER)/src/operator/contrib
I'm not sure how to recompile MXNet. What I did was run make clean under python3.5/site-packages/mxnet and then run this again: make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
But it still doesn't work.
@realwecan Hi, I ran make clean, compiled MXNet again with make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1, and set up the Python binding again with sudo python setup.py install, but it still tells me: module 'mxnet.contrib.symbol' has no attribute 'PSROIPooling'
Do you know what's wrong?
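For reference, the rebuild sequence discussed in this thread can be written out with an explicit working directory for each command, since running make clean in the wrong place (for example inside site-packages) has no effect on the source build. This is a sketch only: rebuild_steps and run_rebuild are hypothetical names, the make flags are copied from the commands quoted above, and it assumes the Python binding lives in the python subdirectory of the MXNet source tree. The sudo prefix is left out and may be needed on your system.

```python
import os
import subprocess
from pathlib import Path


def rebuild_steps(mxnet_root, jobs=None):
    """Return (command, working_directory) pairs for a clean rebuild."""
    root = Path(mxnet_root)
    jobs = jobs or os.cpu_count() or 1
    # Flags copied from the make command quoted in this thread.
    make_flags = ["USE_OPENCV=1", "USE_BLAS=openblas", "USE_CUDA=1",
                  "USE_CUDA_PATH=/usr/local/cuda", "USE_CUDNN=1"]
    return [
        (["make", "clean"], root),
        (["make", "-j", str(jobs)] + make_flags, root),
        # Assumption: the Python binding is installed from <root>/python.
        (["python", "setup.py", "install"], root / "python"),
    ]


def run_rebuild(mxnet_root, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cmd, cwd in rebuild_steps(mxnet_root):
        print("[{}] {}".format(cwd, " ".join(cmd)))
        if not dry_run:
            subprocess.run(cmd, cwd=str(cwd), check=True)
```

With the default dry_run=True the function only prints what it would run, which makes it easy to confirm the directories before starting a long build.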
The problem above has been solved, but there is a new error:
src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
@lc8631058 Hi, how did you solve the problem? I tried all the suggested solutions here, but I still get this error: AttributeError: 'module' object has no attribute 'PSROIPooling'
Thanks in advance.
@Michiemi Hi, I just followed this reply:
Make sure you have compiled MXNet yourself (not installed it from pip), and that you copied these files before compiling MXNet. For example:
1. Clone the FCIS and MXNet repos as described in the README.
2. Run bash init.sh in the FCIS directory.
3. Copy the operators: cp ${YOUR_FCIS_ROOT}/fcis/operator_cxx/* ${YOUR_MXNET_ROOT}/src/operator/contrib/
4. Compile MXNet as described in its instructions.
I also followed the MXNet guide, which explains how to build from scratch and how to set up the Python binding.
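The copy step is where several people in this thread went wrong (copying the directory instead of its files), so here is a small sketch of it. The function name copy_operators is hypothetical, and the two directory layouts are the ones given in the cp command above. It copies only the regular files, never the operator_cxx directory itself.

```python
import shutil
from pathlib import Path


def copy_operators(fcis_root, mxnet_root):
    """Copy the files (not the directory) from operator_cxx into contrib."""
    src = Path(fcis_root) / "fcis" / "operator_cxx"
    dst = Path(mxnet_root) / "src" / "operator" / "contrib"
    if not dst.is_dir():
        raise FileNotFoundError("not an MXNet source tree? missing " + str(dst))
    copied = []
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            shutil.copy2(str(entry), str(dst / entry.name))
            copied.append(entry.name)
    return copied
```

The returned list of file names can be compared against the contents of the contrib directory to confirm that the operators are in place before compiling.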
If you have time, please help me with this question. Thank you!
Traceback (most recent call last):
File "demo.py", line 151, in
Hi, have you solved the problem?
Traceback (most recent call last):
  File "demo.py", line 151, in <module>
    main()
  File "demo.py", line 82, in main
    arg_params=arg_params, aux_params=aux_params)
  File "/home/xbw/FCIS/fcis/core/tester.py", line 30, in __init__
    self._mod.bind(provide_data, provide_label, for_training=False)
  File "/home/xbw/FCIS/fcis/core/module.py", line 840, in bind
    for_training, inputs_need_grad, force_rebind=False, shared_module=None)
  File "/home/xbw/FCIS/fcis/core/module.py", line 397, in bind
    state_names=self._state_names)
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 178, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 278, in bind_exec
    shared_group))
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 592, in _bind_ith_exec
    context, self.logger)
  File "/home/xbw/FCIS/fcis/core/DataParallelExecutorGroup.py", line 570, in _get_or_reshape
    arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
  File "/home/xbw/mxnet/python/mxnet/ndarray.py", line 1047, in zeros
    return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype, **kwargs)
  File "<string>", line 15, in _zeros
  File "/home/xbw/mxnet/python/mxnet/_ctypes/ndarray.py", line 72, in _imperative_invoke
    c_array(ctypes.c_char_p, [c_str(str(val)) for val in vals])))
  File "/home/xbw/mxnet/python/mxnet/base.py", line 85, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [16:12:45] src/c_api/c_api_ndarray.cc:390: Operator _zeros is not implemented for GPU.
@oneOfThePeople @liyi14 @phexic @wangg12 @yelantingfeng
Excuse me, when I run demo.py there is an error:
Traceback (most recent call last):
File "demo.py", line 29, in
I think it needs gpu_nms. How can I deal with this error? Thanks a lot for any suggestions.
@Michiemi Hi, just a suggestion.
After recompiling mxnet, you should also rebuild the python binding. But before you rebuild it, you'd better delete the previously built mxnet .egg file in the python site-packages (or dist-packages). Then you just rebuild the python binding: sudo python setup.py install
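To find leftovers from earlier installs, one option is to list everything named mxnet* in the package directories before rebuilding the binding. This is a sketch under assumptions: find_mxnet_installs is a hypothetical helper, it only lists entries and deletes nothing, and the default directories come from Python's site module, which may not match a custom or virtualenv setup.

```python
import glob
import os
import site


def find_mxnet_installs(package_dirs=None):
    """List entries named mxnet* (packages, .egg files, links) in package dirs."""
    if package_dirs is None:
        # Default: the interpreter's global and per-user package directories.
        package_dirs = list(site.getsitepackages()) + [site.getusersitepackages()]
    found = []
    for directory in package_dirs:
        found.extend(sorted(glob.glob(os.path.join(directory, "mxnet*"))))
    return found
```

Any .egg entry or pip-installed mxnet directory in the result is a candidate for removal before running setup.py install again, so that the freshly compiled library is the one Python picks up.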
@lc8631058 Hello, have you solved this error?
src/operator/convolution.cu:119: This convolution is not supported by cudnn, MXNET convolution is applied.
I have the same problem. Although this message is printed, the demo works and outputs the results successfully. I would like to know the reason and how to fix it.
@yangyu12 It is because dilated convolution is not supported by your current cudnn, so MXNet uses its own implementation.
You can ignore this message, or you can silence the warning by commenting out the line at src/operator/convolution.cu:119 and recompiling MXNet.
@wangg12 Thanks for the help.
I am running demo.py and have a problem:
Traceback (most recent call last):
File "demo.py", line 151, in
What should I do? Please help me, thank you very much!
I am running demo.py and have a problem:
{'BINARY_THRESH': 0.4,
'CLASS_AGNOSTIC': True,
'MASK_SIZE': 21,
'MXNET_VERSION': 'mxnet',
'SCALES': [(600, 1000)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': False,
'HAS_RPN': True,
'ITER': 2,
'MASK_MERGE_THRESH': 0.5,
'MIN_DROP_SIZE': 2,
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 2,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 2,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'USE_GPU_MASK_MERGE': True,
'USE_MASK_MERGE': True,
'test_epoch': 8},
'TRAIN': {'ASPECT_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': -1,
'BATCH_ROIS_OHEM': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': True,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [0.2, 0.2, 0.5, 0.5],
'BBOX_WEIGHTS': array([1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0,
'BINARY_THRESH': 0.4,
'CONVNEW3': True,
'CXX_PROPOSAL': False,
'ENABLE_OHEM': True,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FLIP': True,
'GAP_SELECT_FROM_ALL': False,
'IGNORE_GAP': False,
'LOSS_WEIGHT': [1.0, 10.0, 1.0],
'RESUME': False,
'RPN_ALLOWED_BORDER': 0,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 2,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'SHUFFLE': True,
'begin_epoch': 0,
'end_epoch': 8,
'lr': 0.0005,
'lr_step': '5.33',
'model_prefix': 'e2e',
'momentum': 0.9,
'warmup': True,
'warmup_lr': 5e-05,
'warmup_step': 250,
'wd': 0.0005},
'dataset': {'NUM_CLASSES': 81,
'dataset': 'coco',
'dataset_path': './data/coco',
'image_set': 'train2014+valminusminival2014',
'proposal': 'rpn',
'root_path': './data',
'test_image_set': 'test-dev2015'},
'default': {'frequent': 20, 'kvstore': 'device'},
'gpus': '0',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [4, 8, 16, 32],
'FIXED_PARAMS': ['conv1',
'bn_conv1',
'res2',
'bn2',
'gamma',
'beta'],
'FIXED_PARAMS_SHARED': ['conv1',
'bn_conv1',
'res2',
'bn2',
'res3',
'bn3',
'res4',
'bn4',
'gamma',
'beta'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 12,
'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'pretrained': './model/pretrained_model/resnet_v1_101',
'pretrained_epoch': 0},
'output_path': '../output/fcis',
'symbol': 'resnet_v1_101_fcis'}
Traceback (most recent call last):
File "demo.py", line 149, in
What can I do? Please help me, thanks very much.
Hi, I ran
python ./fcis/demo.py
and got this error:
Traceback (most recent call last):
  File "./fcis/demo.py", line 147, in <module>
    main()
  File "./fcis/demo.py", line 43, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/home/boston_lea/AutoMap/FCIS/fcis/symbols/resnet_v1_101_fcis.py", line 799, in get_symbol
    psroipool_cls_seg = mx.contrib.sym.PSROIPooling(name='psroipool_cls_seg', data=fcis_cls_seg, rois=rois,
AttributeError: 'module' object has no attribute 'PSROIPooling'
Any ideas?