Closed iFighting closed 7 years ago
Install MXNet
I will update the doc to be clearer about this forked MXNet and eventually remove the fork. Sorry for the confusion right now :)
I have installed MXNet (from the official GitHub) and now I can run the demo and train the model. What I want to say is that your code still has many bugs:
1. In symbol_vgg.py, I just commented out if config.TEST.CXX_PROPOSAL and group = mx.symbol.Proposal, and used group = mx.symbol.Custom() instead. Is there any difference?
2. In rcnn/core/loader.py there is a bug: you should add if len(row_perm) > 0: before the line inds[:-extra] = np.reshape(inds[row_perm, :], (-1,)), otherwise that line raises an error.
3. In demo.py, you should add a check such as if len(keep) == 0:, etc.
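import numpy as np
a = np.zeros((0, 2))  # empty array with shape (0, 2)
b = []                # empty index list
c = a[b]              # "blank" indexing: index with the empty list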
Thanks very much. Any ideas to correct them?
Is my numpy too old? I see. All the bugs come from blank indexing, which the old numpy does not support! Thanks! Can you speak Chinese?
I think numpy 1.8.2 works with blank indexing.
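The guards suggested above for loader.py and demo.py can be sketched as follows. This is a minimal illustration, not the code from mx-rcnn: the helper name shuffle_rows_flat is hypothetical, and only the variable names inds and row_perm come from the thread. It returns an empty result early instead of relying on how a given numpy version handles empty index arrays.

```python
import numpy as np


def shuffle_rows_flat(inds, row_perm):
    """Hypothetical helper: reorder the rows of a 2-D array, then flatten.

    Returns an empty 1-D array when row_perm is empty, so the fancy
    indexing below is never reached with a blank index.
    """
    if len(row_perm) == 0:
        return np.zeros((0,), dtype=inds.dtype)
    return np.reshape(inds[row_perm, :], (-1,))


inds = np.arange(6).reshape(3, 2)
reordered = shuffle_rows_flat(inds, np.array([2, 0, 1]))
empty = shuffle_rows_flat(inds, np.array([], dtype=int))
print(reordered, empty.shape)
```

The same early-return pattern would apply to the keep check in demo.py: skip the class when no detections survive rather than indexing with an empty list.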
When training begins, I get an error:
It happens while the model is training, before it completes 1 epoch. Do you know why?
I do not know. What experiment are you doing now?
I have solved the problem. I just set the batch size to a smaller value, and now it works well.
Another problem: when I make the batch size smaller in rcnn/config.py and train_end_to_end.py, I find that GPU memory usage does not change (it always seems to be around 3800MB). I have set the batch sizes very small (config.TRAIN.BATCH_ROIS=16, config.TRAIN.RPN_BATCH_SIZE=32). Do you know why?
No idea but we may have traded VGG memory usage for efficiency. You may want to try ResNet-101 which uses almost the same amount of memory.
Well, I installed the forked MXNet but failed to load the module mxnet.symbol.Proposal when running test.py. How do I load this op? (PS: I'm a beginner with MXNet xD)
git clone https://github.com/precedenceguo/mxnet.git --recursive -b simple
Refer to http://mxnet.io/get_started/ubuntu_setup.html for the following.
cd mxnet
cp make/config.mk ./
echo "USE_CUDA=1" >>config.mk
echo "USE_CUDA_PATH=/usr/local/cuda" >>config.mk
echo "USE_CUDNN=1" >>config.mk
make -j$(nproc)
cd python
python setup.py install --user
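After building and installing, one quick way to confirm that the Python package you are importing actually exposes the operator is an attribute check. The sketch below is only an illustration: has_operator is a hypothetical helper, and it simply reports False when the module cannot be imported at all.

```python
import importlib


def has_operator(module_name, op_name):
    """Hypothetical helper: True if module_name imports and has op_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, op_name)


# Prints False if mxnet is missing or the installed build lacks the operator.
print(has_operator("mxnet.symbol", "Proposal"))
```

If this prints False even after installing the fork, Python may be picking up a different mxnet installation than the one just built.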
This is the log:
[23:58:53] /home/img/jiangyi/test/mxnet/dmlc-core/include/dmlc/logging.h:235: [23:58:53] src/operator/custom.cc:79: Check failed: opinfo->forward(ptrs.size(), ptrs.data(), tags.data(), reqs.data(), ctx.is_train, opinfo->p_forward)
[23:58:53] /home/img/jiangyi/test/mxnet/dmlc-core/include/dmlc/logging.h:235: [23:58:53] src/operator/custom.cc:79: Check failed: opinfo->forward(ptrs.size(), ptrs.data(), tags.data(), reqs.data(), ctx.is_train, opinfo->p_forward)
[23:58:53] /home/img/jiangyi/test/mxnet/dmlc-core/include/dmlc/logging.h:235: [23:58:53] src/engine/./threaded_engine.h:306: [23:58:53] src/operator/custom.cc:79: Check failed: opinfo->forward(ptrs.size(), ptrs.data(), tags.data(), reqs.data(), ctx.is_train, opinfo->p_forward)
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
[23:58:53] /home/img/jiangyi/test/mxnet/dmlc-core/include/dmlc/logging.h:235: what(): [23:58:53] src/engine/./threaded_engine.h:306: [23:58:53] src/operator/custom.cc:79: Check failed: opinfo->forward(ptrs.size(), ptrs.data(), tags.data(), reqs.data(), ctx.is_train, opinfo->p_forward)
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
[23:58:53] src/engine/./threaded_engine.h:306: [23:58:53] src/operator/custom.cc:79: Check failed: opinfo->forward(ptrs.size(), ptrs.data(), tags.data(), reqs.data(), ctx.is_train, opinfo->p_forward)
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called recursively
Run export MXNET_ENGINE_TYPE=NaiveEngine, then rerun to see what caused the error.
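If it is more convenient to do this from Python than from the shell, the variable can be set at the top of the script. This is a sketch under one assumption: that the engine type is read when the library is loaded, so the assignment has to come before mxnet is imported. The import is left commented out so the snippet runs without MXNet installed.

```python
import os

# Force synchronous execution so the Python traceback points at the failing op.
# Assumption: this must happen before mxnet is imported for it to take effect.
os.environ["MXNET_ENGINE_TYPE"] = "NaiveEngine"

# import mxnet as mx  # import only after the variable is set

engine_type = os.environ.get("MXNET_ENGINE_TYPE")
print(engine_type)
```

As the log message itself says, remember to unset the variable after debugging, since the synchronous engine is slower.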
When I train on my data, I get this error:
It is OK in py-faster-rcnn. It seems that ex_heights is 0. Should I modify bbox_transform.py?
Can this be repeated?
Yes, I have met this error in mx-rcnn several times, while in py-faster-rcnn I have trained several times without it. I found that bbox_transform.py in mx-rcnn is not the same as bbox_transform.py in py-faster-rcnn, so I think there may be a bug?
So after this change did you encounter this error any more?
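For readers hitting the zero ex_heights case discussed above, here is a minimal sketch of the usual Faster R-CNN box regression targets with a guard against zero-sized boxes. It is not the code from mx-rcnn or py-faster-rcnn: the function name bbox_transform_safe and the eps parameter are hypothetical, and the +1.0 pixel convention for widths and heights is an assumption.

```python
import numpy as np


def bbox_transform_safe(ex_rois, gt_rois, eps=1e-14):
    """Regression targets (dx, dy, dw, dh) from ex_rois to gt_rois.

    Boxes are rows of [x1, y1, x2, y2]. Widths and heights of ex_rois are
    clipped to at least eps so that degenerate boxes cannot cause a
    division by zero.
    """
    ex_w = np.maximum(ex_rois[:, 2] - ex_rois[:, 0] + 1.0, eps)
    ex_h = np.maximum(ex_rois[:, 3] - ex_rois[:, 1] + 1.0, eps)
    ex_cx = ex_rois[:, 0] + 0.5 * ex_w
    ex_cy = ex_rois[:, 1] + 0.5 * ex_h

    gt_w = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_h = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    gt_cx = gt_rois[:, 0] + 0.5 * gt_w
    gt_cy = gt_rois[:, 1] + 0.5 * gt_h

    dx = (gt_cx - ex_cx) / ex_w
    dy = (gt_cy - ex_cy) / ex_h
    dw = np.log(gt_w / ex_w)
    dh = np.log(gt_h / ex_h)
    return np.vstack((dx, dy, dw, dh)).transpose()


gt = np.array([[0.0, 0.0, 9.0, 9.0]])
same = bbox_transform_safe(gt, gt)
degenerate = bbox_transform_safe(np.array([[5.0, 5.0, 4.0, 4.0]]), gt)
print(same, np.isfinite(degenerate).all())
```

Clipping only hides the symptom. If zero-sized boxes appear regularly, it is probably worth checking the annotations or the proposal filtering that produced them.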
I trained on my own data with ResNet-101 and encountered this error:
PS: I have not updated mx-rcnn since 18 Jan 2017.
The error is fixed. Some arguments in config.py were not set correctly.
@breeze5428 What did you change in config.py??
I have got the same error too. Please help.
I just ran the demo and it shows this error:
Traceback (most recent call last):
File "demo.py", line 139, in
main()
File "demo.py", line 117, in main
symbol = get_vgg_test()
File "/home/mx-rcnn/rcnn/symbol/symbol_vgg.py", line 276, in get_vgg_test
rois = mx.symbol.Proposal(
AttributeError: 'module' object has no attribute 'Proposal'