sunilmallya-work closed this issue 6 years ago
Do you get the same result with --gpu 0? GPU numbering starts at 0, so --gpu 1 selects the second GPU.
Your GPU does not have enough free memory to run the demo.
From your error log:
mxnet.base.MXNetError: [12:01:59] src/storage/./pooled_storage_manager.h:84: cudaMalloc failed: out of memory
cudaMalloc failed, so the selected GPU ran out of memory while the model parameters were being copied to it.
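One way to check whether another process is already holding memory on the selected GPU is to ask nvidia-smi for per-device usage before running the demo. The sketch below is only an illustration: the helper names (parse_gpu_memory, pick_gpu, query_nvidia_smi) and the memory threshold are assumptions, not part of the rcnn example. It parses the CSV output of nvidia-smi and returns the zero-based index of the GPU with the most free memory, which is the value to pass to --gpu.

```python
import subprocess

# nvidia-smi query flags: one "index, used, total" row per GPU, values in MiB.
QUERY = ["nvidia-smi",
         "--query-gpu=index,memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def query_nvidia_smi():
    """Run nvidia-smi and return its CSV output as text (needs an NVIDIA driver)."""
    return subprocess.check_output(QUERY).decode()


def parse_gpu_memory(csv_text):
    """Parse rows like '0, 300, 11439' into (index, free_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (int(field) for field in line.split(","))
        gpus.append((index, total - used))
    return gpus


def pick_gpu(csv_text, required_mib):
    """Return the zero-based index of the GPU with the most free memory,
    or None if no GPU has at least required_mib MiB free."""
    candidates = [g for g in parse_gpu_memory(csv_text) if g[1] >= required_mib]
    if not candidates:
        return None
    return max(candidates, key=lambda g: g[1])[0]
```

Calling pick_gpu(query_nvidia_smi(), required_mib=2000) on the machine would return an index to use with --gpu; the 2000 MiB figure is a placeholder, not a measured requirement of the demo.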
I wasn't able to reproduce this error on a p2.xl. I doubt the image is the issue, but would you mind uploading it? I was able to get correct output:
class ---- [[x1, x2, y1, y2, confidence]]
--------- bicycle ---------
[[ 14.01706886 96.79859924 449.25 334.8694458 0.99867541]]
results saved to bike_result.jpeg
This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks!
The code works when run on the CPU.
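Since the CPU path works, one possible workaround is to try the requested GPU first and fall back to another device when loading fails. The helper below is a hypothetical sketch, not code from demo.py: load_with_fallback and its arguments are made-up names, and the loader is passed in so the snippet does not depend on MXNet being installed.

```python
def load_with_fallback(load, contexts, errors=(Exception,)):
    """Call load(ctx) for each context in order and return (ctx, result)
    for the first one that succeeds.

    load     -- callable that builds the predictor on a given context
    contexts -- devices to try, in order of preference
    errors   -- exception types that trigger a fallback to the next context
    """
    last_error = None
    for ctx in contexts:
        try:
            return ctx, load(ctx)
        except errors as exc:
            # Remember why this device failed and move on to the next one.
            last_error = exc
    raise RuntimeError("no context could load the model: %s" % last_error)
```

With MXNet, the call might look like load_with_fallback(lambda ctx: get_net(symbol, args.prefix, args.epoch, ctx), [mx.gpu(1), mx.gpu(0), mx.cpu()], errors=(mx.base.MXNetError,)), assuming get_net has the signature shown in the traceback in this issue.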
For bugs or installation issues, please provide the following information. The more information you provide, the more likely people will be able to help you.
Environment info
Operating System: Ubuntu 14.04, on AWS p2.8xlarge
Compiler: gcc
Package used (Python/R/Scala/Julia): python
MXNet version: 0.9.5
Or if installed from source:
MXNet commit hash (git rev-parse HEAD):
If you are using python package, please provide
Python version and distribution:
If you are using R package, please provide
R sessionInfo():
Error Message:
Please paste the full error message, including stack trace.
ubuntu@ip-172-31-36-165:~/mxnet/example/rcnn$ python demo.py --prefix final --epoch 0 --image bike.jpg --gpu 1
[12:01:58] src/engine/engine.cc:36: MXNet start using engine: NaiveEngine
[12:01:59] /home/ubuntu/mxnet/dmlc-core/include/dmlc/logging.h:300: [12:01:59] src/storage/./pooled_storage_manager.h:84: cudaMalloc failed: out of memory
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7ff446af368c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet7storage23GPUPooledStorageManager5AllocEm+0x1d8) [0x7ff447715948]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet11StorageImpl5AllocEmNS_7ContextE+0x57) [0x7ff4477177d7]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(+0xee6609) [0x7ff447430609]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFvN5mxnet10RunContextENS0_6engine18CallbackOnCompleteEEZNS0_6Engine8PushSyncESt8functionIFvS1_EENS0_7ContextERKSt6vectorIPNS2_3VarESaISC_EESG_NS0_10FnPropertyEiPKcEUlS1_S3_E_E9_M_invokeERKSt9_Any_dataS1S3+0x23) [0x7ff446b608c3]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine11NaiveEngine9PushAsyncESt8functionIFvNS_10RunContextENS0_18CallbackOnCompleteEEENS_7ContextERKSt6vectorIPNS0_3VarESaISA_EESE_NS_10FnPropertyEiPKc+0x8c) [0x7ff44735ca5c]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6Engine8PushSyncESt8functionIFvNS_10RunContextEEENS_7ContextERKSt6vectorIPNS_6engine3VarESaIS9_EESD_NS_10FnPropertyEiPKc+0x124) [0x7ff446b62314]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet10CopyFromToERKNS_7NDArrayEPS0_i+0x62c) [0x7ff44743943c]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(+0xf3f8e4) [0x7ff4474898e4]
[bt] (9) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(MXImperativeInvoke+0x2cd) [0x7ff447330d0d]
Traceback (most recent call last):
File "demo.py", line 142, in <module>
main()
File "demo.py", line 137, in main
predictor = get_net(symbol, args.prefix, args.epoch, ctx)
File "demo.py", line 36, in get_net
arg_params, aux_params = load_param(prefix, epoch, convert=True, ctx=ctx, process=True)
File "/home/ubuntu/mxnet/example/rcnn/rcnn/utils/load_model.py", line 53, in load_param
arg_params = convert_context(arg_params, ctx)
File "/home/ubuntu/mxnet/example/rcnn/rcnn/utils/load_model.py", line 35, in convert_context
new_params[k] = v.as_in_context(ctx)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/ndarray.py", line 871, in as_in_context
return self.copyto(context)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/ndarray.py", line 820, in copyto
return _internal._copyto(self, out=hret)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/_ctypes/ndarray.py", line 164, in generic_ndarray_function
c_array(ctypes.c_char_p, [c_str(val) for val in vals])))
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/base.py", line 78, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [12:01:59] src/storage/./pooled_storage_manager.h:84: cudaMalloc failed: out of memory
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7ff446af368c]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet7storage23GPUPooledStorageManager5AllocEm+0x1d8) [0x7ff447715948]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet11StorageImpl5AllocEmNS_7ContextE+0x57) [0x7ff4477177d7]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(+0xee6609) [0x7ff447430609]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZNSt17_Function_handlerIFvN5mxnet10RunContextENS0_6engine18CallbackOnCompleteEEZNS0_6Engine8PushSyncESt8functionIFvS1_EENS0_7ContextERKSt6vectorIPNS2_3VarESaISC_EESG_NS0_10FnPropertyEiPKcEUlS1_S3_E_E9_M_invokeERKSt9_Any_dataS1S3+0x23) [0x7ff446b608c3]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6engine11NaiveEngine9PushAsyncESt8functionIFvNS_10RunContextENS0_18CallbackOnCompleteEEENS_7ContextERKSt6vectorIPNS0_3VarESaISA_EESE_NS_10FnPropertyEiPKc+0x8c) [0x7ff44735ca5c]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet6Engine8PushSyncESt8functionIFvNS_10RunContextEEENS_7ContextERKSt6vectorIPNS_6engine3VarESaIS9_EESD_NS_10FnPropertyEiPKc+0x124) [0x7ff446b62314]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet10CopyFromToERKNS_7NDArrayEPS0_i+0x62c) [0x7ff44743943c]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(+0xf3f8e4) [0x7ff4474898e4]
[bt] (9) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(MXImperativeInvoke+0x2cd) [0x7ff447330d0d]
[12:01:59] src/engine/naive_engine.cc:35: Engine shutdown
Minimum reproducible example
if you are using your own code, please provide a short script that reproduces the error.
python demo.py --prefix final --epoch 0 --image bike.jpg --gpu 1
Steps to reproduce
or if you are running standard examples, please provide the commands you have run that lead to the error.
What have you tried to solve it?