I would like to use a vgg16_atrous model that I trained with gluon-cv.
I'm on Windows Server 2016 with CUDA 8.0 and the GPU is a Tesla P40 (driver 385.08).
I tried to run this code:
import mxnet as mx
from gluoncv import data, utils, model_zoo
net_name = 'ssd_300_vgg16_atrous_voc'
# Checkpoint saved during my own training run
resume = './ssd_300_vgg16_atrous_voc_0100_0.8975.params'
# Build the network on the GPU, then load the trained weights
net = model_zoo.get_model(net_name, pretrained_base=True, ctx=mx.gpu(0))
net.load_params(resume.strip(), ctx=mx.gpu(0))
Python crashes every time, but when I use the CPU context it works fine.
So, to debug the GPU part, I used this script:
import mxnet as mx
print(mx.gpu(0))
print(mx.nd.array([1,2],ctx=mx.gpu(0)))
The output was:
gpu(0)
Traceback (most recent call last):
  File "test_gpu.py", line 5, in <module>
    print(mx.nd.array([1,2],ctx=mx.gpu(0)))
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\mxnet\ndarray\utils.py", line 146, in array
    return _array(source_array, ctx=ctx, dtype=dtype)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\mxnet\ndarray\ndarray.py", line 2338, in array
    arr = empty(source_array.shape, ctx, dtype)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\mxnet\ndarray\ndarray.py", line 3548, in empty
    return NDArray(handle=_new_alloc_handle(shape, ctx, False, dtype))
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\mxnet\ndarray\ndarray.py", line 139, in _new_alloc_handle
    ctypes.byref(hdl)))
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\mxnet\base.py", line 149, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [17:20:03] c:\jenkins\workspace\mxnet-tag\mxnet\src\storage\./pooled_storage_manager.h:108: cudaMalloc failed: device kernel image is invalid
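A `device kernel image is invalid` failure can show up when the installed MXNet binary was built for a different CUDA toolkit or GPU architecture than the one on the machine. That is only an assumption here; the traceback does not confirm it. As a minimal sketch for checking which CUDA build of MXNet is installed, the code below relies on the pip naming convention `mxnet-cuXY` (for example `mxnet-cu80` for CUDA 8.0). The helper names `cuda_version_from_package` and `installed_mxnet_builds` are hypothetical, and `importlib.metadata` needs Python 3.8 or newer.

```python
import re
from importlib import metadata  # standard library, Python 3.8+


def cuda_version_from_package(name):
    """Map a pip package name such as 'mxnet-cu80' to '8.0'.

    Returns None when the name carries no CUDA suffix (CPU-only build).
    """
    match = re.fullmatch(r"mxnet-cu(\d{2,3})(mkl)?", name.lower())
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version, everything before it the major.
    return f"{digits[:-1]}.{digits[-1]}"


def installed_mxnet_builds():
    """List (package name, version, CUDA version) for installed mxnet* packages."""
    builds = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"] or ""
        if name.lower().startswith("mxnet"):
            builds.append((name, dist.version, cuda_version_from_package(name)))
    return builds


for name, version, cuda in installed_mxnet_builds():
    print(f"{name} {version}: built for CUDA {cuda or 'none (CPU build)'}")
```

If the reported CUDA version differs from the toolkit installed on the machine (CUDA 8.0 in this post), installing the matching `mxnet-cuXY` package would be the first thing to try.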
Searching the forums, I found this suggestion:
Downgrade mxnet to 1.1.0
After downgrading mxnet, I reran the GPU test script and it worked; the output was:
gpu(0)
[1. 2.]
<NDArray 2 @gpu(0)>
But when I went back to my original code, I got this error:
Traceback (most recent call last):
  File "test_video.py", line 97, in <module>
    main()
  File "test_video.py", line 65, in main
    net = model_zoo.get_model(net_name, pretrained_base=True, ctx=mx.gpu(0))
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\model_zoo.py", line 105, in get_model
    net = _models[name](**kwargs)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\ssd.py", line 287, in ssd_300_vgg16_atrous_voc
    pretrained_base=pretrained_base, **kwargs)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\ssd.py", line 258, in get_ssd
    pretrained=pretrained_base, classes=classes, ctx=ctx, **kwargs)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\ssd.py", line 121, in __init__
    self.features = features(pretrained=pretrained, ctx=ctx)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\vgg_atrous.py", line 204, in vgg16_atrous_300
    return get_vgg_atrous_extractor(16, 300, **kwargs)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\vgg_atrous.py", line 193, in get_vgg_atrous_extractor
    net = VGGAtrousExtractor(layers, filters, extras, **kwargs)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\vgg_atrous.py", line 112, in __init__
    super(VGGAtrousExtractor, self).__init__(layers, filters, batch_norm, **kwargs)
  File "C:\Users\Administrateur.WIN-JNTSDGOVCTG\Miniconda3\envs\alan\lib\site-packages\gluoncv\model_zoo\ssd\vgg_atrous.py", line 64, in __init__
    self.init_scale = self.params.get_constant('init_scale', init_scale)
AttributeError: 'ParameterDict' object has no attribute 'get_constant'
I looked this up on the forums, and the proposed solution is to update mxnet ...
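The `AttributeError` shows that the installed gluon-cv calls `ParameterDict.get_constant`, which the downgraded MXNet does not provide. A minimal sketch of a feature check that can be run before building the model is given below. The helper names `missing_attributes` and `check_mxnet_for_gluoncv` are hypothetical, and the list of required attributes contains only the one name seen in the traceback; other incompatibilities may exist.

```python
def missing_attributes(cls, required):
    """Return the names in `required` that `cls` does not provide."""
    return [name for name in required if not hasattr(cls, name)]


def check_mxnet_for_gluoncv():
    """Report which ParameterDict features the installed MXNet lacks.

    Returns None when mxnet (or its ParameterDict class) cannot be imported.
    """
    try:
        from mxnet.gluon import ParameterDict
    except ImportError:
        return None
    # 'get_constant' is the attribute the traceback reports as missing.
    return missing_attributes(ParameterDict, ["get_constant"])


missing = check_mxnet_for_gluoncv()
if missing is None:
    print("mxnet.gluon.ParameterDict could not be imported")
elif missing:
    print("Installed MXNet lacks:", ", ".join(missing))
else:
    print("Installed MXNet provides the checked ParameterDict features")
```

A non-empty result means the MXNet and gluon-cv versions do not match, so either MXNet has to be upgraded or gluon-cv pinned to a release written against the older API.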
What can I do to resolve this issue?
Thanks in advance.