tusen-ai / simpledet

A Simple and Versatile Framework for Object Detection and Instance Recognition
Apache License 2.0

Invalid NDArray file format #270

Open ylc2580 opened 4 years ago

ylc2580 commented 4 years ago

I followed the steps in your instructions in order. When I ran python detection_train.py --config config/faster_r50v1_fpn_1x.py, the program failed with the following error:

load pretrain_model/resnet-v1-50-0000.params
Traceback (most recent call last):
  File "detection_train.py", line 311, in <module>
    train_net(parse_args())
  File "detection_train.py", line 135, in train_net
    arg_params, aux_params = load_checkpoint(pretrain_prefix, pretrain_epoch)
  File "/media/ubuntu_data2/02_dataset/Audio_Classification/\u5b89\u88c5mxnet\u4e34\u65f6\u5efa/simpledet/utils/load_model.py", line 31, in load_checkpoint
    save_dict = mx.nd.load('./pretrain_model/resnet-v1-50-0000.params')
  File "/root/anaconda3/envs/python37/lib/python3.7/site-packages/mxnet/ndarray/utils.py", line 175, in load
    ctypes.byref(names)))
  File "/root/anaconda3/envs/python37/lib/python3.7/site-packages/mxnet/base.py", line 254, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [10:00:45] src/ndarray/ndarray.cc:1851: Check failed: fi->Read(data): Invalid NDArray file format
Stack trace:
  [bt] (0) /root/anaconda3/envs/python37/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x56bffb) [0x7f7011024ffb]
  [bt] (1) /root/anaconda3/envs/python37/lib/python3.7/site-packages/mxnet/libmxnet.so(mxnet::NDArray::Load(dmlc::Stream*, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> >*, std::vector<std::string, std::allocator<std::string> >*)+0x1d6) [0x7f70137d7756]
  [bt] (2) /root/anaconda3/envs/python37/lib/python3.7/site-packages/mxnet/libmxnet.so(MXNDArrayLoad+0x263) [0x7f7013522fc3]
  [bt] (3) /root/anaconda3/envs/python37/lib/python3.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7f704c11eec0]
  [bt] (4) /root/anaconda3/envs/python37/lib/python3.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7f704c11e87d]
  [bt] (5) /root/anaconda3/envs/python37/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce) [0x7f704c33401e]
  [bt] (6) /root/anaconda3/envs/python37/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(+0x12a54) [0x7f704c334a54]
  [bt] (7) python(_PyObject_FastCallKeywords+0x49b) [0x560441c0d19b]
  [bt] (8) python(_PyEval_EvalFrameDefault+0x52e6) [0x560441c724d6]

terminate called without an active exception

One note: my CUDA version is 10.0, which differs from yours. Is the training failure caused by my configuration, or by something else? Thanks.

RogerChern commented 4 years ago

Please check your MXNet version. This happens generally when you try to load a model from a very old version of MXNet.


ylc2580 commented 4 years ago

I followed your installation steps. Below is my MXNet version. Is it correct, or is it too old? Thanks.

>>> import mxnet as mx
>>> mx.__version__
'1.6.0'

RogerChern commented 4 years ago

Which pip wheel did you install? https://1dv.alarge.space/mxnet_cu101-1.6.0b20190820-py2.py3-none-manylinux1_x86_64.whl or https://1dv.alarge.space/mxnet_cu100-1.6.0b20190820-py2.py3-none-manylinux1_x86_64.whl

ylc2580 commented 4 years ago

Yes, I installed this wheel: "https://1dv.alarge.space/mxnet_cu100-1.6.0b20190820-py2.py3-none-manylinux1_x86_64.whl", but I still get the same error. -.-

RogerChern commented 4 years ago

Interesting, could you please try another pretrained model?


ylc2580 commented 4 years ago

(1) Surprisingly, after I deleted all the pretrained models I had downloaded myself and let the code download them itself, it works now.
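Since re-downloading fixed the error, the manually downloaded .params file was probably truncated or corrupted. A minimal sketch for checking this, assuming you still have both copies on disk: compare the size and SHA-256 of the two files. The helper names `file_fingerprint` and `same_file` are made up for illustration and are not part of simpledet or MXNet.

```python
import hashlib


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large .params files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def same_file(path_a, path_b):
    """True when both files have identical size and content hash."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

For example, `same_file("manual/resnet-v1-50-0000.params", "pretrain_model/resnet-v1-50-0000.params")` (both paths hypothetical) returning False, with the manual copy being smaller, would point to an incomplete download.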

RogerChern commented 4 years ago

Faster R-CNN FPN uses less than 5 GB of GPU memory. How about changing the GPU IDs to use the two 1080s and seeing whether the OOM problem persists?
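To find which GPU IDs belong to the 1080s, one option is to parse the output of `nvidia-smi --query-gpu=index,name --format=csv,noheader`. The sketch below is only an illustration: `select_gpu_ids` is a hypothetical helper, and the sample listing is invented rather than taken from the machine in this thread.

```python
def select_gpu_ids(nvidia_smi_csv, name_substring):
    """Return indices of GPUs whose name contains `name_substring`.

    `nvidia_smi_csv` is text in the form produced by
    `nvidia-smi --query-gpu=index,name --format=csv,noheader`,
    i.e. one "index, name" pair per line.
    """
    ids = []
    for line in nvidia_smi_csv.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, name = (field.strip() for field in line.split(",", 1))
        if name_substring.lower() in name.lower():
            ids.append(int(index))
    return ids


# Invented listing for illustration only
sample = """\
0, Tesla K80
1, GeForce GTX 1080
2, GeForce GTX 1080
"""
print(select_gpu_ids(sample, "1080"))  # prints [1, 2]
```

The resulting IDs would then go wherever the training config lists its GPUs.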

On Fri, Dec 13, 2019 at 10:17 AM ylc2580 wrote:

(2) I see that every model you configured needs seven GPUs? My GPUs have about 26 GB in total (one K80 and two 1080s), but training runs out of memory. Why?

dongjuns commented 4 years ago

Refer to https://github.com/TuSimple/simpledet/blob/master/MODEL_ZOO.md

...

ImageNet Pretrained Models

# download them yourself into ~/simpledet/pretrain_model
wget https://1dv.aflat.top/resnet-v1-50-0000.params
wget https://1dv.aflat.top/resnet-v1-101-0000.params
wget https://1dv.aflat.top/resnet-50-0000.params
wget https://1dv.aflat.top/resnet-101-0000.params
wget https://1dv.aflat.top/resnet50_v1b-0000.params
wget https://1dv.aflat.top/resnet101_v1b-0000.params
wget https://1dv.aflat.top/resnet152_v1b-0000.params
wget https://1dv.aflat.top/resnext-101-64x4d-0000.params
wget https://1dv.aflat.top/resnext-101-32x8d-0000.params
wget https://1dv.aflat.top/resnext-152-32x8d-IN5k-0000.params