zhreshold / mxnet-ssd

MXNet port of SSD: Single Shot MultiBox Object Detector. Reimplementation of https://github.com/weiliu89/caffe/tree/ssd
MIT License

I train ssd on a dataset which only has one class. When I run demo.py, there is an error. #33

Closed oyxhust closed 6 years ago

oyxhust commented 7 years ago

I want to use SSD to detect text in images, so I trained it on the ICDAR dataset. Training succeeded, and the saved params and json files are in the 'Model' folder. To check the performance of the model I ran demo.py, but it fails with this error:

[15:02:26] /home/ubuntu/mxnet/dmlc-core/include/dmlc/logging.h:235: [15:02:26] src/ndarray/ndarray.cc:231: Check failed: from.shape() == to->shape() operands shape mismatch
Traceback (most recent call last):
  File "demo.py", line 95, in <module>
    ctx, args.nms_thresh, args.force_nms)
  File "demo.py", line 41, in get_detector
    data_shape, mean_pixels, ctx=ctx)
  File "/home/ubuntu/mxnet/example/ssd_LPR/detect/detector.py", line 39, in __init__
    self.mod.set_params(args, auxs)
  File "/home/ubuntu/mxnet/python/mxnet/module/base_module.py", line 503, in set_params
    allow_missing=allow_missing, force_init=force_init)
  File "/home/ubuntu/mxnet/python/mxnet/module/module.py", line 198, in init_params
    _impl(name, arr, arg_params)
  File "/home/ubuntu/mxnet/python/mxnet/module/module.py", line 188, in _impl
    cache_arr.copyto(arr)
  File "/home/ubuntu/mxnet/python/mxnet/ndarray.py", line 533, in copyto
    return _internal._copyto(self, out=other)
  File "/home/ubuntu/mxnet/python/mxnet/ndarray.py", line 1225, in unary_ndarray_function
    c_array(ctypes.c_char_p, [c_str(str(i)) for i in kwargs.values()])))
  File "/home/ubuntu/mxnet/python/mxnet/base.py", line 77, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:02:26] src/ndarray/ndarray.cc:231: Check failed: from.shape() == to->shape() operands shape mismatch

I have changed the class in demo.py like this: CLASSES = ('text')

oyxhust commented 7 years ago

I found that I need to use two classes, like 'text' and 'background', both in training and in demo.py. That way I get no error. But I don't understand why a single class does not work.

zhreshold commented 7 years ago

Have you modified the class name in dataset class?

oyxhust commented 7 years ago

Yes, I have written a new data-loading file in the "dataset" folder, modeled on "imdb.py", and I have modified "self.classes". I get no error when I set two classes, like 'text' and 'background', both in the dataset class and in "demo.py". However, if I use only one class, I get this error.

zhreshold commented 7 years ago

Before

self.mod.set_params(args, auxs)

Can you print the shapes of params in args?
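A minimal sketch of how the shapes could be printed. It assumes `args` is a dict mapping parameter names to array-like objects with a `.shape` attribute (as MXNet NDArrays have). The helper `format_param_shapes` and the stand-in arrays are hypothetical, used only so the snippet runs without MXNet.

```python
from types import SimpleNamespace


def format_param_shapes(params):
    """Return one 'name shape' string per parameter, sorted by name."""
    return ["%s %s" % (name, tuple(arr.shape))
            for name, arr in sorted(params.items())]


# Stand-in for the real `args` dict (assumption: name -> object with .shape).
args = {
    "relu7_cls_pred_conv_bias": SimpleNamespace(shape=(126,)),
    "conv1_1_weight": SimpleNamespace(shape=(64, 3, 3, 3)),
}

for line in format_param_shapes(args):
    print(line)
```

In detector.py the same loop would go right before the `set_params` call, with the real `args` in place of the stand-in.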

BunnyShan commented 7 years ago

I'm hitting the same error. Printing args gives the following:

conv2_1_bias (128L,)
relu7_cls_pred_conv_bias (126L,)
relu8_2_cls_pred_conv_weight (126L, 512L, 3L, 3L)
pool10_loc_pred_conv_bias (24L,)
conv4_2_bias (512L,)
relu4_3_cls_pred_conv_bias (63L,)
pool10_cls_pred_conv_bias (126L,)
relu8_2_cls_pred_conv_bias (126L,)
pool10_cls_pred_conv_weight (126L, 256L, 3L, 3L)
conv4_3_bias (512L,)
conv9_2_bias (256L,)
conv5_2_weight (512L, 512L, 3L, 3L)
relu10_2_cls_pred_conv_bias (126L,)
conv9_1_weight (128L, 512L, 1L, 1L)
relu7_loc_pred_conv_weight (24L, 1024L, 3L, 3L)
conv5_3_bias (512L,)
conv5_1_weight (512L, 512L, 3L, 3L)
conv7_weight (1024L, 1024L, 1L, 1L)
conv4_1_bias (512L,)
relu7_loc_pred_conv_bias (24L,)
relu4_3_cls_pred_conv_weight (63L, 512L, 3L, 3L)
relu4_3_loc_pred_conv_bias (12L,)
conv6_weight (1024L, 512L, 3L, 3L)
relu10_2_loc_pred_conv_weight (24L, 256L, 3L, 3L)
relu8_2_loc_pred_conv_weight (24L, 512L, 3L, 3L)
conv4_3_weight (512L, 512L, 3L, 3L)
relu10_2_cls_pred_conv_weight (126L, 256L, 3L, 3L)
relu7_cls_pred_conv_weight (126L, 1024L, 3L, 3L)
conv5_1_bias (512L,)
conv3_1_weight (256L, 128L, 3L, 3L)
conv3_2_weight (256L, 256L, 3L, 3L)
conv2_2_bias (128L,)
conv1_2_bias (64L,)
relu8_2_loc_pred_conv_bias (24L,)
relu4_3_scale (1L, 512L, 1L, 1L)
relu10_2_loc_pred_conv_bias (24L,)
pool10_loc_pred_conv_weight (24L, 256L, 3L, 3L)
conv8_2_weight (512L, 256L, 3L, 3L)
conv6_bias (1024L,)
conv7_bias (1024L,)
conv3_2_bias (256L,)
relu9_2_cls_pred_conv_bias (126L,)
conv9_1_bias (128L,)
conv3_1_bias (256L,)
conv9_2_weight (256L, 128L, 3L, 3L)
conv10_2_weight (256L, 128L, 3L, 3L)
conv10_1_weight (128L, 256L, 1L, 1L)
conv1_1_weight (64L, 3L, 3L, 3L)
conv8_1_weight (256L, 1024L, 1L, 1L)
relu4_3_loc_pred_conv_weight (12L, 512L, 3L, 3L)
conv3_3_bias (256L,)
conv8_2_bias (512L,)
conv5_3_weight (512L, 512L, 3L, 3L)
conv2_2_weight (128L, 128L, 3L, 3L)
conv1_2_weight (64L, 64L, 3L, 3L)
conv10_2_bias (256L,)
relu9_2_loc_pred_conv_bias (24L,)
conv10_1_bias (128L,)
conv2_1_weight (128L, 64L, 3L, 3L)
conv3_3_weight (256L, 256L, 3L, 3L)
relu9_2_cls_pred_conv_weight (126L, 256L, 3L, 3L)
conv8_1_bias (256L,)
relu9_2_loc_pred_conv_weight (24L, 256L, 3L, 3L)
conv1_1_bias (64L,)
conv5_2_bias (512L,)
conv4_2_weight (512L, 512L, 3L, 3L)
conv4_1_weight (512L, 256L, 3L, 3L)
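The shapes above can be used to estimate how many classes the saved parameters were trained for. The sketch below assumes the usual SSD layout, which this thread does not spell out: each loc_pred layer has 4 channels per anchor, and each cls_pred layer has (number of classes + 1 background) channels per anchor. The function name `classes_in_checkpoint` is hypothetical.

```python
def classes_in_checkpoint(cls_channels, loc_channels):
    """Estimate the foreground class count from one pair of prediction layers.

    Assumes loc_channels = anchors * 4 and
    cls_channels = anchors * (num_classes + 1), the extra slot being background.
    """
    anchors, remainder = divmod(loc_channels, 4)
    if remainder or anchors == 0:
        raise ValueError("loc_channels is not a positive multiple of 4")
    per_anchor, remainder = divmod(cls_channels, anchors)
    if remainder:
        raise ValueError("cls_channels is not a multiple of the anchor count")
    return per_anchor - 1


# relu4_3: cls bias (63,), loc bias (12,)
print(classes_in_checkpoint(63, 12))
# relu7: cls bias (126,), loc bias (24,)
print(classes_in_checkpoint(126, 24))
```

Under that assumption both layer pairs give 20 foreground classes, so a network built for a different class count would not match these parameters.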

BunnyShan commented 7 years ago

Error solved. CLASSES must be written as CLASSES = ('text',), not CLASSES = ('text'). Without the trailing comma, ('text') is just the string 'text', so len(CLASSES) is 4, not 1.
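A short illustration of the difference in plain Python. The variable names are only for this example.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
not_a_tuple = ('text')
one_tuple = ('text',)

print(type(not_a_tuple).__name__, len(not_a_tuple))  # str 4
print(type(one_tuple).__name__, len(one_tuple))      # tuple 1

# Iterating over the string yields its characters, not one class name.
print(list(not_a_tuple))  # ['t', 'e', 'x', 't']
print(list(one_tuple))    # ['text']
```

Writing the class list as a Python list, `['text']`, avoids the trailing-comma pitfall altogether, provided the surrounding code accepts a list.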

KeyKy commented 7 years ago

@oyxhust Did your SSD converge? Could you share your dataset.py? I am also training a one-class SSD on FDDB, but I get Train-ObjectAcc=0.026159.

oyxhust commented 7 years ago

@BunnyShan Thanks a lot! @KeyKy Yes, it converges. My 'dataset.py' just follows the examples and is not very complex. I doubt "dataset.py" is the cause, since it only loads the input. I think you should check for mistakes such as swapping the x and y coordinates.
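One possible sanity check for the coordinate mistake mentioned above, as a sketch only. It assumes boxes are stored as (xmin, ymin, xmax, ymax) in pixels, which may not match your annotation format. The function name `find_bad_boxes` is hypothetical.

```python
def find_bad_boxes(boxes, width, height):
    """Return indices of boxes that are degenerate or fall outside the image."""
    bad = []
    for i, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        if not inside:
            bad.append(i)
    return bad


boxes = [
    (10, 20, 600, 400),  # plausible box in a 640x480 image
    (20, 10, 400, 600),  # same box with x and y swapped: ymax exceeds the height
]
print(find_bad_boxes(boxes, width=640, height=480))
```

A swap is only caught when it pushes a box outside the image, so drawing a few boxes on their images is still worth doing.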

xinghedyc commented 7 years ago

@oyxhust Hi, I'm also using SSD to detect text on the ICDAR2013 dataset. I converted the dataset to Pascal VOC format, so I only needed to change the data path of the imdb. But I find it hard to get it to converge. This is the result after 100 epochs with learning rate 0.0005:

[image: default]

Training for 300 epochs does not seem to make much difference. Could you share your hyper-parameters or offer some help? BTW, I'm using the latest MXNet version. Thank you very much!

oyxhust commented 7 years ago

@xinghedyc

1. I don't train my model directly on ICDAR, because it is not big enough. I suggest training your model on SynthText and then fine-tuning it on ICDAR. I believe you can get good performance this way.

2. As for the hyper-parameter settings, if you use this model for an application like license plate detection, I think the original SSD model is enough. If you want better performance, you can refer to this paper (http://mc.eistar.net/UpLoadFiles/Papers/TextBoxes-AAAI17-draft.pdf), which is from my friend's lab at HUST. It is easy to implement: you just need to change the sizes of some convolution kernels and the default boxes to make them more suitable for text detection.

xinghedyc commented 7 years ago

@oyxhust Thanks for your reply!

rongrongxiangxin commented 7 years ago

@oyxhust I got the same error, shown below. How can I solve this problem? Thanks.

Traceback (most recent call last):
  File "demo.py", line 101, in <module>
    ctx, args.nms_thresh, args.force_nms)
  File "demo.py", line 44, in get_detector
    data_shape, mean_pixels, ctx=ctx)
  File "/home/tx-eva-02/mxnet/example/ssd/detect/detector.py", line 38, in __init__
    self.mod.set_params(args, auxs)
  File "/home/tx-eva-02/mxnet/python/mxnet/module/base_module.py", line 557, in set_params
    allow_missing=allow_missing, force_init=force_init)
  File "/home/tx-eva-02/mxnet/python/mxnet/module/module.py", line 264, in init_params
    _impl(desc, arr, arg_params)
  File "/home/tx-eva-02/mxnet/python/mxnet/module/module.py", line 252, in _impl
    cache_arr.copyto(arr)
  File "/home/tx-eva-02/mxnet/python/mxnet/ndarray.py", line 556, in copyto
    return _internal._copyto(self, out=other)
  File "/home/tx-eva-02/mxnet/python/mxnet/_ctypes/ndarray.py", line 131, in generic_ndarray_function
    c_array(ctypes.c_char_p, [c_str(str(i)) for i in kwargs.values()])))
  File "/home/tx-eva-02/mxnet/python/mxnet/base.py", line 77, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [12:21:15] src/ndarray/ndarray.cc:239: Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (126,) to.shape=(12,)
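The error message reports the two shapes but not which parameter they belong to. A rough sketch of a helper that compares saved shapes against the shapes the network expects and names every mismatch is given below. Both inputs are assumed to be plain dicts of name to shape tuple. The function `find_shape_mismatches` and the parameter name in the example are illustrative and are not taken from this traceback.

```python
def find_shape_mismatches(saved_shapes, expected_shapes):
    """Return (name, saved, expected) for parameters whose shapes differ."""
    mismatches = []
    for name in sorted(saved_shapes):
        if name not in expected_shapes:
            continue  # missing parameters are a separate problem
        saved = tuple(saved_shapes[name])
        expected = tuple(expected_shapes[name])
        if saved != expected:
            mismatches.append((name, saved, expected))
    return mismatches


# Illustrative values: the shapes come from the error message above,
# the parameter name is an assumption.
saved = {"relu7_cls_pred_conv_bias": (126,), "conv1_1_bias": (64,)}
expected = {"relu7_cls_pred_conv_bias": (12,), "conv1_1_bias": (64,)}

for name, got, want in find_shape_mismatches(saved, expected):
    print("%s: saved %s, network expects %s" % (name, got, want))
```

If the usual SSD layout holds (anchors × (classes + 1) channels in each cls_pred layer), then 126 versus 12 would be consistent with parameters saved for 20 classes being loaded into a network built for 1 class, but that is an inference and the message does not confirm it.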

123chengbo commented 7 years ago

Could you share the ssd_model, for example via Baidu disk? Thank you so much.