meijieru / crnn.pytorch

Convolutional recurrent network in pytorch
MIT License

Ask some questions about accuracy #69

Closed wineternity closed 6 years ago

wineternity commented 6 years ago

I am a newbie in DL and am trying to run the crnn_main.py script with 5000 training samples and batchsize=100. The alphabet is all digits plus a-zA-Z.

[22/25][20/50] Loss: 32.413400
Start val
-------------------------- => , gt: 354141
-------------------------- => , gt: 342029
-------------------------- => , gt: 54806C
-------------------------- => , gt: 104101056
-------------------------- => , gt: 541001
214
-------------------------- => , gt: 501461
-------------------------- => , gt: 466051021
-------------------------- => , gt: 322025
144
-------------------------- => , gt: 203001658
-------------------------- => , gt: 530010
012
Test loss: 27.896337, accuray: 0.000000
[22/25][30/50] Loss: 33.099444
[22/25][40/50] Loss: 29.278045
Start val
3------------------------- => 3 , gt: 324401100
3------------------------- => 3 , gt: 500214
3------------------------- => 3 , gt: 530010
012
3------------------------- => 3 , gt: 466453
3------------------------- => 3 , gt: 447009023
3------------------------- => 3 , gt: 403051
019
3------------------------- => 3 , gt: 930001623
3------------------------- => 3 , gt: 326000
050
3------------------------- => 3 , gt: 661258
3------------------------- => 3 , gt: 422051*001
Test loss: 24.278810, accuray: 0.000000

May I ask why the result is so bad? Is it possibly caused by the small number of training samples, or did I make a mistake when creating the sample images for training (I use PIL to draw the string onto a background image)?

Could you give me a tip? Also, some strings in my images contain spaces: both "100 100" and "100100" on the images map to the label 100100. Will this cause a problem?

meijieru commented 6 years ago

There are some related issues from earlier that you may refer to. You could also check your dataset by dumping some of the samples. The spaces will not cause a problem.
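As a starting point for checking the dataset, a sketch like the following normalizes labels and reports characters that fall outside the alphabet. The helper names are hypothetical (they are not part of crnn.pytorch), and the alphabet is assumed to be digits plus a-zA-Z, as stated in the question.

```python
import string

# Assumed alphabet from the question: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_letters


def normalize_label(text):
    """Drop all whitespace so "100 100" and "100100" give the same label."""
    return "".join(text.split())


def unknown_chars(label, alphabet=ALPHABET):
    """Return the characters of `label` that the alphabet cannot encode."""
    return sorted(set(label) - set(alphabet))
```

For example, `normalize_label("100 100")` returns `"100100"`, and `unknown_chars("422051*001")` returns `['*']`. Running such a check over every label before building the dataset is one way to spot entries the alphabet cannot represent.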

wineternity commented 6 years ago

Thanks, now I can train on the dataset. My dataset contains variable-length text. In the instructions for training a new model, I found the note "please sort the image according to the text length."

  1. Do I need to sort the images before creating the lmdb dataset for training? In crnn_main.py the data appears to be shuffled when the dataset is loaded, so sorting would have no effect.
  2. Why does variable-length data need to be sorted? In my results, validation accuracy is good for short strings and bad for longer ones.

meijieru commented 6 years ago

  1. Yes. With variable-length data, the samples are not shuffled.
  2. Because of my implementation.

wineternity commented 6 years ago

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=opt.batchSize,
    shuffle=True, sampler=sampler,
    num_workers=int(opt.workers),
    collate_fn=dataset.alignCollate(imgH=opt.imgH, imgW=opt.imgW, keep_ratio=opt.keep_ratio))

But in the code, shuffle is always True. Which part of the code needs the dataset to be ordered by length? And what happens if the variable-length data is not ordered: does the CRNN network have a limitation here?

meijieru commented 6 years ago

If you look at the PyTorch source code, you will see that the shuffle option only takes effect when no sampler is specified.

wineternity commented 6 years ago

Thanks so much for your kind explanation; I think I understand your design now. If --random_sample is used, the DataLoader uses shuffle=True. Without this flag, the DataLoader uses the sampler RandomSequentialSampler, which keeps the original order within the randomly selected data.
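The idea of sampling at random while keeping the stored order can be illustrated with a small pure-Python sketch. This is not the repository's sampler: `random_sequential_batches` is a hypothetical name, and the sketch assumes the dataset was written in order of text length, so that neighboring indices have similar lengths.

```python
import random


def random_sequential_batches(num_samples, batch_size, num_batches, rng=None):
    """Yield batches of consecutive indices, each starting at a random offset.

    Hypothetical illustration only, not the repository's implementation.
    """
    if batch_size > num_samples:
        raise ValueError("batch_size must not exceed num_samples")
    rng = rng or random.Random()
    for _ in range(num_batches):
        # Random start, then a contiguous run: order inside the batch is kept.
        start = rng.randint(0, num_samples - batch_size)
        yield list(range(start, start + batch_size))
```

Each batch is random with respect to where it starts, but every batch holds neighbors from the sorted dataset, so the samples in it should have similar text lengths.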

But I always need to change shuffle to False manually, as my PyTorch does not support passing a sampler together with shuffle=True. The error is shown below; it may be related to my PyTorch version, so it is not a big deal.

    collate_fn=dataset.alignCollate(imgH=opt.imgH, imgW=opt.imgW, keep_ratio=opt.keep_ratio))
  File "/home/animal/Tool/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 287, in __init__
    raise ValueError('sampler is mutually exclusive with shuffle')
ValueError: sampler is mutually exclusive with shuffle
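One way to avoid editing the call by hand is to build the keyword arguments so that `shuffle` is only passed when there is no sampler. The helper below is a hypothetical sketch, not code from crnn_main.py; it assumes the design described above, where a missing sampler means random sampling was requested.

```python
def loader_options(batch_size, sampler=None):
    """Build DataLoader keyword arguments without mixing sampler and shuffle."""
    options = {"batch_size": batch_size}
    if sampler is not None:
        # A custom sampler decides the order, so shuffle is left unset.
        options["sampler"] = sampler
    else:
        # No sampler: fall back to ordinary shuffling.
        options["shuffle"] = True
    return options
```

The result can be unpacked into the call, for example `torch.utils.data.DataLoader(train_dataset, **loader_options(opt.batchSize, sampler))`, with the remaining arguments passed as before.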


Finally, may I ask whether the order has any influence on the cost or accuracy of the network? When I ran with shuffled data, the training set also gave good output.

meijieru commented 6 years ago

It might be a problem with your PyTorch version.

As you can see from the collate function, the images in a batch are resized to the same size. So if the images in a batch do not have similar aspect ratios, they will be distorted. You can of course still train this way, but it may hurt performance.
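The effect can be estimated with a rough sketch. It assumes the shared width of a batch is derived from the widest aspect ratio in that batch, which is a simplification; check the actual collate function for the exact rule. The function names are hypothetical.

```python
import math


def batch_width(sizes, img_h):
    """Width shared by the whole batch, taken from the widest aspect ratio.

    `sizes` is a list of (width, height) pairs. Assumed rule, for illustration.
    """
    max_ratio = max(w / h for w, h in sizes)
    return max(img_h, math.floor(max_ratio * img_h))


def stretch_factors(sizes, img_h):
    """Horizontal stretch of each image relative to its vertical scaling.

    1.0 means the aspect ratio is preserved; larger values mean the image
    ends up wider than it originally was.
    """
    target_w = batch_width(sizes, img_h)
    return [(target_w / w) / (img_h / h) for w, h in sizes]
```

Under this assumption, a batch mixing a 100x32 and a 300x32 image stretches the short one by a factor of 3, while a batch of 100x32 and 110x32 images stretches it by only 1.1. That is why batches of similar-length text distort less.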