GabrielDornelles / pytorch-ocr

Simple Pytorch framework to train OCRs. Supports CRNNs, Attention, CTC and Cross Entropy Loss.

Test predictions length not exceeding 5 #11

Closed devarshi16 closed 10 months ago

devarshi16 commented 10 months ago

I am trying to train a number plate recognition model. The plates can be up to 13 characters long, but the test predictions I see after each epoch have at most 5 characters. What am I missing?

config/config.yaml:

defaults:
  - override hydra/job_logging: custom

processing:
  device: cuda
  image_width: 200
  image_height: 60 

training:
  lr: 3e-4
  batch_size: 1
  num_workers: 10
  num_epochs: 20
  device: "cuda"

bools:
  DISPLAY_ONLY_WRONG_PREDICTIONS: false
  VIEW_INFERENCE_WHILE_TRAINING: true
  SAVE_CHECKPOINTS: false

paths:
  dataset_dir: ./dataset
  save_model_as: ./logs/crnn.pth

model:
  use_attention: true 
  use_ctc: true
  gray_scale: false
  dims: 256

Also, Accuracy and Best Accuracy columns are always 0.0 for some reason.

Note: if I use any batch size other than 1, I get the error "Trying to resize storage that is not resizable" when pin_memory() is called on some zero-dimensional tensors, although it looks like it might be a torch version issue (a similar issue was reported in ultralytics).
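One common cause of that error is the default collate function trying to batch variable-length target tensors once batch_size is greater than 1. A minimal sketch of a workaround collate_fn that pads targets to the longest label in each batch is below; the dict keys and dataset output format are assumptions for illustration, not this repo's actual code.

import torch
from torch.nn.utils.rnn import pad_sequence

def collate_batch(batch):
    # images are assumed to share one size, so they stack directly
    images = torch.stack([item["images"] for item in batch])
    # targets are assumed to be 1D label tensors of varying length;
    # pad them to the longest label in the batch
    targets = pad_sequence([item["targets"] for item in batch],
                           batch_first=True, padding_value=0)
    return {"images": images, "targets": targets}

# loader = DataLoader(dataset, batch_size=8, collate_fn=collate_batch, pin_memory=True)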

I would appreciate any help, thank you.

devarshi16 commented 10 months ago

EDIT: I turned off CTC loss and now the number of characters is no longer limited. Also, it seems the accuracy was previously too small to show up in the per-epoch output; now it is reflected.

GabrielDornelles commented 10 months ago

Hi! It happens that I made padding automatic for cross entropy but kept it manual for CTC. Padding is the act of making every training sample the same length (number of characters).

So if you have some plates with 6 chars, others with 8, and at most 13, choose a character to represent empty space (like "$" or "_", anything that is not one of the training characters), and use it to make every plate the same length. Example:

before: adm167 iuw68990ju90k

after: adm167$$$$$$$ iuw68990ju90k

Then, in your predictions, just replace "$" with "" (nothing).
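For reference, a minimal sketch of both steps (function names are made up for illustration, not part of this repo):

PAD_CHAR = "$"
MAX_LEN = 13  # longest plate in the dataset

def pad_label(label: str, max_len: int = MAX_LEN) -> str:
    # "adm167" -> "adm167$$$$$$$" (padded to 13 characters)
    return label + PAD_CHAR * (max_len - len(label))

def strip_padding(prediction: str) -> str:
    # "adm167$$$$$$$" -> "adm167"
    return prediction.replace(PAD_CHAR, "")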

Also, as you said, experiment with training longer. I usually give my models 100 epochs with CTC loss; it reaches a plateau before that but keeps getting better on edge cases.

devarshi16 commented 10 months ago

Thanks. I added the following lines in utils/data_loading.py:

sequence_length = 15
for x in targets:
    # pad each target with "$" up to 15 characters
    x.extend(['$'] * (sequence_length - len(x)))

It's working as intended and I'm seeing longer sequences in subsequent epochs. Thanks for the training advice.
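A small variation on the snippet above, in case any plate is ever longer than the hard-coded 15 (purely illustrative):

# derive the pad length from the data so nothing gets silently truncated
sequence_length = max(len(x) for x in targets)
for x in targets:
    x.extend(['$'] * (sequence_length - len(x)))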