Topdu / OpenOCR

Apache License 2.0

SMTR bug #8

Closed MxDevAcc closed 1 week ago

MxDevAcc commented 1 month ago

Hello! Great work! I am trying to train the SMTR model, but I get an error during the Eval step:

  File "openocr/openocr/OpenOCR/openrec/modeling/decoders/smtr_decoder.py", line 218, in forward
    return self.forward_test_bi(x)
  File "openocr/openocr/OpenOCR/openrec/modeling/decoders/smtr_decoder.py", line 352, in forward_test_bi
    return torch.concat(next_logits_all + pre_logits_all[::-1], 1)
RuntimeError: torch.cat(): expected a non-empty list of Tensors
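
For readers hitting the same traceback: the message means that both `next_logits_all` and `pre_logits_all` were empty when `torch.concat` was called, so no logits had been collected by that point. The thread does not show why the lists were empty. The sketch below is not the repository's code; `concat_logits` is a hypothetical helper, and the tensor shapes are made up, used only to show the failure condition and a more explicit error.

```python
import torch


def concat_logits(next_logits_all, pre_logits_all):
    """Hypothetical helper: join forward logits with the reversed backward
    logits along dim 1, failing with a descriptive error if both are empty."""
    pieces = next_logits_all + pre_logits_all[::-1]
    if not pieces:
        # torch.concat would raise a RuntimeError here; raise a clearer one.
        raise ValueError(
            "no logits were collected; check the Eval config and decoding loop"
        )
    return torch.concat(pieces, 1)


# Made-up shapes: (batch, steps, classes).
fwd = [torch.zeros(1, 2, 5)]
bwd = [torch.ones(1, 3, 5)]
out = concat_logits(fwd, bwd)
print(out.shape)  # torch.Size([1, 5, 5])
```

Calling `concat_logits([], [])` raises the `ValueError`, which corresponds to the situation reported in the traceback above.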

I tried the following Eval configurations:

Eval:
  dataset:
    name: RatioDataSet
    data_dir_list: ['data/val/regular_validation']
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - ARLabelEncode: # Class handling label
          max_text_length: 200
      - SliceResize:
          image_shape: [3, 32, 128]
          padding: False
          max_ratio: 12
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  sampler:
    name: RatioSampler
    scales: [[128, 32]] # w, h
    # divided_factor: ensures the width and height are divisible by the downsampling multiple
    first_bs: 128
    fix_bs: false
    divided_factor: [4, 16] # w, h
    is_training: False
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1
    num_workers: 2

and

Eval:
  dataset:
    name: LMDBDataSet
    data_dir: data/val/regular_validation
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - ARLabelEncode: # Class handling label
          max_text_length: 200
      - SliceResize:
          image_shape: [3, 32, 128]
          padding: False
          max_ratio: 12
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1
    num_workers: 2

and

Eval:
  dataset:
    name: STRLMDBDataSet
    data_dir: data/val/regular_validation
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - ARLabelEncode: # Class handling label
          max_text_length: 200
      - SliceResize:
          image_shape: [3, 32, 128]
          padding: False
          max_ratio: 12
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1
    num_workers: 2

Topdu commented 1 month ago

Please refer to the file focalsvtr_smtr to configure the training and evaluation data, and focalsvtr_smtr_long for long text evaluation after training.

MxDevAcc commented 1 month ago

> Please refer to the file focalsvtr_smtr to configure the training and evaluation data, and focalsvtr_smtr_long for long text evaluation after training.

https://github.com/Topdu/OpenOCR/blob/d2968164e7d82bdc0f99d0130789b23241ff381a/openrec/postprocess/smtr_postprocess.py#L72

Replace that line with:

result_list.append((text, np.mean(conf_list).tolist()))
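
The original content of the linked line is not shown in this thread, so the following only illustrates what the replacement expression evaluates to: `np.mean` returns a NumPy scalar, and `.tolist()` converts it to a plain Python `float`. The confidence values here are made up for the example.

```python
import numpy as np

# Hypothetical per-character confidences for one recognized string.
conf_list = [0.5, 1.0]
text = "abc"

mean_conf = np.mean(conf_list)      # NumPy scalar (np.float64)
as_python = mean_conf.tolist()      # plain Python float

result_list = []
result_list.append((text, as_python))

print(type(mean_conf).__name__, type(as_python).__name__)  # float64 float
print(result_list)  # [('abc', 0.75)]
```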