clovaai / deep-text-recognition-benchmark

Text recognition (optical character recognition) with deep learning methods, ICCV 2019
Apache License 2.0

RuntimeError: The expanded size of the tensor #380

Closed NastiaZay closed 1 year ago

NastiaZay commented 1 year ago

Hello, I'm trying to train a model on 12 million synthetic images, but I'm getting the following error: RuntimeError: The expanded size of the tensor (26) must match the existing size (32) at non-singleton dimension 0. Target sizes: [26]. Tensor sizes: [32]

exp_name: TPS-ResNet-BiLSTM-Attn-Seed777
train_data: /root/TRAIN
valid_data: /root/VALID
manualSeed: 777
workers: 24
batch_size: 1152
num_iter: 50000
valInterval: 500
saved_model:
FT: False
adam: False
lr: 1
beta1: 0.9
rho: 0.95
eps: 1e-08
grad_clip: 5
baiduCTC: False
select_data: ['/']
batch_ratio: ['1']
total_data_usage_ratio: 1.0
batch_max_length: 25
imgH: 50
imgW: 130
rgb: True
character: !"#$%&\'()*+,-./0123456789:;=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
sensitive: True
PAD: False
data_filtering_off: True
Transformation: TPS
FeatureExtraction: ResNet
SequenceModeling: BiLSTM
Prediction: Attn
num_fiducial: 20
input_channel: 3
output_channel: 512
hidden_size: 256
num_gpu: 6
num_class: 150

I have no idea how to fix it. This is the command I'm running:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python3.8 train.py --train_data /root/TRAIN --valid_data /root/VALID --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --rgb --manualSeed 777 --imgH 50 --imgW 130 --sensitive --batch_max_length 25 --num_fiducial 20 --input_channel 3 --output_channel 512 --hidden_size 256 --num_iter 50000 --valInterval 500 --data_filtering_off

After this output:

[1/50000] Train loss: 5.04899, Valid loss: 4.97295, Elapsed_time: 22.89128
Current_accuracy : 0.769, Current_norm_ED : 0.02
Best_accuracy : 0.769, Best_norm_ED : 0.02

Ground Truth | Prediction                | Confidence Score & T/F
SENDER:      | U=======;;;;;;;;;=====;;; | 0.0000 False
rrIR         | G=====;;;;;;;;;;;;;;;;;;; | 0.0000 False
baldwins     | G=======;;;;;;=========== | 0.0000 False
UP           | ======;;;;;;;;;;;;;;;;;== | 0.0000 False
Flavor       | G====;;;;;;;;;;;====;;;;; | 0.0000 False

I'm getting: The expanded size of the tensor (26) must match the existing size (32) at non-singleton dimension 0. Target sizes: [26]. Tensor sizes: [32]

NastiaZay commented 1 year ago

Can anyone help?

Traceback (most recent call last):
  File "train.py", line 318, in <module>
    train(opt)
  File "train.py", line 150, in train
    text, length = converter.encode(labels, batch_max_length=opt.batch_max_length)
  File "/workspace/bad/utils.py", line 137, in encode
    batch_text[i][1:1 + len(text)] = torch.LongTensor(text)  # batch_text[:, 0] = [GO] token
RuntimeError: The expanded size of the tensor (26) must match the existing size (32) at non-singleton dimension 0. Target sizes: [26]. Tensor sizes: [32]
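
For readers hitting the same error, the numbers 26 and 32 can be reproduced with plain Python lists. This is only a sketch: it assumes that encode allocates batch_max_length + 2 columns per row (one for [GO], one for the end token [s]) and appends [s] to each label, which is what the traceback line suggests but is not confirmed by this thread. The function name attn_slot_sizes is made up for illustration.

```python
def attn_slot_sizes(label, batch_max_length=25):
    """Return (target_size, tensor_size) for one label.

    Assumption (not confirmed by the thread): each row has
    batch_max_length + 2 columns, i.e. [GO] + characters + [s].
    """
    tokens = list(label) + ["[s]"]       # end-of-sequence token appended
    row = [0] * (batch_max_length + 2)   # column 0 is reserved for [GO]
    target = row[1:1 + len(tokens)]      # slicing silently clips at the row end
    return len(target), len(tokens)


print(attn_slot_sizes("a" * 31))    # (26, 32): sizes differ, the tensor assignment fails
print(attn_slot_sizes("baldwins"))  # (9, 9): sizes match, the assignment succeeds
```

Under that assumption, a 31-character label produces exactly the reported mismatch: 32 tokens to write, but only 26 slots left in a row sized for batch_max_length 25.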

NastiaZay commented 1 year ago

Figured it out: don't use --data_filtering_off with the [UNK] token.
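
The reply does not say why this helps. One plausible reading is that with --data_filtering_off, labels longer than batch_max_length are no longer dropped and reach encode. A minimal sketch for checking a dataset's labels before training, assuming that reading is right; split_labels and its arguments are hypothetical names, not part of the repository:

```python
def split_labels(labels, character, batch_max_length=25):
    """Split labels into (keep, drop) before training.

    Hypothetical helper: a label is dropped if it is longer than
    batch_max_length or contains a character outside `character`.
    """
    allowed = set(character)
    keep, drop = [], []
    for label in labels:
        too_long = len(label) > batch_max_length
        unknown = any(ch not in allowed for ch in label)
        (drop if too_long or unknown else keep).append(label)
    return keep, drop


keep, drop = split_labels(["UP", "x" * 31, "Flavor"], "UPFlavorx")
print(keep)       # ['UP', 'Flavor']
print(len(drop))  # 1 (the 31-character label)
```

Running something like this over the training labels shows whether any sample exceeds batch_max_length, which would be consistent with the size-32 tensor in the error.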