baudm / parseq

Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
https://huggingface.co/spaces/baudm/PARSeq-OCR
Apache License 2.0
Retraining, but accuracy is too low on unknown data. #106

Open Sydeboy opened 1 year ago

Sydeboy commented 1 year ago

My configuration is as follows: batch_size=384, epoch=20, val_check_interval=500, GPU: 3090. Everything else is left at the default configuration. I use charset_train=62_mixed-case and charset_test = string.digits + string.ascii_lowercase + string.ascii_uppercase. I have a few questions:

  1. My dataset strictly follows your format. The labels contain only digits and uppercase and lowercase English letters. The split ratio is 8:1:1. The layout is as follows:
data
----train
--------real
------------D001
----------------train
----------------val
----val
--------D004
----test
--------D001
--------D004

In your paper, I see that the real training data and the validation data are not splits of the same dataset. Am I right that the data under data/val is used only for validation and does not take part in training? Based on that guess, I put my training split in the real directory, my test split in the test directory, and a different dataset in the data/val directory. For example, I trained on D001 and placed D004 under data/val, then tested on both D001 and D004. Accuracy on D001 is high, but accuracy on D004 is very low. I don't quite understand the roles of the two val directories under data (data/train/real/D001/val and data/val). Could you explain them? Thank you!
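For reference, the 8:1:1 split described above can be sketched as follows. This is a minimal illustration and not code from the parseq repository: the function name `split_8_1_1` and the fixed `seed` are assumptions, and `samples` stands for any list of (image path, label) pairs.

```python
import random


def split_8_1_1(samples, seed=0):
    """Shuffle samples reproducibly and split them into train/val/test at 8:1:1."""
    items = list(samples)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    # Whatever remains after train and val goes to test.
    test = items[n_train + n_val:]
    return train, val, test
```

All three parts come from the same shuffled pool, so they share one distribution. A validation set taken from a different dataset (such as D004 here) measures something else: how well the model generalizes to data it was not trained on.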

  2. Another question: can I train on all of your datasets plus my own, using charset_train=62_mixed-case? My concern is that when I ran your pre-trained weights on my images in the Hugging Face demo, the output contained punctuation marks, but my dataset has no punctuation.

What should I do about it?
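One possible workaround, sketched here under assumptions, is to post-process the predicted strings and drop every character outside the 62-character set. The helper `restrict_to_charset` is hypothetical and not part of the parseq code base. It only filters the output text. It does not change what the model predicts.

```python
import string

# The 62-character set from the configuration above:
# digits, lowercase letters, uppercase letters.
CHARSET_62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def restrict_to_charset(text, charset=CHARSET_62):
    """Remove every character of text that is not in charset."""
    allowed = set(charset)
    return "".join(ch for ch in text if ch in allowed)
```

For example, `restrict_to_charset("AB-12!")` returns `"AB12"`. Filtering like this hides punctuation but cannot recover a letter or digit that the model misread as punctuation, so training with a charset that matches the data is likely the cleaner fix.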

  3. Does the charset used for testing have to be 36_lowercase?
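A quick way to see which charset a set of labels actually needs is to list the characters that fall outside a candidate charset. The sketch below is illustrative only: `out_of_charset` is a hypothetical helper, and `CHARSET_36` and `CHARSET_62` are assumed here to be the digit/lowercase and digit/mixed-case sets built from the standard `string` module.

```python
import string

# Assumed definitions: 36 = digits + lowercase, 62 = 36 + uppercase.
CHARSET_36 = string.digits + string.ascii_lowercase
CHARSET_62 = CHARSET_36 + string.ascii_uppercase


def out_of_charset(labels, charset):
    """Return the set of characters used in labels that charset does not cover."""
    allowed = set(charset)
    return {ch for label in labels for ch in label if ch not in allowed}
```

If `out_of_charset(test_labels, CHARSET_36)` contains uppercase letters, a lowercase-only test charset cannot represent those labels as written, whereas an empty result for `CHARSET_62` means the mixed-case set covers them.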