amir2628 opened this issue 1 year ago
It turns out there is the following part in recognition.py of EasyOCR:
```python
if device == 'cpu':
    state_dict = torch.load(model_path, map_location=device)
    new_state_dict = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[7:]  # strip the 7-character "module." prefix
        new_state_dict[new_key] = value
    model.load_state_dict(new_state_dict)
```
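For context, the `key[7:]` slice strips the 7-character `"module."` prefix that `torch.nn.DataParallel` adds to parameter names when a wrapped model is saved. A minimal illustration (not EasyOCR's code):

```python
import torch
from collections import OrderedDict

# Minimal illustration (not EasyOCR's code): wrapping a model in DataParallel
# prefixes every state_dict key with "module." (7 characters).
model = torch.nn.Linear(4, 2)
wrapped = torch.nn.DataParallel(model)
print(list(wrapped.state_dict().keys()))  # ['module.weight', 'module.bias']

# Stripping the prefix yields keys the bare model can load again.
stripped = OrderedDict((k[7:], v) for k, v in wrapped.state_dict().items())
model.load_state_dict(stripped)
```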
I am also running on CPU, so I went and compared the two dicts, and they seem to be the same:
```python
# Compare state_dict and new_state_dict
differing_indices = []
for idx, (key, state_value) in enumerate(state_dict.items()):
    new_value = new_state_dict.get(key, None)
    # Keys that are missing from new_state_dict are skipped here.
    if new_value is not None and not torch.equal(state_value, new_value):
        differing_indices.append(idx)

if len(differing_indices) == 0:
    print("The state_dict and new_state_dict are the same.")
else:
    print("The state_dict and new_state_dict differ at the following indices:")
    print(differing_indices)
```
Output: The state_dict and new_state_dict are the same.
So I'm not sure what can cause this... :suspect:
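One thing worth noting: the loop above only compares values for keys that exist in both dicts, so comparing the key sets themselves might be a more telling check. A quick sketch:

```python
# Quick sketch: compare the key sets, since the value loop above silently
# skips keys that appear in only one of the two dicts.
only_in_old = set(state_dict) - set(new_state_dict)
only_in_new = set(new_state_dict) - set(state_dict)
print("keys only in state_dict:", sorted(only_in_old)[:5])
print("keys only in new_state_dict:", sorted(only_in_new)[:5])
```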
Any ideas?
Did you make any fix for this? I am facing the same problem.
Greetings!
I used the EasyOCR repository (their trainer.ipynb, not deep-text-recognition-benchmark) to train my custom model. But since it is based on your repository, I think you might be able to help me.
I hope this issue does not get lost in the void.
The dataset was a collection of images plus `label.csv` files, and I did not use `create_lmdb_dataset.py` to create the datasets. But as the title indicates, the performance I saw during inference was not even close to training and validation. Can you please give some insight into how the accuracy can be as good as `>90%` with a really low validation loss like `<0.01`, yet when the trained model is used in production (`easyocr.Reader`) the extracted text is just nonsense, nowhere near the actual text in the image? :confused: I saw comments on other, similar issues suggesting that using a dataset close to your domain would help; I already use similar images for training, validation, and inference, but nothing changed.
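For reference, the usual way to point `easyocr.Reader` at a custom recognition model (per EasyOCR's custom-model documentation) is roughly the following; the directory names and the network name below are placeholders:

```python
import easyocr

# Sketch of loading a custom recognition model (names are placeholders).
# 'custom_model' must match the .pth / .py / .yaml files placed in the
# two directories below, as described in EasyOCR's custom-model docs.
reader = easyocr.Reader(
    ['en'],
    recog_network='custom_model',
    model_storage_directory='model',        # contains custom_model.pth
    user_network_directory='user_network',  # contains custom_model.py / .yaml
)
print(reader.readtext('example.png'))
```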
Moreover, if you train a model for, say, 30000 iterations, take the resulting `model.pth`, and train it again for another 30000 iterations, would that ultimately make the model better? :suspect:

In conclusion, I would like to hear your opinions (especially from the contributors of this repository, since they know best what they developed) on why the performance at inference is worse than what the training process shows.
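On the last question, the way I understand continuing a run is to point the next training run at the previous checkpoint. A rough sketch, assuming the EasyOCR trainer notebook's `get_config`/`train` helpers and a config that exposes a `saved_model` field (as in this repository's train.py); all paths are placeholders:

```python
# Rough sketch, meant to run inside the EasyOCR trainer notebook after its
# own imports and get_config helper are defined. Paths are placeholders.
opt = get_config("config_files/en_filtered_config.yaml")

# Resume from the checkpoint produced by the previous 30000-iteration run,
# assuming the config exposes a `saved_model` field as in train.py.
opt.saved_model = "saved_models/en_filtered/best_accuracy.pth"

train(opt, amp=False)
```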
If it helps for me to provide you with anything, let me know.
Also note that before giving any image to the model at inference time, I apply image preprocessing to make sure the image is more readable for the model.
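By image preprocessing I mean something along these lines (a generic grayscale/upscale/binarize sketch with placeholder values, not necessarily the exact steps):

```python
import cv2

# Generic preprocessing sketch (placeholder values, not the exact pipeline):
# grayscale -> upscale -> denoise -> binarize before passing to the reader.
img = cv2.imread("example.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
_, binarized = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("example_preprocessed.png", binarized)
```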
Have a good one!