JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0
23.61k stars 3.1k forks

Fine tuning g2 latin / spanish model on my own dataset #741

Open emigomez opened 2 years ago

emigomez commented 2 years ago

Hi,

I've seen the instructions for training a model in https://github.com/JaidedAI/EasyOCR/blob/master/custom_model.md, but I'm not sure whether they can be used to fine-tune an existing EasyOCR model, or only to train from the models provided at https://github.com/clovaai/deep-text-recognition-benchmark

I want to fine-tune the existing EasyOCR Spanish model (the one used by reader = easyocr.Reader(['es'])) on my own dataset.

How can I do this? How do I specify the EasyOCR model as the starting point when training with https://github.com/clovaai/deep-text-recognition-benchmark? Or is it better to use the scripts in https://github.com/JaidedAI/EasyOCR/tree/master/trainer?

Thanks!

CamiloSaboA-csv commented 2 years ago

I'm having the same doubts!

iblub1 commented 2 years ago

You can download the Latin model from the official website and set it as the starting point in the .yaml config.

Model hub: https://www.jaided.ai/easyocr/modelhub/

In your config.yaml, set saved_model: "saved_models/latin/latin_g2.pth" (or whatever path you saved it to).
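If you'd rather script that change than edit the file by hand, here is a minimal sketch using only the Python standard library. It assumes the config is plain YAML with `saved_model` as a top-level key on its own line. The helper name `set_saved_model` and the `experiment_name` value in the sample text are just illustrative, not something taken from the EasyOCR docs.

```python
import re


def set_saved_model(config_text: str, checkpoint_path: str) -> str:
    """Return config_text with the top-level `saved_model` entry set to checkpoint_path.

    Hypothetical helper for illustration; it does simple line-based editing
    and does not parse YAML.
    """
    new_line = f'saved_model: "{checkpoint_path}"'
    pattern = re.compile(r"^saved_model:.*$", flags=re.MULTILINE)
    if pattern.search(config_text):
        # Use a function as the replacement so the path is inserted literally.
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    # Key not present yet: append it on a new line at the end.
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + separator + new_line + "\n"


# Sample config text (assumed layout, only `saved_model` comes from this thread).
example = 'experiment_name: "es_finetune"\nsaved_model: ""\n'
updated = set_saved_model(example, "saved_models/latin/latin_g2.pth")
print(updated)
```

To use it on a real file, read your config.yaml into a string, pass it through the function, and write the result back. Forward slashes in the path keep the quoted YAML string free of escape issues.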

SkygirlLuna commented 1 year ago

But how do you set the language and character set when training from latin_g2.pth?