atsju closed this issue 2 years ago
Makes sense 👍 Feel free to provide a PR with the change. Otherwise I'll change this myself, but it might take some time.
I made an ugly change on my side. It's not very clean.
Maybe also consider a way to add characters during (for example) transfer learning. I am trying to recognize German text from a single writer but do not have much data, so I trained on the IAM dataset and then did transfer learning on part of my German text. Unfortunately it will fail if I feed it ä or ü, because they are not in the original dataset. I haven't had a deep look yet at how to improve this, as I still have to extend my GT data a bit.
Do not be surprised if I open many issues; I am just trying to document the little things I notice during my own learning process.
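To make the missing-character problem concrete, here is a minimal sketch of how one might check new ground-truth text against the charset a model was trained on before starting transfer learning. The function name `unsupported_chars` and the example charset are assumptions for illustration, not part of SimpleHTR. Note that this only detects the problem: actually supporting ä or ü would also require an output layer sized for the extended charset, which is not shown here.

```python
def unsupported_chars(texts, char_list):
    """Return the sorted characters in `texts` that the model's charset lacks.

    `char_list` is assumed to be the charset stored with the trained model.
    """
    known = set(char_list)
    return sorted({c for text in texts for c in text if c not in known})


# Hypothetical charset of a model trained on English-only data.
model_chars = list("abcdefghijklmnopqrstuvwxyzM")

# German ground truth that contains umlauts.
german_gt = ["für", "Mädchen"]

missing = unsupported_chars(german_gt, model_chars)
if missing:
    print("Characters not covered by the trained model:", "".join(missing))
```

Running such a check over the whole ground-truth set up front gives a clear message instead of a failure in the middle of training.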
OK, as I said, I'll implement this one in the future; for the rest, let's see. As the name SimpleHTR suggests, I want to keep it as simple as possible ;-) ... even if this means lacking some features. The repo should be considered a foundation for further development, on which everyone can build their own custom stuff.
Changed the code: in validation and inference mode, the charset of the trained model is now used.
Hi, I wonder whether this can train a text-line model. It seems that it can only train the single-word model.
I don't understand your question. Is this related to this issue? If so please explain in more detail.
How to improve accuracy?
After two years, I've been led to you guys. Great work, Harald! What's the latest on this program?
https://github.com/githubharald/SimpleHTR/blob/7f26b321f8b8b18e5f60cbc1b7d5e1ad202e7487/src/main.py#L177
From my point of view, validate should use the char_list from the model instead of the one from the loaded dataset. That way, validation still works if the user validates on a different dataset. In the current implementation, the charList depends on the dataset, which causes errors if the new dataset does not have exactly the same charList as the training dataset.
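A minimal sketch of the idea, assuming the charset is written to a file next to the model at training time and read back for validation and inference. The helper names (`save_char_list`, `load_char_list`, `decode`) and the file name are hypothetical and only illustrate the approach; they are not the actual SimpleHTR code.

```python
import tempfile
from pathlib import Path


def save_char_list(path, texts):
    """At training time: derive the charset from the training texts and store it."""
    chars = sorted(set("".join(texts)))
    Path(path).write_text("".join(chars), encoding="utf-8")
    return chars


def load_char_list(path):
    """At validation or inference time: read the charset stored with the model."""
    return list(Path(path).read_text(encoding="utf-8"))


def decode(label_ids, char_list):
    """Map label indices back to text using the model's charset."""
    return "".join(char_list[i] for i in label_ids)


# Hypothetical location of the stored charset.
char_file = Path(tempfile.mkdtemp()) / "charList.txt"

# Training: the charset comes from the training dataset.
save_char_list(char_file, ["hello", "world"])

# Validation on a different dataset: the charset still comes from the model,
# so label indices keep the meaning they had during training.
model_chars = load_char_list(char_file)
print(decode([2, 1, 3, 3, 4], model_chars))
```

The point of the design is that label indices only have meaning relative to the charset used in training, so that charset has to travel with the model rather than be rebuilt from whatever dataset happens to be loaded.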