adeline-cs / GTR

Scene text recognition
Apache License 2.0
106 stars 14 forks

Embedding vector and Language model #2

Closed: bharatsubedi closed this issue 2 years ago

bharatsubedi commented 2 years ago

Hello, thank you for releasing the code. I want to try your network on Korean and Japanese. When I checked the data-preparation code, I saw that we have to prepare a word embedding vector. Could you please let me know how I can prepare an embedding vector and a language model for Korean and Japanese?

bharatsubedi commented 2 years ago

I found a solution for preparing the embedding vector. But when I try to train with the prepared data, the code raises lots of errors, so I failed to train the network. Could you please upload clean code?
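
For readers with the same question, here is a minimal sketch of one way to look up per-word embedding vectors for Korean or Japanese labels. The thread does not say which embedding source or file layout the repository expects, so everything here is an assumption: the sketch reads the common plain-text `.vec` layout (a header line with word count and dimension, then one word and its floats per line), and `load_vec` and `embed_label` are hypothetical helper names, not functions from this repository.

```python
import io


def load_vec(stream):
    """Parse word vectors from a plain-text .vec stream (assumed layout)."""
    header = stream.readline().split()
    dim = int(header[1])  # header is "<word count> <dimension>"
    table = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines or tokens that contain spaces
        table[parts[0]] = [float(x) for x in parts[1:]]
    return table, dim


def embed_label(label, table, dim):
    """Return the vector for a label, or zeros when it is out of vocabulary."""
    return table.get(label, [0.0] * dim)


# Tiny in-memory stand-in for a real embedding file (values are made up).
sample = "2 3\n안녕 0.1 0.2 0.3\n東京 0.4 0.5 0.6\n"
table, dim = load_vec(io.StringIO(sample))
known = embed_label("東京", table, dim)
unknown = embed_label("없음", table, dim)
```

With a real file, `io.StringIO(sample)` would be replaced by `open(path, encoding="utf-8")`. The zero-vector fallback is only one possible choice for out-of-vocabulary labels.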

adeline-cs commented 2 years ago

@bharatsubedi Hi, the current code and datasets only support English recognition. If you want STR for Korean and Japanese, the recognition vocabulary and the language reasoning model need to be adapted to those languages. Thanks for your question; we can add some steps to the README on how to switch training to other languages.
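
As an illustration of what adapting the recognition vocabulary could involve, the sketch below builds a character set from training labels and encodes a label as fixed-length class ids. It is not taken from this repository: `build_charset`, `encode`, and the `<PAD>`/`<EOS>`/`<UNK>` tokens are hypothetical, and the actual code may order or name its special classes differently. Labels are normalized to NFC so that Hangul syllables are counted as single composed characters.

```python
import unicodedata

SPECIALS = ("<PAD>", "<EOS>", "<UNK>")  # assumed special tokens


def build_charset(labels):
    """Map every character seen in the labels to an integer class id."""
    chars = sorted({ch for label in labels
                    for ch in unicodedata.normalize("NFC", label)})
    return {tok: i for i, tok in enumerate(list(SPECIALS) + chars)}


def encode(label, id_of, max_len):
    """Turn a label into max_len ids: characters, then <EOS>, then padding."""
    text = unicodedata.normalize("NFC", label)
    ids = [id_of.get(ch, id_of["<UNK>"]) for ch in text][: max_len - 1]
    ids.append(id_of["<EOS>"])
    ids += [id_of["<PAD>"]] * (max_len - len(ids))
    return ids


labels = ["서울", "東京"]  # toy training labels
id_of = build_charset(labels)
encoded = encode("서울", id_of, max_len=5)
with_unknown = encode("서x", id_of, max_len=5)
```

The number of output classes of the recognizer would then be `len(id_of)`, which for Korean or Japanese is typically in the thousands rather than the few dozen needed for English.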

bharatsubedi commented 2 years ago

@adeline-cs Thank you. I have not been able to train on English either, because the code has lots of bugs. I hope you will upload code that is clean and easy to train.

lerndeep commented 2 years ago

@adeline-cs Waiting for the steps to train on a language other than English, and for a way to adapt the recognition vocabulary and language reasoning model.