Open ghost opened 4 years ago
Hi,
Recognizing a sentence and segmenting a sentence are two different tasks. Simply adding blanks to the labels is not recommended.
Yes, just make the alphabet contain all the English and Chinese characters, as you described.
Calculate the sequence length T that reaches the last LSTM. The wider you resize the image, the longer the text you can train on, because each location (time step) can produce at most one character.
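As a rough sketch of that calculation: assuming the usual CRNN convolutional stack (two 2x2 poolings with stride 2, two poolings with stride 1 and padding 1 along the width, and a final 2x2 convolution without padding), T follows directly from the image width. The function names below are hypothetical, and the layer layout is an assumption that should be checked against your own network definition.

```python
def crnn_time_steps(img_w: int) -> int:
    """Width of the feature map fed to the LSTM (assumed CRNN layout)."""
    w = img_w // 2   # pooling 2x2, stride 2
    w = w // 2       # pooling 2x2, stride 2
    w = w + 1        # pooling kernel 2, stride 1, padding 1 along the width
    w = w + 1        # same pooling again
    w = w - 1        # final convolution, kernel 2, no padding
    return w


def min_ctc_frames(label: str) -> int:
    """CTC needs one frame per character plus one blank between repeated neighbours."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def label_fits(label: str, img_w: int) -> bool:
    """True if the label can be aligned to the time steps of an image this wide."""
    return min_ctc_frames(label) <= crnn_time_steps(img_w)


if __name__ == "__main__":
    print(crnn_time_steps(100))
    print(label_fits("I love python", 100))
```

Under these assumptions a width of 100 gives 26 time steps, which would explain the length limit mentioned in the question; resizing to a larger width raises T without changing the network.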
Thank you so much for your help.
How can I recognize the blank between two English words? With my current model, if I input an English sentence, the output concatenates all the English words together. For example, input image:
recognized result:
A------l-ll-t-h--e--r-e-c--o--g-n-i-tiio---n--a-c-cc--ur-a--c-i-e-s---o--n--t-h---e => Alltherecognitionaccuraciesonthe
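For reference, the arrow above is the usual greedy CTC collapse: merge consecutive repeated symbols, then drop the CTC blank (shown here as `-`). A minimal sketch, with a hypothetical function name:

```python
def ctc_collapse(raw: str, blank: str = "-") -> str:
    """Merge consecutive duplicates, then remove the CTC blank symbol."""
    out = []
    prev = None
    for ch in raw:
        # Keep a symbol only when it differs from the previous frame
        # and is not the blank.
        if ch != prev and ch != blank:
            out.append(ch)
        prev = ch
    return "".join(out)


if __name__ == "__main__":
    raw = ("A------l-ll-t-h--e--r-e-c--o--g-n-i-tiio---n--a-c-cc--ur-a--c-i-e-s"
           "---o--n--t-h---e")
    print(ctc_collapse(raw))
```

Since the word space is not in the alphabet, no frame can ever emit it, so the collapsed string has no way to contain spaces.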
So how can I recognize the blank between two English words?
Try labelling the image as
all#the#recognition#accuracies#on#the
That is, replace every blank with # and add the character # to alphabets.py.
Then, wherever there is a blank, the net will output #. Replace each # with a blank afterwards and you get normal sentences.
You can try this, but I am not sure it will work.
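A minimal sketch of that placeholder idea, assuming # never occurs in the real text. The helper names are hypothetical and not part of the repository:

```python
PLACEHOLDER = "#"  # assumption: this character never appears in real labels


def to_training_label(text: str) -> str:
    """Replace word spaces with the placeholder before writing the label file."""
    if PLACEHOLDER in text:
        raise ValueError("label already contains the placeholder character")
    return text.replace(" ", PLACEHOLDER)


def to_sentence(prediction: str) -> str:
    """Turn the placeholder in a decoded prediction back into a space."""
    return prediction.replace(PLACEHOLDER, " ")


if __name__ == "__main__":
    label = to_training_label("all the recognition accuracies on the")
    print(label)
    print(to_sentence(label))
```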
Thanks for your reply. Maybe the blank itself can also be treated as a character, so for now I have decided not to replace blanks with #. Instead I will add the blank character itself to the alphabet and train on English sentences. I will report back with my results. Thank you so much.
@Holmeyoung
It is not necessary to replace the blank with #. Just treat the blank itself as one character and add it to the alphabet, then prepare English sentences as training data.
alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'. -,"
My training images:
The following is the training progress:
You can see that it works. Finally, I am very grateful to you for answering all my questions. Thank you again.
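A small sketch of how an alphabet that includes the space maps labels to class indices. It assumes index 0 is reserved for the CTC blank, which is a common convention but should be checked against the label converter you actually use. `encode` and `decode` are hypothetical helpers:

```python
alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'. -,"

# Assumption: class 0 is the CTC blank, so real characters start at 1.
char_to_index = {ch: i + 1 for i, ch in enumerate(alphabet)}


def encode(text: str) -> list:
    """Map each character, including the space, to its class index."""
    return [char_to_index[ch] for ch in text]


def decode(indices: list) -> str:
    """Map class indices back to characters, skipping the CTC blank (0)."""
    return "".join(alphabet[i - 1] for i in indices if i != 0)


if __name__ == "__main__":
    ids = encode("I love python")
    print(ids)
    print(decode(ids))
```

The word space is an ordinary class here, distinct from the CTC blank, so the network can emit it like any other character.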
@cvchongci Hi, I am also having problems with the spaces between English words. Could you please share your model? Thanks!
@ducbluee Hi, I trained the model on a very limited amount of synthetic data, so it does not work well on real-world images. You can follow the way I handle the blank instead.
Firstly, your code is great. I trained on the SynthText90k dataset and achieved very good performance on English words.
I have several questions and hope you can give me a hand. Thank you very much for your time.
How can I recognize the blank in a sentence? For example, I want to recognize "I love python", which has a blank between "I" and "love". How should I handle this? Should I just add the blank to the alphabet, like this, and prepare the training data accordingly?
alphabet = """0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ """
Can we recognize English and Chinese in one model? If so, how? Should I just make the alphabet contain all the English and Chinese characters, like this?
alphabet = """0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ是不我一有大在人了中到資..."""
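One way to get such a mixed alphabet without typing it by hand is to collect the distinct characters from the training labels. This is only a sketch with a hypothetical helper name and made-up sample labels:

```python
def build_alphabet(labels) -> str:
    """Collect every distinct character in the labels, in first-seen order."""
    seen = {}
    for label in labels:
        for ch in label:
            seen.setdefault(ch, None)  # dicts keep insertion order
    return "".join(seen)


if __name__ == "__main__":
    # Illustrative labels only, not real training data.
    sample_labels = ["I love python", "我是中国人", "abc 123"]
    print(build_alphabet(sample_labels))
```

Building the alphabet from the labels also guarantees that every character in the training set, including the space, has a class.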
What if we want to recognize very long sentences? Do you think it would be better to train on very long sentences, or can we just train on short ones? Your current model only supports text shorter than 26 characters, so I would have to modify the network to train on long sentences.