Holmeyoung / crnn-pytorch

Pytorch implementation of CRNN (CNN + RNN + CTCLoss) for all language OCR.
MIT License

Demo phase #9

Closed by mariembenslama 5 years ago

mariembenslama commented 5 years ago

After training the model, I would like to ask what happens when we give it a new image to recognize.

Can it recognize any combination of characters in an image? I mean, even text the model has never seen (been trained on) before?

For example, suppose we trained the model only on something like this:

AAA BBB CCC

Then, in the demo phase, if we give it an image that contains ABC, will it be able to recognize it, or can it only recognize something it saw before?

Holmeyoung commented 5 years ago

Hi, yeah, it can recognize ABC!

  1. Suppose we had to cover every possible string. Then with len(alphabet) = 3000 and txt_len = 10, we would need 3000^10 (about 5.9 × 10^34) samples. Of course, that's impossible.

  2. You can think of each label as being split into characters, which means AAA is treated as A, A, A, and the same goes for B, C... The reason we still need more data is that the net has to learn the right features for each character of the alphabet and to tell different characters apart. Look at the encode function: you can see that the labels are split and handled character by character.

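    import torch

    # Method of the converter class; self.dict maps each character to its index.
    def encode(self, text):
        length = []  # number of characters in each label
        result = []  # indices of all characters, flattened across labels
        for item in text:
            # Labels may arrive as bytes; decode them to str first.
            if isinstance(item, bytes):
                item = item.decode('utf-8', 'strict')
            length.append(len(item))
            # Each character is looked up on its own, independent of its neighbours.
            for char in item:
                index = self.dict[char]
                result.append(index)

        return (torch.IntTensor(result), torch.IntTensor(length))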
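
To make the two points above concrete, here is a minimal standalone sketch (plain Python, no torch). The three-letter alphabet, the `char_to_index` mapping, and the `encode_labels` helper are hypothetical names chosen for illustration, not the repository's code. Starting the indices at 1 is also an assumption, made on the premise that index 0 is reserved for the CTC blank.

```python
# Hypothetical three-letter alphabet. Indices start at 1 on the assumption
# that 0 is reserved for the CTC blank.
alphabet = "ABC"
char_to_index = {char: i + 1 for i, char in enumerate(alphabet)}


def encode_labels(labels):
    """Flatten labels into per-character indices and record each label's length."""
    indices, lengths = [], []
    for label in labels:
        lengths.append(len(label))
        indices.extend(char_to_index[char] for char in label)
    return indices, lengths


# Labels seen during training versus a label seen only in the demo phase.
train_indices, train_lengths = encode_labels(["AAA", "BBB", "CCC"])
demo_indices, demo_lengths = encode_labels(["ABC"])

# Every index needed for the unseen label already occurs in the training labels.
unseen_is_covered = set(demo_indices) <= set(train_indices)

# Number of distinct strings if every full label had to be seen in training:
# len(alphabet) ** txt_len, with the figures quoted above.
combinations = 3000 ** 10

print(demo_indices, demo_lengths, unseen_is_covered, combinations)
```

Since "ABC" is built from the same per-character indices as "AAA", "BBB" and "CCC", the training targets already cover every class the demo label needs. Whether the network actually reads the image correctly still depends on how well it has learned each character's features.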

mariembenslama commented 5 years ago

I see, so that's the work of CNN and RNN combined together, right?

Holmeyoung commented 5 years ago

Yes, you are right!