MichalBusta / DeepTextSpotter


size mismatch while training #76

Open ustczhouyu opened 5 years ago

ustczhouyu commented 5 years ago

Help, Help!

I added some layers to both model_cz.prototxt and tiny.prototxt. When I train the model with python train.py, an error occurs: ValueError: cannot reshape array of size 6204 into shape (22,1,141). The error is raised at line 260 of validation.py, which is ctc_f = ctc_f.reshape(ctc_f.shape[0], ctc_f.shape[1], ctc_f.shape[3]). Since 6204 = 22 * 2 * 141, I want to know whether I can change that line to ctc_f = ctc_f.reshape(ctc_f.shape[0], 2 * ctc_f.shape[1], ctc_f.shape[3]) so that the result has shape (22,2,141). Can somebody help me? Thank you very much.
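
The arithmetic behind this error can be reproduced with a minimal NumPy sketch. The shape (22, 1, 2, 141) is an assumption inferred from the error message, not taken from the repository, and the zero-filled array and the helper name drop_axis_2 are only stand-ins for illustration:

```python
import numpy as np

# Stand-in for the network output; the shape (22, 1, 2, 141) is an assumption
# inferred from the error message (6204 = 22 * 1 * 2 * 141).
ctc_f = np.zeros((22, 1, 2, 141), dtype=np.float32)
print(ctc_f.size)  # 6204


def drop_axis_2(arr):
    """Reshape as in the quoted line: keep axes 0, 1 and 3, drop axis 2."""
    return arr.reshape(arr.shape[0], arr.shape[1], arr.shape[3])


try:
    drop_axis_2(ctc_f)
    failed = False
except ValueError as exc:
    # Axis 2 has size 2, so the element counts do not match.
    failed = True
    print("reshape failed:", exc)

# The proposed change runs, but axis 1 now holds the two rows.
proposed = ctc_f.reshape(ctc_f.shape[0], 2 * ctc_f.shape[1], ctc_f.shape[3])
print(proposed.shape)  # (22, 2, 141)

# With axis 2 of size 1 the original line works unchanged.
ok = drop_axis_2(np.zeros((22, 1, 1, 141), dtype=np.float32))
print(ok.shape)  # (22, 1, 141)
```

So the proposed reshape does execute, but it keeps two rows per sample instead of producing a single sequence; the original line only works when the dropped axis has size 1.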

MichalBusta commented 5 years ago

Hi, the OCR output should reduce the height dimension to one, so you end up with just a 2D sequence (N C 1 W). So check the padding etc. on the new layers ...
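
One way to check the padding is to trace the height through the layer stack by hand. The sketch below uses Caffe's output-size rules (Convolution rounds down, Pooling rounds up); the layer lists and the input height of 32 are hypothetical and are not the contents of model_cz.prototxt:

```python
import math


def conv_out(size, kernel, pad=0, stride=1, dilation=1):
    """Caffe Convolution output size along one axis (rounds down)."""
    return (size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


def pool_out(size, kernel, pad=0, stride=1):
    """Caffe Pooling output size along one axis (rounds up)."""
    out = math.ceil((size + 2 * pad - kernel) / stride) + 1
    # Caffe drops the last window if it would start inside the padding.
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1
    return out


def trace_height(height, layers):
    """Return the height after each (kind, kernel_h, pad_h, stride_h) layer."""
    heights = []
    for kind, kernel, pad, stride in layers:
        fn = conv_out if kind == "conv" else pool_out
        height = fn(height, kernel, pad=pad, stride=stride)
        heights.append(height)
    return heights


# Hypothetical stack: 3x3 convs with pad 1 keep the height, 2x2 pools halve it.
body = [("conv", 3, 1, 1), ("pool", 2, 0, 2)] * 4

# Final conv with kernel_h=2 and no padding collapses height 2 to 1.
good = trace_height(32, body + [("conv", 2, 0, 1)])
# Final conv with kernel_h=3 and pad_h=1 leaves the height at 2.
bad = trace_height(32, body + [("conv", 3, 1, 1)])

print(good)  # [32, 16, 16, 8, 8, 4, 4, 2, 1]
print(bad)   # [32, 16, 16, 8, 8, 4, 4, 2, 2]
```

Plugging in the kernel, pad and stride values of the newly added layers shows which one leaves the final height at 2 instead of 1.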

ustczhouyu commented 5 years ago

Hi Michal Busta, I want to re-implement your model and also add an attention mechanism. I have run into some difficulties and would appreciate your help.

  1. What is the GenericVocabulary.txt loaded in validation.py (quoted below), and what should I use if I want to validate on icdar2013-test? Is it OK to set it to synthtext/dict_voc.txt?

    def validate(nets, dataloader, image_size = [480, 480], split_words = True):
        cmp_trie.load_dict('/home/busta/data/icdar2013-Test/GenericVocabulary.txt')

  2. If I want to replace CTC with an LSTM layer plus an attention mechanism in model_cz.prototxt, do I need to change anything else (for example, the loss function)?

Sincerely hope to get your reply, thank you!


ustczhouyu commented 5 years ago

Hi Michal Busta, I want to re-implement your model and also add an attention mechanism. I have run into some difficulties and would appreciate your help.

  1. How do I pre-train the detection CNN on the SynthText dataset and the recognition CNN on the Synthetic Word dataset? I only know that python train.py trains both the detection CNN and the recognition CNN, but I don't know how to train them individually.

  2. What is the GenericVocabulary.txt loaded in validation.py (quoted below), and what should I use if I want to validate on icdar2013-test? Is it OK to set it to synthtext/dict_voc.txt?

    def validate(nets, dataloader, image_size = [480, 480], split_words = True):
        cmp_trie.load_dict('/home/busta/data/icdar2013-Test/GenericVocabulary.txt')

  3. If I want to replace CTC with an LSTM layer plus an attention mechanism in model_cz.prototxt, do I need to change anything else (for example, the loss function)?

Sincerely hope to get your reply, thank you!
