@MichalBusta
you can add:

```python
import sys, traceback
traceback.print_exc(file=sys.stdout)
```

after the `except:` to see what is going on.
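For context, this is roughly where those two lines would go; the surrounding try/except and the function being called are hypothetical stand-ins for the failing code:

```python
import sys, traceback

try:
    run_ocr_test()  # hypothetical stand-in for the call that fails silently
except:
    # print the full stack trace instead of swallowing the error
    traceback.print_exc(file=sys.stdout)
```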
Thank you so much for your reply! It turned out that in ocr_test_utils.py the function returns four values but only three were being unpacked; once I added one more output variable to `det_text, conf, decs = print_seq_ext()`, the problem was solved! Thank you~
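(For anyone hitting the same ValueError: this is the general pattern, with a made-up four-value function standing in for `print_seq_ext`, whose exact signature may differ:)

```python
def returns_four():  # stand-in for a function like print_seq_ext that returns four values
    return 'text', 0.9, [1, 2], None

# det_text, conf, decs = returns_four()   # would raise ValueError: too many values to unpack
det_text, conf, decs, _ = returns_four()  # works: the fourth value is captured and discarded
```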
Besides, I have one question to ask for your help~ Can I use the OCR model to train a separate word recognition model (i.e. just to perform the recognition task)? If I can do this, how large a dataset should I have? (I just want to test on the ICDAR 2015 word recognition dataset.) Thank you! @MichalBusta
Could you please give me some advice? Thanks a lot~ @MichalBusta
Hi,

> Can I just use the OCR model to train a separate word recognition model (just to achieve the recognition task)?

Sure, there is https://github.com/MichalBusta/E2E-MLT/blob/master/train_ocr.py, a script just for training the OCR module.
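(A sketch of how running it might look; the flag names below are placeholders, not necessarily the script's real arguments; check `python train_ocr.py -h` for the actual options:)

```bash
# Placeholder flags for illustration only; consult `python train_ocr.py -h` for the real arguments.
python train_ocr.py -train_list data/word_crops_train.txt -batch_size 8
```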
> If I can do this, how large a dataset should I have? (I just want to test on the ICDAR 2015 word recognition dataset.)

Hard to say; it depends on the data. (Synthetic images: the VGG group used 9 million images covering 90k English words for the ICDAR 2013 dataset; real images: ~100,000 from ICDAR 2015 and ICDAR 2017 MLT will give you quite a good baseline...)
Hello, when I use the datasets from IC15, IC17-MLT and IC19-MLT (only the Latin words, about 70,000 word images) and run only train_ocr.py, the accuracy on the IC15 test word dataset only reaches 56.3%. The batch size I used was 4. Could you please give me some advice on why the accuracy is so low, and what I should do to improve it?
Thanks a lot~ @MichalBusta
No easy answer, sorry :)
OK, I will try. Thanks so much for your reply~
@MichalBusta I also have some questions regarding your text recognition model.
1/ For Latin languages, did you train your text recognition models with single-word images only (i.e. no text lines)?
2/ How many images did you train your text recognition model with?
3/ Your text recognition model seems to use a ResNet-like structure. Since this project focuses on real-time performance, have you tried training your text recognition model with a MobileNet backbone?
> 1/ For Latin languages, did you train your text recognition models with single-word images only (i.e. no text lines)?

Most of the images in the datasets used are word-level; some of the generated images are line-level.
> 2/ How many images did you train your text recognition model with?

Sorry, I'm travelling so I cannot give you an exact number, but it is about 500k.
> 3/ Your text recognition model seems to use a ResNet-like structure.

No, the text detection is ResNet-like; the recognition is plain VGG style.

> Since this project focuses on real-time, have you tried to train your text recognition model with a MobileNet backbone?

No, it is about the hardware: we were targeting low-end GPUs.
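(For readers unsure about the distinction, here is a minimal PyTorch sketch of the two styles; this is purely illustrative, not the actual E2E-MLT layers:)

```python
import torch.nn as nn

# VGG style: plain stacked 3x3 convolutions, no skip connections.
vgg_block = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
)

# ResNet style: the same convolutions, but with a residual (skip) connection.
class ResBlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)  # the skip connection is what makes it ResNet-like
```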
Thanks @MichalBusta! One more thing: for the 500k images, does it consist of equal shares of Arabic, Bangla, Chinese, Japanese, Korean and Latin (i.e. ~80k images per script)?
+/-: we used more real Latin and Chinese data, since datasets exist for those; for the synthetic data it was an equal split.
Hello, I'm sorry to bother you. When I run train_ocr.py, it keeps printing a bad-image message. The validation dataset I used is part of the IC15 test word images. Could you please tell me why this happens? Thanks a lot!!!
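One common cause (an assumption; the exact check in the repo's data loader isn't shown here) is image files that OpenCV cannot decode. A quick way to find them in your validation list:

```python
import cv2

# Scan an image list and report every file OpenCV fails to decode.
# 'val_list.txt' is a hypothetical file with one image path per line.
with open('val_list.txt') as f:
    for path in (line.strip() for line in f):
        if cv2.imread(path) is None:
            print('bad image:', path)
```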