Belval / CRNN

A TensorFlow implementation of https://github.com/bgshih/crnn
MIT License

pretrained model #31

Open tjpulfn opened 5 years ago

tjpulfn commented 5 years ago

Hello, I have to trouble you again. Can the pretrained model be tested on Chinese text? When I test with Chinese I get this error:

File "/Users/liufengnan/workspace/OCR/CRNN/CRNN/utils.py", line 48, in <listcomp>
    return [config.CHAR_VECTOR.index(x) for x in label]
ValueError: substring not found

I then changed CHAR_VECTOR in config.py to use Chinese characters, but restoring the checkpoint now fails with a shape mismatch:

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [512,3992] rhs shape= [512,70]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@W"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](W, save/RestoreV2)]]

I hope you can understand my English; it is not very good.

Belval commented 5 years ago

The pretrained model uses English letters I'm afraid. If you wish to use it with Chinese you will have to retrain it.
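For context, the shape mismatch in the error above comes from the output layer: its width is derived from the length of CHAR_VECTOR, so the English checkpoint (70 output units) cannot be restored into a graph built for a 3992-class Chinese charset. A rough sketch of the change, assuming a config.py along these lines (only CHAR_VECTOR is known to exist in the repo; the other name is illustrative):

```python
# config.py -- sketch only; NUM_CLASSES is an illustrative name, not
# necessarily what this repo calls it.

# Every character that can appear in a training label must be listed here.
# The size of the final [512, N] weight in the error above is derived from
# this string, which is why the English checkpoint (N = 70) can no longer
# be restored once the charset is swapped for Chinese.
CHAR_VECTOR = "的一是了不在有人这他们中来上大..."  # your full Chinese character set

NUM_CLASSES = len(CHAR_VECTOR) + 1  # +1 for the CTC blank symbol
```

With the new charset in place, train from scratch instead of restoring the English checkpoint.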

tjpulfn commented 5 years ago

Yes, I am training the model with Chinese, but the loss stays high. For example:

[21] Iteration loss: 388.9223213195801
[22] Iteration loss: 386.60620498657227
[23] Iteration loss: 384.27929306030273
[24] Iteration loss: 382.09375
[25] Iteration loss: 380.0574035644531
[26] Iteration loss: 378.1801071166992
[27] Iteration loss: 376.39180755615234
[28] Iteration loss: 374.7563133239746

Also, training is very, very slow. Is this normal?

Belval commented 5 years ago

Yes, the network itself takes quite a long time to train, I'm afraid.

Also, the data feeding system I used here (custom batches + feed_dict) is terrible, so training is slow.
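If the feed_dict pipeline is the bottleneck, the usual TF1-era replacement is a tf.data input pipeline that feeds tensors directly into the graph. A minimal sketch (the generator, shapes and batch size are placeholders, not taken from this repo):

```python
import numpy as np
import tensorflow as tf

def sample_generator():
    # Placeholder generator: replace with this repo's own image/label loading.
    # It must yield fixed dtypes; here images are HxWx1 float32 and labels are
    # padded int32 code vectors.
    for _ in range(1000):
        yield np.zeros((32, 100, 1), np.float32), np.zeros((25,), np.int32)

dataset = (
    tf.data.Dataset.from_generator(
        sample_generator,
        output_types=(tf.float32, tf.int32),
        output_shapes=((32, 100, 1), (25,)),
    )
    .shuffle(buffer_size=512)
    .batch(64)
    .prefetch(2)  # overlap data preparation with training steps
)

# TF1 style: build the graph on the iterator's tensors instead of feeding
# placeholders through feed_dict at every step.
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()
# Note: tf.nn.ctc_loss expects labels as a SparseTensor, so the padded labels
# above would still need to be converted before computing the loss.
```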

tjpulfn commented 5 years ago

Hello. When training the model with Chinese, at first the decode at https://github.com/Belval/CRNN/blob/51b2ebe6c8d7a0dec6df1339bf404507301229a3/CRNN/crnn.py#L195 gives results like: 天 败 袁 唐 铛 董 撰按漱按氯按氯按网按网残 蒿 爵 樟 怕 鸿 狙 按氯按氯按氯按氯按网按网 柯 岸 邱 亚 可 块 悯哥喻哥 哥 哥 贝 纳 鲤 讯 濯 捕 纣悯哥沃哥 沃 沃 铃 丫 征 贿 琴 使 齐 齐 齐 齐 昙 常 美 养 明 圆 齐 齐 齐

and then the result is:

玲 交 讽 砷 菇 蜡

质 唁 伪 咏 袋 紫

砸 头 倦 哨 躬 液

泪 轿 沼 厩 浑 充

士 救 晏 莎 辅 宋

矢 拖 流 慷 稗 桥

阿 逮 凤 杀 翊 款

腰 有 叨 更 丐 蜈

悼 闺 咋 询 嘁 咋

芯 侣 玉 奏 钠 伶

纬 构 邮 谅 指 竟

箴 廉 妆 坚 叔 隔

but 'decoded' is empty. Why is that? What am I doing wrong?

Belval commented 5 years ago

Hi,

Make sure that you edited the CHAR_VECTOR string before training.

Regards
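One way to catch this before training is to check that every character in the labels is covered by CHAR_VECTOR, since a missing character both triggers the "substring not found" error and breaks the label encoding. A small sanity-check sketch (the import path and label list are assumptions, not the repo's actual loader):

```python
# Assumes config.py (the file defining CHAR_VECTOR) is importable like this.
import config

def find_missing_chars(labels):
    """Return the set of label characters that are not in CHAR_VECTOR."""
    charset = set(config.CHAR_VECTOR)
    return {c for label in labels for c in label if c not in charset}

# Hypothetical usage: `labels` is whatever list of ground-truth strings you train on.
labels = ["锯祷沫官声慢", "玲交讽砷菇蜡"]
missing = find_missing_chars(labels)
if missing:
    print("Add these characters to CHAR_VECTOR:", "".join(sorted(missing)))
```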

StromWine commented 5 years ago

@tjpulfn Hi! I have a similar problem when I use this model to train with Chinese. Could you tell me how you eventually solved it? Thank you!

kienchen commented 5 years ago

I used the pretrained model under Windows 10 with: python run.py -ex ..\samples --test --restore. The result is empty. The numbers 1 through 10 are printed while the data is loading, but no prediction is output after Testing:

Loading data
1 2 3 4 5 6 7 8 9 10
Testing