Open tjpulfn opened 5 years ago
The pretrained model uses English letters I'm afraid. If you wish to use it with Chinese you will have to retrain it.
Yes, I trained the model with Chinese, but the loss is still high, for example:
[21] Iteration loss: 388.9223213195801
[22] Iteration loss: 386.60620498657227
[23] Iteration loss: 384.27929306030273
[24] Iteration loss: 382.09375
[25] Iteration loss: 380.0574035644531
[26] Iteration loss: 378.1801071166992
[27] Iteration loss: 376.39180755615234
[28] Iteration loss: 374.7563133239746
Training is also very slow. Is this normal?
Yes, the network itself takes quite a long time to train, I'm afraid.
Also, the data feeding system I used here (custom batches + feed_dict) is terrible, which makes training even slower.
Hello. After training the model with Chinese, the result printed at https://github.com/Belval/CRNN/blob/51b2ebe6c8d7a0dec6df1339bf404507301229a3/CRNN/crnn.py#L195 is at first:
天 败 袁 唐 铛 董 撰按漱按氯按氯按网按网残 蒿 爵 樟 怕 鸿 狙 按氯按氯按氯按氯按网按网 柯 岸 邱 亚 可 块 悯哥喻哥 哥 哥 贝 纳 鲤 讯 濯 捕 纣悯哥沃哥 沃 沃 铃 丫 征 贿 琴 使 齐 齐 齐 齐 昙 常 美 养 明 圆 齐 齐 齐
Later on, the result is:
锯 祷 沫 官 声 慢
玲 交 讽 砷 菇 蜡
质 唁 伪 咏 袋 紫
砸 头 倦 哨 躬 液
泪 轿 沼 厩 浑 充
士 救 晏 莎 辅 宋
矢 拖 流 慷 稗 桥
阿 逮 凤 杀 翊 款
腰 有 叨 更 丐 蜈
悼 闺 咋 询 嘁 咋
芯 侣 玉 奏 钠 伶
纬 构 邮 谅 指 竟
箴 廉 妆 坚 叔 隔
and 'decoded' is empty. Why is that? What am I doing wrong?
Hi,
Make sure that you edited the CHAR_VECTOR string before training.
Regards
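As a rough illustration of that advice, here is a minimal sketch of how one might build CHAR_VECTOR from the training labels and check that every label character is covered before training. The helper names (build_char_vector, missing_chars, label_to_array) are hypothetical and not part of the repository; label_to_array only mirrors the style of lookup CHAR_VECTOR.index(x), which raises "ValueError: substring not found" for any character that is missing from the string.

```python
def build_char_vector(labels):
    """Collect every distinct character used in the labels, in sorted order."""
    return "".join(sorted(set("".join(labels))))


def missing_chars(labels, char_vector):
    """Return the label characters that are absent from char_vector."""
    return sorted({ch for label in labels for ch in label if ch not in char_vector})


def label_to_array(label, char_vector):
    """Map each character to its index; str.index raises ValueError if absent."""
    return [char_vector.index(ch) for ch in label]


# Hypothetical labels, used only for illustration.
labels = ["天气", "你好", "abc"]
char_vector = build_char_vector(labels)

# An alphabet built from the labels covers all of them.
print(missing_chars(labels, char_vector))

# A Latin-only alphabet does not cover Chinese labels.
print(missing_chars(["你好"], "abcdefghijklmnopqrstuvwxyz"))
```

Running a check like missing_chars over the whole training set before starting a long training run is a cheap way to catch an incomplete CHAR_VECTOR early.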
@tjpulfn Hi! I have a similar problem when I train this model with Chinese. Can you tell me how you finally solved it? Thank you!
I used the pretrained model under Windows 10 with the command
python run.py -ex ..\samples --test --restore
and got an empty result. The numbers 1-10 below are the files counted during data loading, but nothing is printed after the test prediction.
Loading data 1 2 3 4 5 6 7 8 9 10 Testing
Hello, sorry to trouble you again. Can the pretrained model be tested on Chinese? When I test with Chinese, I get this error:
File "/Users/liufengnan/workspace/OCR/CRNN/CRNN/utils.py", line 48, in <listcomp>
    return [config.CHAR_VECTOR.index(x) for x in label]
ValueError: substring not found
Then I changed CHAR_VECTOR in config.py to use Chinese characters, and now I get a shape error:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [512,3992] rhs shape= [512,70]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@W"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](W, save/RestoreV2)]]
Also, can you understand my English? It is not very good.
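One way to read the shape error above: the variable W in the saved checkpoint has 70 output columns, while the graph built from the new CHAR_VECTOR expects 3992, so the pretrained weights cannot be restored into it. The sketch below only works through that arithmetic. It assumes the output layer has len(CHAR_VECTOR) + 1 units (one extra for the CTC blank), which is consistent with the numbers in the traceback but is an assumption here; num_classes and check_restore_compatible are hypothetical helpers, not functions from the repository.

```python
def num_classes(char_vector):
    """Assumed output size: one unit per character plus one CTC blank."""
    return len(char_vector) + 1


def check_restore_compatible(checkpoint_shape, char_vector, hidden_units=512):
    """Compare a checkpoint's output-weight shape with what the graph expects.

    Returns (is_compatible, expected_shape).
    """
    expected = (hidden_units, num_classes(char_vector))
    return tuple(checkpoint_shape) == expected, expected


# Placeholder alphabet of 3991 distinct CJK characters (illustration only).
chinese_vector = "".join(chr(0x4E00 + i) for i in range(3991))

ok, expected = check_restore_compatible([512, 70], chinese_vector)
print(ok, expected)
```

Under that assumption, a checkpoint trained on a 69-character alphabet can only be restored into a graph built with an alphabet of the same size, so switching to Chinese means training from scratch with the new CHAR_VECTOR rather than passing --restore with the English checkpoint.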