PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

On Windows (CPU version), acc stays at 0.000000 when training on my own dataset (15 samples) #9993

Closed Shy280 closed 1 year ago

Shy280 commented 1 year ago

While training a handwritten text recognition model, acc stays at 0 the whole time, with no warning or error of any kind. The pretrained model is ch_ppocr_server_v2.0_rec_pre, the one officially provided by PaddleOCR. Here is my yml config file:

    Global:
      use_gpu: False
      epoch_num: 100
      log_smooth_window: 20
      print_batch_step: 1
      save_model_dir: ./output/rec_chinese_common_v2.0
      save_epoch_step: 3
      # evaluation is run every 5000 iterations after the 4000th iteration
      eval_batch_step: [0, 2000]
      # if pretrained_model is saved in static mode, load_static_weights must set to True
      cal_metric_during_train: True
      pretrained_model: D:\java\PaddleOCR-train\trains\chPPocr\best_accuracy
      checkpoints:
      save_inference_dir:
      use_visualdl: False
      infer_img: doc/imgs_words/ch/word_1.jpg
      # for data or label process
      character_dict_path: D:\java\PaddleOCR-train\ppocr\utils\ppocr_keys_v1.txt
      character_type: ch
      max_text_length: 1000
      infer_mode: False
      use_space_char: True
      save_res_path: ./output/rec/predicts_r34_vd_none_bilstm_ctc.txt

    Optimizer:
      name: Adam
      beta1: 0.9
      beta2: 0.999
      lr:
        name: Cosine
        learning_rate: 0.001
      regularizer:
        name: 'L2'
        factor: 0.00004

    Architecture:
      model_type: rec
      algorithm: CRNN
      Transform:
      Backbone:
        name: ResNet
        layers: 34
      Neck:
        name: SequenceEncoder
        encoder_type: rnn
        hidden_size: 256
      Head:
        name: CTCHead
        fc_decay: 0.00004

    Loss:
      name: CTCLoss

    PostProcess:
      name: CTCLabelDecode

    Metric:
      name: RecMetric
      main_indicator: acc

    Train:
      dataset:
        name: SimpleDataSet
        data_dir: D:\java\PaddleOCR-train\trains\recTrain
        label_file_list: D:\java\PaddleOCR-train\trains\recTrain\Label.txt
        transforms:

    Eval:
      dataset:
        name: SimpleDataSet
        data_dir: D:\java\PaddleOCR-train\trains\recTrain
        label_file_list: D:\java\PaddleOCR-train\trains\recTrain\Label-test.txt
        transforms:
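As a rough sanity check on the schedule settings in this config: the config comment describes `eval_batch_step` as a start iteration plus an interval, and the issue title mentions 15 samples. The sketch below estimates how many training iterations 100 epochs would yield and at which steps an evaluation would fall. The batch size does not appear in the pasted config, so `batch_size=8` is purely an assumed value, and both helper functions are hypothetical illustrations, not PaddleOCR APIs.

```python
import math


def total_iterations(num_samples, batch_size, epochs, drop_last=False):
    """Total optimizer steps over all epochs for a fixed-size dataset."""
    if drop_last:
        per_epoch = num_samples // batch_size
    else:
        per_epoch = math.ceil(num_samples / batch_size)
    return per_epoch * epochs


def eval_steps(total, start, interval):
    """Steps at which evaluation would run, read as 'every `interval`
    iterations after iteration `start`' (per the config comment)."""
    return list(range(start + interval, total + 1, interval))


# 15 samples and epoch_num: 100 come from the issue; batch_size=8 is an assumption.
total = total_iterations(num_samples=15, batch_size=8, epochs=100)
print(total, eval_steps(total, start=0, interval=2000))  # 200 []
```

Under these assumptions the run never reaches iteration 2000, so the acc shown in the log would presumably be the training-batch metric enabled by `cal_metric_during_train`, not a score on the Eval set.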

I tried waiting it out, but after 12 of the 100 epochs acc was still 0.000000. Here is my label file, generated with the annotation tool that ships with PaddleOCR:

    imgs/lgl.jpg [{"transcription": "刘光烈", "points": [[12, 9], [154, 9], [154, 66], [12, 66]], "difficult": false}]
    imgs/lgw.jpg [{"transcription": "罗国伟", "points": [[8, 0], [168, 5], [170, 75], [11, 76]], "difficult": false}]
    imgs/pys.jpg [{"transcription": "番禺所", "points": [[7, 9], [111, 15], [109, 56], [5, 50]], "difficult": false}]
    imgs/xmjl.jpg [{"transcription": "项目经理", "points": [[2, 5], [131, 2], [139, 44], [2, 49]], "difficult": false}]
    imgs/ywjswgn.jpg [{"transcription": "遗忘就是我给你", "points": [[9, 20], [401, 24], [400, 89], [1, 87]], "difficult": false}]
    imgs/zhdjn.jpg [{"transcription": "最好的纪念", "points": [[21, 20], [283, 7], [287, 93], [24, 105]], "difficult": false}]
    imgs/ztdz.jpg [{"transcription": "众通电子", "points": [[6, 7], [151, 7], [151, 52], [6, 52]], "difficult": false}]
    imgs/zy.jpg [{"transcription": "张瑜", "points": [[7, 0], [110, 0], [110, 57], [7, 57]], "difficult": false}]
    imgs/zzsj.jpg [{"transcription": "总支书记", "points": [[10, 7], [204, 7], [204, 70], [10, 70]], "difficult": false}]
    imgs/gzs.png [{"transcription": "广州所", "points": [[2, 6], [132, 6], [132, 59], [2, 59]], "difficult": false}]
    imgs/gzzx.jpg [{"transcription": "广州中心", "points": [[15, 9], [182, 9], [182, 65], [15, 65]], "difficult": false}]
    imgs/gzzxx.jpg [{"transcription": "广州中心", "points": [[11, 8], [197, 8], [197, 73], [11, 73]], "difficult": false}]
    imgs/hdrbxyzj.jpg [{"transcription": "很多人不需要再见", "points": [[17, 3], [401, 3], [401, 68], [17, 68]], "difficult": false}]
    imgs/sj.jpg [{"transcription": "书记", "points": [[25, 10], [94, 10], [94, 42], [25, 42]], "difficult": false}]
    imgs/ywzslgey.jpg [{"transcription": "因为只是路过而已", "points": [[20, 14], [401, 14], [401, 76], [20, 76]], "difficult": false}]

The test file uses the same format. I don't understand what is causing this. I also tried the icdar2015 dataset, but acc was 0.000000 there as well. Has anyone else run into this?
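One thing that may be worth checking, though nothing in this thread confirms it as the cause: the label lines in this post are in the detection-style annotation format (an image path followed by a JSON list of boxes), whereas PaddleOCR's recognition training docs describe recognition labels as `image_path<TAB>text`, one sample per line. Below is a minimal sketch of converting one format to the other, assuming every image is already a cropped single-line text image with exactly one box. `detection_line_to_rec_line` is a hypothetical helper, not part of PaddleOCR.

```python
import json


def detection_line_to_rec_line(line: str) -> str:
    """Turn 'path [{"transcription": ..., "points": ...}]' into 'path<TAB>text'.

    Assumes the path contains no whitespace and the image holds one text box.
    """
    # Split on the first run of whitespace (tab or space) between path and JSON.
    path, boxes_json = line.strip().split(None, 1)
    boxes = json.loads(boxes_json)
    if len(boxes) != 1:
        raise ValueError(f"{path}: expected exactly 1 box, found {len(boxes)}")
    return f"{path}\t{boxes[0]['transcription']}"


# Sample line copied from the label file in this issue.
sample = ('imgs/zy.jpg [{"transcription": "张瑜", "points": '
          '[[7, 0], [110, 0], [110, 57], [7, 57]], "difficult": false}]')
print(detection_line_to_rec_line(sample))
```

If an image contained several boxes, each region would need to be cropped into its own image first; that case is deliberately rejected here rather than guessed at.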

Shy280 commented 1 year ago

@WenmuZhou @tink2123

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.