PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Apache License 2.0
38.99k stars 7.32k forks

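In a det task, evaluation during training reaches about 30%, but loading the model after training gives inference results that are all 0. What could cause this? #12015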

Closed shaoping1t closed 1 week ago

shaoping1t commented 2 weeks ago

W0427 15:22:39.669593 1401 gpu_resources.cc:217] WARNING: device: . The installed Paddle is compiled with CUDNN 8.2, but CUDNN version in your machine is 8.1, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.

Could this version mismatch have an effect?
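The thread does not settle whether the cuDNN mismatch matters here. As a first step, the two versions can be pulled out of the warning and compared. The following is a minimal sketch that only parses the log line quoted above; the function names are made up for illustration, and it does not query Paddle or the GPU. Paddle's own installation check, `paddle.utils.run_check()`, is the more direct test on a machine with Paddle installed.

```python
import re

# Matches the two cuDNN versions quoted in the Paddle startup warning.
_CUDNN_WARNING = re.compile(
    r"compiled with CUDNN (\d+)\.(\d+).*?"
    r"CUDNN version in your machine is (\d+)\.(\d+)"
)


def cudnn_versions_from_warning(line):
    """Return ((compiled), (installed)) as (major, minor) tuples, or None."""
    match = _CUDNN_WARNING.search(line)
    if match is None:
        return None
    c_major, c_minor, i_major, i_minor = (int(g) for g in match.groups())
    return (c_major, c_minor), (i_major, i_minor)


def installed_is_older(line):
    """True if the machine's cuDNN is older than the one Paddle was built with."""
    versions = cudnn_versions_from_warning(line)
    if versions is None:
        return False
    compiled, installed = versions
    return installed < compiled


warning = (
    "The installed Paddle is compiled with CUDNN 8.2, but CUDNN version "
    "in your machine is 8.1, which may cause serious incompatible bug."
)
print(cudnn_versions_from_warning(warning))
print(installed_is_older(warning))
```

An older installed cuDNN than the compiled one is the case the warning describes; whether it explains the all-zero metrics is a separate question that this check cannot answer.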

UserWangZz commented 2 weeks ago

Could you share the command you ran, the config, and any other relevant details?

shaoping1t commented 2 weeks ago

> Could you share the command you ran, the config, and any other relevant details?

Global:
  debug: false
  use_gpu: true
  epoch_num: 1
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/det_r50_sync800k/
  save_epoch_step: 10
  eval_batch_step: 2500
  cal_metric_during_train: false
  pretrained_model: /root/autodl-tmp/res_pre.pdparams
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: /root/test_data/test_tu/
  save_res_path: ./output/det_db_sync/predicts_db.txt

Architecture:
  model_type: det
  algorithm: BifDB
  Transform: null
  Backbone:
    name: IFNet
    channels: 3
    layers: 50
    dcn_stage: [ False, True, True, True ]
  Neck:
    name: BiFPN
    out_channels: 256
    use_asf: True
  Head:
    name: BiDBHead
    k: 50

Loss:
  name: BiDBLoss
  balance_loss: true
  main_loss_type: BCELoss
  alpha: 5
  beta: 12
  ohem_ratio: 3

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.99
  epsilon: 1.e-8
  weight_decay: 0.0001
  lr:
    name: Cosine
    # name: DecayLearningRate
    learning_rate: 0.00035
    # factor: 0.999999
    # warmup_epoch: 0

# Optimizer:
#   name: Adam
#   beta1: 0.9
#   beta2: 0.999
#   epsilon: 0.00000008
#   lr:
#     name: DecayLearningRate
#     learning_rate: 0.00025
#     epochs: 1
#     factor: 0.9
#     end_lr: 0.00005
#   weight_decay: 0.0
#   warmup_epoch: 0

PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.5
  max_candidates: 1000
  unclip_ratio: 1.5
  det_box_type: 'quad' # 'quad' or 'poly'

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /root/autodl-tmp/dec_data/sync/SynthText/
    label_file_list:

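Hello, please take a look. I have not run into this problem before. I added some custom modules. Evaluation during training is normal, but when I run eval directly, all metrics are 0. Saving and loading the weights works without any problem. Could something be going wrong when my custom modules are initialized?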
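One way to test the custom-module hypothesis is to compare the parameter names and shapes the model expects against those stored in the saved weights. The sketch below does this on plain dictionaries mapping parameter names to shapes, so it runs without Paddle. The key names and shapes in the example are hypothetical and are not taken from the model in this issue. With Paddle, the two mappings would typically be built from `model.state_dict()` and from the object returned by `paddle.load(path)`.

```python
def compare_state_keys(model_state, checkpoint_state):
    """Compare two {parameter_name: shape} mappings.

    Returns three sorted lists:
      missing        - names the model expects but the checkpoint lacks
      unexpected     - names in the checkpoint the model does not define
      shape_mismatch - names present in both with different shapes
    """
    model_keys = set(model_state)
    ckpt_keys = set(checkpoint_state)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    shape_mismatch = sorted(
        name
        for name in model_keys & ckpt_keys
        if tuple(model_state[name]) != tuple(checkpoint_state[name])
    )
    return missing, unexpected, shape_mismatch


# Hypothetical parameter names and shapes, for illustration only.
model_state = {
    "backbone.conv1.weight": (64, 3, 7, 7),
    "neck.asf.weight": (256, 256, 3, 3),
    "head.binarize.weight": (1, 64, 2, 2),
}
checkpoint_state = {
    "backbone.conv1.weight": (64, 3, 7, 7),
    "neck.old_fuse.weight": (256,),
    "head.binarize.weight": (1, 32, 2, 2),
}

missing, unexpected, shape_mismatch = compare_state_keys(model_state, checkpoint_state)
print("missing:", missing)
print("unexpected:", unexpected)
print("shape mismatch:", shape_mismatch)
```

Any name reported as missing stays at its freshly initialized value after loading, which could produce all-zero metrics even when saving and loading appear to succeed.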

UserWangZz commented 2 weeks ago

You will need to look into this yourself. Check whether the config used at eval time is wrong.
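The suggested check can be made concrete in two steps: diff the config used for training against the one used for eval, and see which weight file the eval config actually points to. In the posted config, `checkpoints` is `null` while `pretrained_model` points to `/root/autodl-tmp/res_pre.pdparams`. The sketch below assumes a loader that prefers `Global.checkpoints` and falls back to `Global.pretrained_model`. That precedence is an assumption and should be confirmed against the PaddleOCR version in use. The helper names and the eval-side values are hypothetical.

```python
def flatten(cfg, prefix=""):
    """Flatten a nested dict into {'Section.key': value} pairs."""
    flat = {}
    for key, value in cfg.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def diff_configs(train_cfg, eval_cfg):
    """Return {key: (train_value, eval_value)} for every differing key."""
    a, b = flatten(train_cfg), flatten(eval_cfg)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


def weights_source(global_cfg):
    """Assumed precedence: 'checkpoints' first, then 'pretrained_model'."""
    for key in ("checkpoints", "pretrained_model"):
        path = global_cfg.get(key)
        if path:
            return key, path
    return None, None


# Values under "Global" and "Neck" follow the posted config.
train_cfg = {
    "Global": {
        "checkpoints": None,
        "pretrained_model": "/root/autodl-tmp/res_pre.pdparams",
    },
    "Architecture": {"Neck": {"name": "BiFPN", "use_asf": True}},
}
# Hypothetical eval config that differs in one architecture flag.
eval_cfg = {
    "Global": {
        "checkpoints": None,
        "pretrained_model": "/root/autodl-tmp/res_pre.pdparams",
    },
    "Architecture": {"Neck": {"name": "BiFPN", "use_asf": False}},
}

print(diff_configs(train_cfg, eval_cfg))
print(weights_source(eval_cfg["Global"]))
```

If the second line reports `pretrained_model`, the eval run would be scoring the initial pretrained weights, not the trained model. PaddleOCR's tools accept overrides of the form `-o Global.checkpoints=...`, which is one way to point eval at the weights saved under `save_model_dir`.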