Closed morestart closed 1 year ago
Did you make any changes to the config file? This looks like severe overfitting.
@andyjpaddle
Global:
  use_gpu: true
  epoch_num: 600
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 2000]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img: doc/imgs_words_en/word_10.png
  # for data or label process
  character_dict_path: /home/cat/Desktop/PO/dict.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.0005
  regularizer:
    name: 'L2'
    factor: 0
Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0
Loss:
  name: CTCLoss
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/cat/Desktop/PO/icdar2015/recognition/train
    label_file_list: ["/home/cat/Desktop/PO/train.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
    use_shared_memory: False
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/cat/Desktop/PO/icdar2015/recognition/test
    label_file_list: ["/home/cat/Desktop/PO/val.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 100]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 4
    use_shared_memory: False
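Is the dataset ICDAR2015? Is the dictionary the corresponding official one as well? From what I can see, you mainly changed the epoch count, the dataset paths, and the dictionary.
I used the ICDAR2015 dataset provided by WenmuZhou. I only removed the space and one special character.
You could try training with the default 72 epochs; 600 epochs is too many.
At 72 epochs it looks pretty bad: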
[2022/02/11 10:45:10] root INFO: epoch: [72/600], iter: 1210, lr: 0.000500, loss: 16.296036, acc: 0.000000, norm_edit_dis: 0.170458, reader_cost: 0.03949 s, batch_cost: 0.06590 s, samples: 1024, ips: 1553.80111
[2022/02/11 10:45:11] root INFO: epoch: [72/600], iter: 1220, lr: 0.000500, loss: 16.415695, acc: 0.000000, norm_edit_dis: 0.167776, reader_cost: 0.00006 s, batch_cost: 0.04214 s, samples: 2560, ips: 6074.67093
[2022/02/11 10:45:11] root INFO: epoch: [72/600], iter: 1223, lr: 0.000500, loss: 16.225019, acc: 0.000000, norm_edit_dis: 0.169712, reader_cost: 0.00002 s, batch_cost: 0.01245 s, samples: 768, ips: 6169.11002
[2022/02/11 10:45:11] root INFO: save model in ./output/rec/ic15/latest
[2022/02/11 10:45:11] root INFO: save model in ./output/rec/ic15/iter_epoch_72
Have you tried training without making any changes at all?
No, I'll try it. One moment.
With only the dataset paths changed:
[2022/02/11 11:10:41] root INFO: epoch: [72/72], iter: 1210, lr: 0.000500, loss: 15.314871, acc: 0.000000, norm_edit_dis: 0.204160, reader_cost: 0.04008 s, batch_cost: 0.07538 s, samples: 1024, ips: 1358.38837
[2022/02/11 11:10:42] root INFO: epoch: [72/72], iter: 1220, lr: 0.000500, loss: 15.505561, acc: 0.000000, norm_edit_dis: 0.200902, reader_cost: 0.00006 s, batch_cost: 0.06478 s, samples: 2560, ips: 3951.59575
[2022/02/11 11:10:43] root INFO: epoch: [72/72], iter: 1223, lr: 0.000500, loss: 15.115753, acc: 0.000000, norm_edit_dis: 0.204598, reader_cost: 0.00002 s, batch_cost: 0.01803 s, samples: 768, ips: 4260.71118
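To compare runs like the two above without reading raw logs, the training lines can be parsed into numbers. The following is a minimal sketch that assumes the log format is exactly the one shown in this thread; `parse_train_log_line` and `LOG_PATTERN` are hypothetical names for illustration and are not part of PaddleOCR.

```python
import re

# Matches the metric fields of the training log lines quoted in this thread.
# Assumption: plain decimal numbers, as in the lines above.
LOG_PATTERN = re.compile(
    r"epoch: \[(?P<epoch>\d+)/(?P<total>\d+)\], iter: (?P<iter>\d+), "
    r"lr: (?P<lr>[\d.]+), loss: (?P<loss>[\d.]+), acc: (?P<acc>[\d.]+), "
    r"norm_edit_dis: (?P<ned>[\d.]+)"
)


def parse_train_log_line(line):
    """Return the metrics of one training log line, or None if it has none."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "total_epochs": int(match["total"]),
        "iter": int(match["iter"]),
        "lr": float(match["lr"]),
        "loss": float(match["loss"]),
        "acc": float(match["acc"]),
        "norm_edit_dis": float(match["ned"]),
    }


if __name__ == "__main__":
    sample = (
        "[2022/02/11 11:10:43] root INFO: epoch: [72/72], iter: 1223, "
        "lr: 0.000500, loss: 15.115753, acc: 0.000000, "
        "norm_edit_dis: 0.204598, reader_cost: 0.00002 s"
    )
    print(parse_train_log_line(sample))
```

Lines without metrics, such as the "save model in ..." lines, yield `None` and can be skipped.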
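What I meant is: don't change the dataset or the dictionary either, and see what the result is.
The data referenced in this config isn't downloaded automatically, is it? Running it directly fails with an error: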
FileNotFoundError: [Errno 2] No such file or directory: './train_data/ic15_data/rec_gt_train.txt'
INFO 2022-02-11 11:17:49,810 launch_utils.py:341] terminate all the procs
ERROR 2022-02-11 11:17:49,810 launch_utils.py:602] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
Refer to this tutorial: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/doc/doc_ch/recognition.md It describes how to download the dataset.
It isn't downloaded automatically; you need to download it manually. See the tutorial above.
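Since the data has to be fetched by hand, it can help to check the label paths before launching training rather than waiting for the `FileNotFoundError`. This is a minimal sketch, assuming the config has already been loaded into a plain dict with the `Train`/`Eval` layout posted earlier in this thread; `missing_label_files` is a hypothetical helper, not a PaddleOCR function.

```python
import os


def missing_label_files(config):
    """Return (section, path) pairs for label files that do not exist on disk."""
    missing = []
    for section in ("Train", "Eval"):
        dataset = config.get(section, {}).get("dataset", {})
        for path in dataset.get("label_file_list", []):
            if not os.path.isfile(path):
                missing.append((section, path))
    return missing


if __name__ == "__main__":
    # Path taken from the error message above; the dict is hand-written here
    # instead of being parsed from the YAML file.
    config = {
        "Train": {
            "dataset": {
                "label_file_list": ["./train_data/ic15_data/rec_gt_train.txt"]
            }
        }
    }
    for section, path in missing_label_files(config):
        print(f"{section}: label file not found: {path}")
```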
ok
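Did you find the cause? I ran into the same situation before.
No. I switched to the DTRB lmdb dataset. Acc went above 80% before the first epoch had even finished, but generalization is very poor... I don't know why.
I don't really understand it either. Could the dataset be too small?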
(Reply sent by email on Friday, February 11, 2022, 3:55 PM. Subject: Re: [PaddlePaddle/PaddleOCR] Testing on ICDAR2015: training accuracy is high, val is very low (Issue #5443))
I don't think the DTRB dataset is small.
Some evaluation parameters have to match the ones used during training. Could the problem be there?
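One way to act on this suggestion is to diff the `Train` and `Eval` preprocessing pipelines of the config. The sketch below assumes the `transforms` lists have the shape shown in the posted config (a list of single-key mappings); `transforms_to_dict` and `diff_transforms` are hypothetical helpers written for illustration.

```python
def transforms_to_dict(transforms):
    """Flatten [{op_name: params}, ...] into {op_name: params}."""
    flat = {}
    for item in transforms:
        for op_name, params in item.items():
            # Ops written without parameters (e.g. CTCLabelEncode) load as None.
            flat[op_name] = params or {}
    return flat


def diff_transforms(train_transforms, eval_transforms):
    """List the ops that differ between the Train and Eval pipelines."""
    train_ops = transforms_to_dict(train_transforms)
    eval_ops = transforms_to_dict(eval_transforms)
    diffs = []
    for op_name in sorted(set(train_ops) | set(eval_ops)):
        if op_name not in train_ops:
            diffs.append(f"{op_name}: only in Eval")
        elif op_name not in eval_ops:
            diffs.append(f"{op_name}: only in Train")
        elif train_ops[op_name] != eval_ops[op_name]:
            diffs.append(
                f"{op_name}: Train={train_ops[op_name]} Eval={eval_ops[op_name]}"
            )
    return diffs


if __name__ == "__main__":
    # Transforms copied from the config posted in this thread.
    posted = [
        {"DecodeImage": {"img_mode": "BGR", "channel_first": False}},
        {"CTCLabelEncode": None},
        {"RecResizeImg": {"image_shape": [3, 32, 100]}},
        {"KeepKeys": {"keep_keys": ["image", "label", "length"]}},
    ]
    print(diff_transforms(posted, posted))
```

For the config posted here the two pipelines are identical, so this check alone would not explain the gap. Note also that some differences are legitimate, for example augmentation ops that are only meant to run during training.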
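The metric I'm reporting is on val, not test. My email is morestarforyou@gmail.com; we can discuss it there...
@morestart @guoge6 Have you two solved this? I'm running into the same problem now.
No. I ended up writing my own implementation and didn't use Paddle.
I ran into this problem too.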
For the ICDAR2015 dataset, the labels contain uppercase letters, while ic15_dict.txt contains only digits and lowercase letters, so the labels are loaded incorrectly. You can try the following change in ./ppocr/data/imaug/label_ops.py:
"""
class CTCLabelEncode(BaseRecLabelEncode):
    """ Convert between text-label and text-index """

    def __init__(self,
                 max_text_length,
                 character_dict_path=None,
                 use_space_char=False,
                 **kwargs):
        super(CTCLabelEncode, self).__init__(
            max_text_length, character_dict_path, use_space_char)
        # print(kwargs)  # {'lower': True, 'use_gpu': True}
        # self.lower = kwargs['lower']
        # Hard-coded here so that labels are lowercased before encoding.
        self.lower = True
"""
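Before patching the encoder, the mismatch described above can be measured directly: count how many labels contain characters that are missing from the dictionary, with and without lowercasing. This is a minimal sketch with made-up labels and a digits-plus-lowercase character set standing in for the real dictionary; `out_of_dict_stats` is a hypothetical helper, and how PaddleOCR treats such labels internally is not shown here.

```python
import string


def out_of_dict_stats(labels, dict_chars, lower=False):
    """Count labels containing characters absent from the dictionary.

    Returns (number_of_affected_labels, set_of_missing_characters).
    """
    charset = set(dict_chars)
    affected = 0
    missing = set()
    for text in labels:
        if lower:
            text = text.lower()
        unknown = {ch for ch in text if ch not in charset}
        if unknown:
            affected += 1
            missing |= unknown
    return affected, missing


if __name__ == "__main__":
    # Stand-in for a dictionary with only digits and lowercase letters.
    dict_chars = string.digits + string.ascii_lowercase
    # Illustrative labels, not taken from the dataset.
    labels = ["Hello", "world", "EXIT", "42"]

    print(out_of_dict_stats(labels, dict_chars, lower=False))
    print(out_of_dict_stats(labels, dict_chars, lower=True))
```

With these sample labels, two of the four are affected when case is preserved and none once they are lowercased, which is the effect the `self.lower = True` change aims for.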
Nice.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
Train accuracy is very high, but val accuracy is very low. The dataset is ICDAR2015.
Training command:
python3 -m paddle.distributed.launch --log_dir=./debug/ --gpus '0' tools/train.py -c /home/cat/Desktop/PaddleOCR-release-2.4/configs/rec/rec_icdar15_train.yml