rexzhengzhihong closed this issue 8 months ago
Generally 960 with `max` is enough.
With 960 `max`, most of my images are only five or six hundred pixels on a side, and the recognition rate is a bit low.
With 960 `max`, an image is only scaled down when it exceeds 960; anything below 960 should be left alone.
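The scaling rule described above can be sketched in Python. This is a simplified re-implementation of the logic behind PaddleOCR's `DetResizeForTest` / `det_limit_side_len` / `det_limit_type` options; the function name and exact rounding are illustrative, not the library's literal code:

```python
def target_size(h, w, limit_side_len=960, limit_type="max"):
    """Compute the detector input size, mimicking PaddleOCR's
    DetResizeForTest / det_limit_* behaviour (simplified)."""
    if limit_type == "max":
        # "max": shrink only when the LONGER side exceeds the limit
        ratio = limit_side_len / max(h, w) if max(h, w) > limit_side_len else 1.0
    else:
        # "min": enlarge only when the SHORTER side is below the limit
        ratio = limit_side_len / min(h, w) if min(h, w) < limit_side_len else 1.0
    # each side is rounded to a multiple of 32 for the backbone
    resize_h = max(int(round(h * ratio / 32) * 32), 32)
    resize_w = max(int(round(w * ratio / 32) * 32), 32)
    return resize_h, resize_w

# A 693x365 invoice scan is left (nearly) unscaled under max-960,
# while a 4028x3120 photo is shrunk so its long side becomes 960:
print(target_size(365, 693, 960, "max"))   # (352, 704)
print(target_size(3120, 4028, 960, "max")) # (736, 960)
```

This makes the asymmetry in the thread concrete: under `max`, small images pass through almost unchanged, so the mismatch with 960x960 training crops remains.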
Then maybe something is wrong with my training parameters?
Global:
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 2
  save_model_dir: /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/model/det/output/ch_db_mv3/
  save_epoch_step: 1200
  eval_batch_step:
  - 0
  - 20
  cal_metric_during_train: false
  pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./output/det_db/predicts_db.txt
Architecture:
  name: DistillationModel
  algorithm: Distillation
  model_type: det
  Models:
    Student:
      return_all_feats: false
      model_type: det
      algorithm: DB
      Backbone:
        name: ResNet_vd
        in_channels: 3
        layers: 50
      Neck:
        name: LKPAN
        out_channels: 256
      Head:
        name: DBHead
        kernel_list:
        - 7
        - 2
        - 2
        k: 50
      pretrained: ./pretrain_models/ResNet50_vd_ssld_pretrained
    Student2:
      return_all_feats: false
      model_type: det
      algorithm: DB
      Backbone:
        name: ResNet_vd
        in_channels: 3
        layers: 50
      Neck:
        name: LKPAN
        out_channels: 256
      Head:
        name: DBHead
        kernel_list:
        - 7
        - 2
        - 2
        k: 50
      pretrained: ./pretrain_models/ResNet50_vd_ssld_pretrained
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDMLLoss:
      model_name_pairs:
      - - Student
        - Student2
      maps_name: thrink_maps
      weight: 1.0
      key: maps
  - DistillationDBLoss:
      weight: 1.0
      model_name_list:
      - Student
      - Student2
      name: DBLoss
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0
PostProcess:
  name: DistillationDBPostProcess
  model_name:
  - Student
  - Student2
  key: head_out
  thresh: 0.3
  box_thresh: 0.7
  max_candidates: 1000
  unclip_ratio: 1.5
Metric:
  name: DistillationMetric
  base_metric_name: DetMetric
  main_indicator: hmean
  key: Student
Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det
    label_file_list:
    - /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det/train.txt
    ratio_list:
    - 1.0
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - CopyPaste: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 960
        - 960
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 2
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det
    label_file_list:
    - /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest: null
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 2
Which parameter controls the scaling during training?
How much data do you have?
- EastRandomCropData:
    size:
    - 960
    - 960
    max_tries: 50
A few hundred images of accounting receipts.
Adding more data might help.
With 960 `max` set, detection (det) quality is fairly poor. The images being recognized are around 700×400. Where might the problem be?
This usually doesn't need changing.
You could also try a different detector.
It feels like different image sizes should use different parameters, no? Otherwise everything gets scaled to one fixed size.
If many of the inference images are high-resolution samples, you can increase the long-side limit, e.g. det_limit_side_len=2000, det_limit_type='max'.
OK, I'll give it a try.
Closing this issue for now; feel free to reopen it if needed.
> If many of the inference images are high-resolution samples, you can increase the long-side limit, e.g. det_limit_side_len=2000, det_limit_type='max'.
That causes problems for small-resolution images. My small images are 693×365, but during training `EastRandomCropData` uses `size: [960, 960]`, so training apparently runs on enlarged images. When I set det_limit_type='max', an image is only scaled when it is larger than det_limit_side_len; a 693×365 image is smaller than det_limit_side_len, so at inference time text-detection accuracy is rather low. Should this parameter be varied according to the actual size of the inference images, or should I check the image size before inference and enlarge small images first?
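One way to implement the second option asked about above (a sketch, not an official PaddleOCR feature; the helper name, the 736 target, and the 4x cap are my own assumptions) is to compute a per-image enlargement factor before calling the detector:

```python
def pre_upscale_factor(h, w, target_short_side=736, max_factor=4.0):
    """Return the factor by which to enlarge an image whose short side
    is below `target_short_side`, capped to avoid extreme blow-ups.
    Images that are already big enough get factor 1.0 (no resize)."""
    short = min(h, w)
    if short >= target_short_side:
        return 1.0
    return min(target_short_side / short, max_factor)

# Applying it before detection, e.g. with OpenCV (assumed available):
# f = pre_upscale_factor(*img.shape[:2])
# if f > 1.0:
#     img = cv2.resize(img, None, fx=f, fy=f, interpolation=cv2.INTER_CUBIC)
```

This brings a 693×365 scan closer to the 960×960 crops seen in training while leaving medium and large images to the normal `max` limit.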
@tink2123
@rexzhengzhihong About training on images of different resolutions: did you solve it? Did the results improve?
If the images to be recognized vary a lot in size (small 693×365, medium 1565×951, large 4028×3120), how should det_limit_side_len and det_limit_type be set at prediction time?
With det_limit_side_len=736.0 and det_limit_type='min', small images predict fine, but accuracy on large images is low. With det_limit_side_len=960.0 and det_limit_type='max' it is the other way around: accuracy on small images is somewhat low.
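One possible compromise between the two settings (my own sketch; PaddleOCR has no built-in per-image switching, and the 736/1600 thresholds are assumptions to tune on your data) is to choose the limit per image:

```python
def pick_det_limits(h, w, small_thresh=736, large_thresh=1600):
    """Choose (det_limit_side_len, det_limit_type) per image:
    enlarge small scans with 'min', cap very large photos with 'max',
    and leave medium-sized images effectively untouched."""
    if min(h, w) < small_thresh:
        return small_thresh, "min"   # short side is enlarged to 736
    if max(h, w) > large_thresh:
        return large_thresh, "max"   # long side is shrunk to 1600
    return max(h, w), "max"          # limit equals the long side: no resize

print(pick_det_limits(365, 693))    # (736, 'min')
print(pick_det_limits(3120, 4028))  # (1600, 'max')
```

Each image (or each size bucket, to avoid reconfiguring the predictor per image) would then run with its own det_limit_side_len/det_limit_type.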