Open yaphet266 opened 8 months ago
Is there any documentation that explains what each configuration option in this YAML file means?
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 100
  print_batch_step: 20
  use_visualdl: False
  eval_mode: retrieval
  retrieval_feature_from: features # 'backbone' or 'features'
  re_ranking: False
  use_dali: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

AMP:
  scale_loss: 65536
  use_dynamic_loss_scaling: True
  # O1: mixed fp16
  level: O1

# model architecture
Arch:
  name: RecModel
  infer_output_key: features
  infer_add_softmax: False

  Backbone:
    name: PPLCNetV2_base_ShiTu
    pretrained: True
    use_ssld: True
    class_expand: &feat_dim 512
  BackboneStopLayer:
    name: flatten
  Neck:
    name: BNNeck
    num_features: *feat_dim
    weight_attr:
      initializer:
        name: Constant
        value: 1.0
    bias_attr:
      initializer:
        name: Constant
        value: 0.0
      learning_rate: 1.0e-20 # NOTE: Temporarily set lr small enough to freeze the bias to zero
  Head:
    name: FC
    embedding_size: *feat_dim
    class_num: 192612
    weight_attr:
      initializer:
        name: Normal
        std: 0.001
    bias_attr: False

# loss function config for training/eval process
Loss:
  Train:

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.06 # for 8gpu x 256bs
    warmup_epoch: 5
  regularizer:
    name: L2
    coeff: 0.00001

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/
      cls_label_path: ./dataset/train_reg_all_data_v2.txt
      relabel: True
      transform_ops:
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: hwc
    sampler:
      name: PKSampler
      batch_size: 256
      sample_per_id: 4
      drop_last: False
      shuffle: True
      sample_method: "id_avg_prob"
      id_list: [50030, 80700, 92019, 96015] # be careful when set relabel=True
      ratio: [4, 4]
    loader:
      num_workers: 4
      use_shared_memory: True

  Eval:
    Query:
      dataset:
        name: VeriWild
        image_root: ./dataset/Aliproduct/
        cls_label_path: ./dataset/Aliproduct/val_list.txt
        transform_ops:
    Gallery:
      dataset:
        name: VeriWild
        image_root: ./dataset/Aliproduct/
        cls_label_path: ./dataset/Aliproduct/val_list.txt
        transform_ops:

Metric:
  Eval:
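As a rough illustration of what two of the numeric settings above imply, here is a minimal Python sketch. It assumes that `Cosine` with `warmup_epoch: 5` means a linear warmup from 0 followed by cosine decay to 0 over the remaining epochs, and that `PKSampler` fills each batch with `batch_size / sample_per_id` identities. Both function names are hypothetical, and the framework's actual implementation (for example, per-step rather than per-epoch updates) may differ.

```python
import math


def cosine_lr_with_warmup(epoch, base_lr=0.06, warmup_epoch=5, epochs=100):
    """Approximate learning rate at a (possibly fractional) epoch index.

    Assumption: linear warmup from 0 to base_lr, then cosine decay to 0.
    """
    if epoch < warmup_epoch:
        return base_lr * epoch / warmup_epoch
    progress = (epoch - warmup_epoch) / (epochs - warmup_epoch)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def identities_per_batch(batch_size=256, sample_per_id=4):
    """Number of distinct identities (P) when each contributes K samples."""
    return batch_size // sample_per_id


if __name__ == "__main__":
    for e in (0, 2.5, 5, 50, 100):
        print(e, cosine_lr_with_warmup(e))
    print("identities per batch:", identities_per_batch())
```

With the values from the config, each batch of 256 images would cover 64 identities with 4 images each.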
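After changing the dataset paths (image_root: ./dataset/ and cls_label_path: ./dataset/train_reg_all_data_v2.txt), you can try training first. Check whether the loss decreases, how the model finally converges, and what accuracy it reaches, then adjust the learning rate as needed.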
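One common starting point for adjusting the learning rate is the linear scaling heuristic, sketched below. The thread does not prescribe it. The sketch assumes the config comment `# for 8gpu x 256bs` means 8 GPUs with a per-GPU batch size of 256, and the function name `scaled_lr` is hypothetical.

```python
def scaled_lr(num_gpus, batch_size_per_gpu,
              ref_lr=0.06, ref_gpus=8, ref_batch_size=256):
    """Scale the reference learning rate in proportion to the total batch size.

    Assumption: the reference setup is 8 GPUs x 256 images per GPU at lr 0.06.
    """
    total = num_gpus * batch_size_per_gpu
    ref_total = ref_gpus * ref_batch_size
    return ref_lr * total / ref_total


if __name__ == "__main__":
    # Single GPU with the same per-GPU batch size as the reference setup.
    print(scaled_lr(num_gpus=1, batch_size_per_gpu=256))
```

The result is only an initial guess; the loss curve and final accuracy should still decide the value you keep.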
@TingquanGao If I want to change the backbone to ResNet50, how should I modify the config? Is there a YAML file I can use as a reference?
Just change the Arch: section.
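To make the suggestion concrete, the sketch below treats the config as a nested dictionary and replaces one key under `Arch` by a dotted path. The helper `override` is hypothetical, and the thread does not specify which other keys must change. The remaining `Backbone` options and the feature width expected by `Neck` depend on the chosen backbone, so check them against a reference config for that backbone.

```python
import copy


def override(config, dotted_key, value):
    """Return a copy of config with the nested key at dotted_key set to value."""
    new_config = copy.deepcopy(config)
    node = new_config
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node[key]  # raises KeyError if the path does not exist
    node[leaf] = value
    return new_config


# Trimmed-down stand-in for the Arch section shown in the question.
config = {
    "Arch": {
        "name": "RecModel",
        "Backbone": {"name": "PPLCNetV2_base_ShiTu", "pretrained": True},
    }
}

resnet_config = override(config, "Arch.Backbone.name", "ResNet50")

if __name__ == "__main__":
    print(resnet_config["Arch"]["Backbone"])
```

The original `config` is left untouched, so the two variants can be compared side by side.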