Training on my own dataset fails with the following error:
2024-04-16 18:56:22 - DEBUG - Training epoch 0 with 0 samples
File "/home/hyq/anaconda3/envs/cvnets/bin/cvnets-train", line 8, in <module>
sys.exit(main_worker())
File "/home/hyq/文档/ml-cvnets/main_train.py", line 235, in main_worker
main(opts=opts, **kwargs)
File "/home/hyq/anaconda3/envs/cvnets/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/hyq/文档/ml-cvnets/main_train.py", line 174, in main
training_engine.run(train_sampler=train_sampler)
File "/home/hyq/文档/ml-cvnets/engine/training_engine.py", line 606, in run
train_loss, train_ckpt_metric = self.train_epoch(epoch)
File "/home/hyq/文档/ml-cvnets/engine/training_engine.py", line 357, in train_epoch
avg_loss = train_stats.avg_statistics(
File "/home/hyq/文档/ml-cvnets/metrics/stats.py", line 148, in avg_statistics
logger.error(
File "/home/hyq/文档/ml-cvnets/utils/logger.py", line 46, in error
traceback.print_stack()
2024-04-16 18:56:22 - LOGS - Training took 00:00:02.11
2024-04-16 18:56:22 - ERROR - total_loss not present in the dictionary. Available keys are: []. Exiting!!!
Command used for training: cvnets-train --common.config-file /home/hyq/下载/pspnet-mobilevitv2-1.0.yaml --common.results-loc segmentation_results
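The log reports "Training epoch 0 with 0 samples" just before the error, so the missing `total_loss` key is probably a symptom of an empty training set rather than a problem with the loss itself. The sketch below illustrates that failure mode with a hypothetical `RunningStats` class and `run_epoch` helper; these names are assumptions made for illustration and are not the cvnets implementation.

```python
class RunningStats:
    """Hypothetical per-epoch metric tracker (illustration only, not cvnets code)."""

    def __init__(self):
        self.sums = {}   # metric name -> sum weighted by batch size
        self.count = 0   # number of samples seen so far

    def update(self, metrics, batch_size):
        # Accumulate each metric, weighted by the number of samples in the batch.
        for name, value in metrics.items():
            self.sums[name] = self.sums.get(name, 0.0) + value * batch_size
        self.count += batch_size

    def average(self, name):
        # A key only exists if update() was called at least once with it.
        if name not in self.sums:
            raise KeyError(
                f"{name} not present in the dictionary. "
                f"Available keys are: {list(self.sums)}"
            )
        return self.sums[name] / self.count


def run_epoch(batches):
    """Consume (batch_size, loss) pairs and return the average loss."""
    stats = RunningStats()
    for batch_size, loss in batches:
        stats.update({"total_loss": loss}, batch_size)
    return stats.average("total_loss")
```

With at least one batch, `run_epoch` returns an average. With an empty iterable, `update` is never called, the dictionary stays empty, and the lookup fails with a message of the same shape as the one in the log.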
pspnet-mobilevitv2-1.0.yaml:

common:
  run_label: "run_1"
  accum_freq: 1
  accum_after_epoch: -1
  log_freq: 200
  auto_resume: false
  mixed_precision: true
  grad_clip: 10.0
dataset:
  root_train: "/media/hyq/西部数据2TB/ml-cvnets_data/"
  root_val: "/media/hyq/西部数据2TB/ml-cvnets_data/"
  name: "ade20k1"
  category: "segmentation"
  train_batch_size0: 4 # effective batch size is 16 ( 4 * 4 GPUs)
  val_batch_size0: 4
  eval_batch_size0: 1
  workers: 4
  persistent_workers: false
  pin_memory: false
image_augmentation:
  random_crop:
    enable: true
    seg_class_max_ratio: 0.75
    pad_if_needed: true
    mask_fill: 0 # background idx is 0
  random_horizontal_flip:
    enable: true
  resize:
    enable: true
    size: [512, 512]
    interpolation: "bicubic"
  random_short_size_resize:
    enable: true
    interpolation: "bicubic"
    short_side_min: 256
    short_side_max: 768
    max_img_dim: 1024
  photo_metric_distort:
    enable: true
  random_rotate:
    enable: true
    angle: 10
    mask_fill: 0 # background idx is 0
  random_gaussian_noise:
    enable: true
sampler:
  name: "batch_sampler"
  bs:
    crop_size_width: 512
    crop_size_height: 512
loss:
  category: "segmentation"
  ignore_idx: -1
  segmentation:
    name: "cross_entropy"
    cross_entropy:
      aux_weight: 0.4
optim:
  name: "sgd"
  weight_decay: 1.e-4
  no_decay_bn_filter_bias: true
  sgd:
    momentum: 0.9
scheduler:
  name: "cosine"
  is_iteration_based: false
  max_epochs: 120
  cosine:
    max_lr: 0.02
    min_lr: 0.0002
model:
  segmentation:
    name: "encoder_decoder"
    lr_multiplier: 1
    seg_head: "pspnet"
    output_stride: 8
    use_aux_head: true
    activation:
      name: "relu"
    pspnet:
      psp_dropout: 0.1
      psp_out_channels: 512
      psp_pool_sizes: [ 1, 2, 3, 6 ]
  classification:
    name: "mobilevit_v2"
    mitv2:
      width_multiplier: 1.0
      attn_norm_layer: "layer_norm_2d"
    activation:
      name: "swish"
  normalization:
    name: "sync_batch_norm"
    momentum: 0.1
  activation:
    name: "swish"
    inplace: false
  layer:
    global_pool: "mean"
    conv_init: "kaiming_uniform"
    linear_init: "normal"
ema:
  enable: true
  momentum: 0.0005
stats:
  val: [ "loss", "iou" ]
  train: [ "loss", "grad_norm" ]
  checkpoint_metric: "iou"
  checkpoint_metric_max: true
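Since the loader found 0 samples, one thing worth checking is whether `root_train` actually contains image/mask pairs where the dataset class looks for them. Below is a minimal sketch of such a check. The `count_pairs` helper is hypothetical, and the `images/training` and `annotations/training` subfolders are an assumption based on the usual ADE20K layout; the layout expected by the `ade20k1` dataset may differ.

```python
from pathlib import Path

# File extensions treated as input images (an assumption for this sketch).
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def count_pairs(image_dir, mask_dir):
    """Return how many images have a same-named .png mask in mask_dir."""
    image_dir, mask_dir = Path(image_dir), Path(mask_dir)
    if not image_dir.is_dir() or not mask_dir.is_dir():
        return 0  # a missing folder means no usable samples
    mask_stems = {p.stem for p in mask_dir.iterdir() if p.suffix.lower() == ".png"}
    return sum(
        1
        for p in image_dir.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.stem in mask_stems
    )


if __name__ == "__main__":
    # Path taken from dataset.root_train; the subfolder names are assumed.
    root = Path("/media/hyq/西部数据2TB/ml-cvnets_data/")
    n = count_pairs(root / "images" / "training", root / "annotations" / "training")
    print(f"training pairs found: {n}")
```

If this prints 0, the folder names or file extensions under `root_train` likely do not match what the loader expects, which would be consistent with the empty epoch in the log.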