Closed: wcpcp closed this issue 1 month ago
2024-05-22 03:30:39,391 [INFO] parse_args:
2024-05-22 03:30:39,391 [INFO] task segment
2024-05-22 03:30:39,391 [INFO] device_target GPU
2024-05-22 03:30:39,391 [INFO] save_dir ./runs/2024.05.22-03.30.39
2024-05-22 03:30:39,391 [INFO] log_level INFO
2024-05-22 03:30:39,391 [INFO] is_parallel False
2024-05-22 03:30:39,391 [INFO] ms_mode 0
2024-05-22 03:30:39,391 [INFO] ms_amp_level O0
2024-05-22 03:30:39,391 [INFO] keep_loss_fp32 True
2024-05-22 03:30:39,391 [INFO] ms_loss_scaler static
2024-05-22 03:30:39,391 [INFO] ms_loss_scaler_value 1024.0
2024-05-22 03:30:39,391 [INFO] ms_jit True
2024-05-22 03:30:39,391 [INFO] ms_enable_graph_kernel False
2024-05-22 03:30:39,391 [INFO] ms_datasink False
2024-05-22 03:30:39,391 [INFO] overflow_still_update True
2024-05-22 03:30:39,391 [INFO] clip_grad True
2024-05-22 03:30:39,391 [INFO] clip_grad_value 10.0
2024-05-22 03:30:39,391 [INFO] ema True
2024-05-22 03:30:39,391 [INFO] weight
2024-05-22 03:30:39,391 [INFO] ema_weight
2024-05-22 03:30:39,391 [INFO] freeze []
2024-05-22 03:30:39,391 [INFO] epochs 300
2024-05-22 03:30:39,391 [INFO] per_batch_size 16
2024-05-22 03:30:39,391 [INFO] img_size 640
2024-05-22 03:30:39,391 [INFO] nbs 64
2024-05-22 03:30:39,391 [INFO] accumulate 1
2024-05-22 03:30:39,391 [INFO] auto_accumulate False
2024-05-22 03:30:39,391 [INFO] log_interval 100
2024-05-22 03:30:39,391 [INFO] single_cls False
2024-05-22 03:30:39,391 [INFO] sync_bn False
2024-05-22 03:30:39,391 [INFO] keep_checkpoint_max 100
2024-05-22 03:30:39,391 [INFO] run_eval False
2024-05-22 03:30:39,391 [INFO] conf_thres 0.001
2024-05-22 03:30:39,391 [INFO] iou_thres 0.7
2024-05-22 03:30:39,391 [INFO] conf_free True
2024-05-22 03:30:39,391 [INFO] rect False
2024-05-22 03:30:39,391 [INFO] nms_time_limit 20.0
2024-05-22 03:30:39,391 [INFO] recompute True
2024-05-22 03:30:39,391 [INFO] recompute_layers 2
2024-05-22 03:30:39,391 [INFO] seed 2
2024-05-22 03:30:39,391 [INFO] summary True
2024-05-22 03:30:39,391 [INFO] profiler False
2024-05-22 03:30:39,391 [INFO] profiler_step_num 1
2024-05-22 03:30:39,391 [INFO] opencv_threads_num 0
2024-05-22 03:30:39,391 [INFO] strict_load True
2024-05-22 03:30:39,391 [INFO] enable_modelarts False
2024-05-22 03:30:39,391 [INFO] data_url
2024-05-22 03:30:39,391 [INFO] ckpt_url
2024-05-22 03:30:39,391 [INFO] multi_data_url
2024-05-22 03:30:39,391 [INFO] pretrain_url
2024-05-22 03:30:39,391 [INFO] train_url
2024-05-22 03:30:39,391 [INFO] data_dir /split
2024-05-22 03:30:39,391 [INFO] ckpt_dir /cache/pretrain_ckpt/
2024-05-22 03:30:39,391 [INFO] data.dataset_name coco
2024-05-22 03:30:39,391 [INFO] data.train_set /split/images/train
2024-05-22 03:30:39,391 [INFO] data.val_set /split/images/val
2024-05-22 03:30:39,391 [INFO] data.test_set /split/images/test
2024-05-22 03:30:39,391 [INFO] data.nc 1
2024-05-22 03:30:39,391 [INFO] data.names ['dm']
2024-05-22 03:30:39,391 [INFO] train_transforms.stage_epochs [300]
2024-05-22 03:30:39,391 [INFO] train_transforms.trans_list [[{'func_name': 'resample_segments'}, {'func_name': 'letterbox', 'scaleup': True}, {'func_name': 'hsv_augment', 'prob': 1.0, 'hgain': 0.015, 'sgain': 0.7, 'vgain': 0.4}, {'func_name': 'fliplr', 'prob': 0.5}, {'func_name': 'segment_poly2mask', 'mask_overlap': True, 'mask_ratio': 4}, {'func_name': 'labelnorm', 'xyxy2xywh': True}, {'func_name': 'label_pad', 'padding_size': 160, 'padding_value': -1}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}]]
2024-05-22 03:30:39,391 [INFO] data.test_transforms [{'func_name': 'letterbox', 'scaleup': False}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}]
2024-05-22 03:30:39,391 [INFO] data.num_parallel_workers 4
2024-05-22 03:30:39,391 [INFO] network.model_name yolov8
2024-05-22 03:30:39,391 [INFO] network.nc 1
2024-05-22 03:30:39,391 [INFO] network.reg_max 16
2024-05-22 03:30:39,391 [INFO] network.stride [8, 16, 32]
2024-05-22 03:30:39,391 [INFO] network.backbone [[-1, 1, 'ConvNormAct', [64, 3, 2]], [-1, 1, 'ConvNormAct', [128, 3, 2]], [-1, 3, 'C2f', [128, True]], [-1, 1, 'ConvNormAct', [256, 3, 2]], [-1, 6, 'C2f', [256, True]], [-1, 1, 'ConvNormAct', [512, 3, 2]], [-1, 6, 'C2f', [512, True]], [-1, 1, 'ConvNormAct', [1024, 3, 2]], [-1, 3, 'C2f', [1024, True]], [-1, 1, 'SPPF', [1024, 5]]]
2024-05-22 03:30:39,391 [INFO] network.head [[-1, 1, 'Upsample', ['None', 2, 'nearest']], [[-1, 6], 1, 'Concat', [1]], [-1, 3, 'C2f', [512]], [-1, 1, 'Upsample', ['None', 2, 'nearest']], [[-1, 4], 1, 'Concat', [1]], [-1, 3, 'C2f', [256]], [-1, 1, 'ConvNormAct', [256, 3, 2]], [[-1, 12], 1, 'Concat', [1]], [-1, 3, 'C2f', [512]], [-1, 1, 'ConvNormAct', [512, 3, 2]], [[-1, 9], 1, 'Concat', [1]], [-1, 3, 'C2f', [1024]], [[15, 18, 21], 1, 'YOLOv8SegHead', ['nc', 'reg_max', 32, 256, 'stride']]]
2024-05-22 03:30:39,391 [INFO] network.depth_multiple 1.0
2024-05-22 03:30:39,391 [INFO] network.width_multiple 1.25
2024-05-22 03:30:39,391 [INFO] network.max_channels 512
2024-05-22 03:30:39,391 [INFO] optimizer.optimizer momentum
2024-05-22 03:30:39,391 [INFO] optimizer.lr_init 0.01
2024-05-22 03:30:39,391 [INFO] optimizer.momentum 0.937
2024-05-22 03:30:39,391 [INFO] optimizer.nesterov True
2024-05-22 03:30:39,391 [INFO] optimizer.loss_scale 1.0
2024-05-22 03:30:39,391 [INFO] optimizer.warmup_epochs 3
2024-05-22 03:30:39,391 [INFO] optimizer.warmup_momentum 0.8
2024-05-22 03:30:39,391 [INFO] optimizer.warmup_bias_lr 0.1
2024-05-22 03:30:39,391 [INFO] optimizer.min_warmup_step 1000
2024-05-22 03:30:39,391 [INFO] optimizer.group_param yolov8
2024-05-22 03:30:39,391 [INFO] optimizer.gp_weight_decay 0.0010078125
2024-05-22 03:30:39,391 [INFO] optimizer.start_factor 1.0
2024-05-22 03:30:39,391 [INFO] optimizer.end_factor 0.01
2024-05-22 03:30:39,391 [INFO] optimizer.epochs 300
2024-05-22 03:30:39,391 [INFO] optimizer.nbs 64
2024-05-22 03:30:39,391 [INFO] optimizer.accumulate 1
2024-05-22 03:30:39,391 [INFO] optimizer.total_batch_size 16
2024-05-22 03:30:39,391 [INFO] loss.name YOLOv8SegLoss
2024-05-22 03:30:39,391 [INFO] loss.box 7.5
2024-05-22 03:30:39,391 [INFO] loss.cls 0.5
2024-05-22 03:30:39,391 [INFO] loss.dfl 1.5
2024-05-22 03:30:39,391 [INFO] loss.reg_max 16
2024-05-22 03:30:39,391 [INFO] loss.nm 32
2024-05-22 03:30:39,391 [INFO] loss.overlap True
2024-05-22 03:30:39,391 [INFO] loss.max_object_num 600
2024-05-22 03:30:39,391 [INFO] config ./configs/yolov8/seg/yolov8x-seg.yaml
2024-05-22 03:30:39,391 [INFO] rank 0
2024-05-22 03:30:39,391 [INFO] rank_size 1
2024-05-22 03:30:39,391 [INFO] total_batch_size 16
2024-05-22 03:30:39,391 [INFO] callback []
2024-05-22 03:30:39,391 [INFO]
2024-05-22 03:30:39,393 [INFO] Please check the above information for the configurations
2024-05-22 03:30:39,832 [WARNING] Parse Model, args: nearest, keep str type
2024-05-22 03:30:39,893 [WARNING] Parse Model, args: nearest, keep str type
2024-05-22 03:30:40,340 [INFO] number of network params, total: 71.812198M, trainable: 71.751795M
2024-05-22 03:30:40,348 [INFO] Turn on recompute, and the results of the first 2 layers will be recomputed.
2024-05-22 03:30:45,855 [WARNING] Parse Model, args: nearest, keep str type
2024-05-22 03:30:45,946 [WARNING] Parse Model, args: nearest, keep str type
2024-05-22 03:30:46,698 [INFO] number of network params, total: 71.812198M, trainable: 71.751795M
2024-05-22 03:30:46,706 [INFO] Turn on recompute, and the results of the first 2 layers will be recomputed.
2024-05-22 03:30:52,107 [INFO] ema_weight not exist, default pretrain weight is currently used.
2024-05-22 03:30:52,414 [INFO] Dataset Cache file hash/version check success.
2024-05-22 03:30:52,414 [INFO] Load dataset cache from [/labels/train.cache.npy] success.
Scanning '/split/labels/train.cache.npy' images and labels... 1148 found, 0 missing, 1 empty, 0 corrup
2024-05-22 03:30:52,422 [INFO] Dataloader num parallel workers: [4]
2024-05-22 03:30:54,685 [INFO] Registry(name=callback, total=4)
2024-05-22 03:30:54,685 [INFO] (0): YoloxSwitchTrain in mindyolo/utils/callback.py
2024-05-22 03:30:54,685 [INFO] (1): EvalWhileTrain in mindyolo/utils/callback.py
2024-05-22 03:30:54,685 [INFO] (2): SummaryCallback in mindyolo/utils/callback.py
2024-05-22 03:30:54,685 [INFO] (3): ProfilerCallback in mindyolo/utils/callback.py
2024-05-22 03:30:54,685 [INFO]
2024-05-22 03:30:55,212 [INFO] got 1 active callback as follows:
2024-05-22 03:30:55,214 [INFO] SummaryCallback()
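
When comparing a run like the one logged above against a run on another device, it can help to turn the `parse_args` dump into a dictionary and diff it. The following is only a minimal sketch that assumes the log layout shown above (timestamp, `[INFO]`, key, optional value); `parse_config_dump` and `diff_configs` are hypothetical helper names, not part of MindYOLO.

```python
import re

# Matches lines such as "2024-05-22 03:30:39,391 [INFO] epochs 300".
# Assumption: every config line is "<date> <time> [INFO] <key> [<value>]".
LINE_RE = re.compile(r"^\S+ \S+ \[INFO\] (\S+)(?: (.*))?$")


def parse_config_dump(log_text):
    """Collect key/value pairs logged between 'parse_args:' and 'Please check ...'."""
    config = {}
    in_dump = False
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips blank [INFO] lines and non-log lines
        key, value = match.group(1), match.group(2) or ""
        if key == "parse_args:":
            in_dump = True
            continue
        if in_dump and key == "Please":
            break  # end of the configuration dump
        if in_dump:
            config[key] = value  # values are kept as raw strings
    return config


def diff_configs(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


sample = """2024-05-22 03:30:39,391 [INFO] parse_args:
2024-05-22 03:30:39,391 [INFO] device_target GPU
2024-05-22 03:30:39,391 [INFO] ms_amp_level O0
2024-05-22 03:30:39,391 [INFO] weight
2024-05-22 03:30:39,393 [INFO] Please check the above information for the configurations
"""
gpu_cfg = parse_config_dump(sample)
# Hypothetical second run that differs only in the device, for illustration.
other_cfg = dict(gpu_cfg, device_target="Ascend")
print(diff_configs(gpu_cfg, other_cfg))
```

Values are left as strings on purpose, so list-valued entries such as `network.backbone` are compared verbatim rather than re-parsed.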
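Thank you for the feedback. mindyolo has so far only been validated on Ascend and has not yet been validated on GPU/CPU. Once it has been, we will post an update in this issue.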
I am using a Huawei 910B. After switching from coco to my own dataset, I ran into the same problem.
@Hucley Thanks for the feedback. If possible, please provide your MindSpore version and MindYolo version.
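
One way to gather version information of this kind is to query the installed package metadata. This is a rough sketch using only the Python standard library; the distribution names `mindspore`, `mindspore-gpu`, and `mindyolo` are assumptions about how the packages were installed, and `collect_versions` is a hypothetical helper. A MindYolo checkout that was cloned but never pip-installed will show as "not installed", in which case the git commit hash is the more useful detail to report.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages=("mindspore", "mindspore-gpu", "mindyolo")):
    """Return a dict of installed versions for the given distribution names."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = version(name)
        except PackageNotFoundError:
            # The distribution name is a guess, or the package is not pip-installed.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for name, ver in collect_versions().items():
        print(f"{name}: {ver}")
```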
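I am training yolov8-seg on GPU with mindspore2.0, and the loss does not converge. I have checked that the inputs and labels are normal, but the loss still does not converge.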