williamhyin opened this issue 3 years ago
Hmm... it is just because modifying the dataloader code to change the data augmentation policy during training would require additional effort. If you prefer, you can simply stop the second training step once the 424th epoch finishes training.
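A minimal sketch of the suggested shortcut, assuming a simplified training loop (this is not the repository's actual `train.py`; `run_epoch` is a hypothetical stand-in for one epoch of training over the dataloader):

```python
STOP_AFTER_EPOCH = 424  # stop the second training step once epoch 424 finishes


def train(total_epochs, stop_after=STOP_AFTER_EPOCH, run_epoch=lambda e: None):
    """Run epochs 0..total_epochs-1, but break early after `stop_after`.

    The checkpoint saved at the break point would correspond to
    something like epoch_424.pt in the schedule above.
    """
    completed = []
    for epoch in range(total_epochs):
        run_epoch(epoch)           # one full pass over the dataloader
        completed.append(epoch)
        if epoch == stop_after:    # early exit instead of editing the dataloader
            break
    return completed
```

With `--epochs 450`, calling `train(450)` would run epochs 0 through 424 (425 epochs) and then stop, avoiding the separate third launch.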
Thanks, maybe a training tutorial would be more suitable for understanding your idea!
Hi,
I am confused about your training schedule logic in branch "paper".
```shell
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg models/yolor-p6.yaml --weights '' --sync-bn --device 0,1,2,3,4,5,6,7 --name yolor-p6 --hyp hyp.scratch.1280.yaml --epochs 300
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 tune.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg models/yolor-p6.yaml --weights 'runs/train/yolor-p6/weights/last_298.pt' --sync-bn --device 0,1,2,3,4,5,6,7 --name yolor-p6-tune --hyp hyp.finetune.1280.yaml --epochs 450
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg models/yolor-p6.yaml --weights 'runs/train/yolor-p6-tune/weights/epoch_424.pt' --sync-bn --device 0,1,2,3,4,5,6,7 --name yolor-p6-fine --hyp hyp.finetune.1280.yaml --epochs 450
```
In the third step, why did you choose epoch_424 and train only until epoch 450 (i.e., only 26 remaining epochs)? Why not choose the best fine-tuning epoch?