PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with an awesome pre-trained model zoo, supporting a wide range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0
8.69k stars, 1.68k forks

Training exits immediately during evaluation #3678

Closed seastronger closed 1 day ago

seastronger commented 7 months ago

Search before asking

Describe the Bug

2024-03-26 08:18:24 [INFO] [TRAIN] epoch: 1, iter: 10/100, loss: 1.6568, lr: 0.009186, batch_cost: 0.2519, reader_cost: 0.00900, ips: 15.8788 samples/sec | ETA 00:00:22
2024-03-26 08:18:25 [INFO] [TRAIN] epoch: 1, iter: 20/100, loss: 0.5181, lr: 0.008272, batch_cost: 0.1580, reader_cost: 0.03392, ips: 25.3199 samples/sec | ETA 00:00:12
2024-03-26 08:18:27 [INFO] [TRAIN] epoch: 1, iter: 30/100, loss: 0.2872, lr: 0.007347, batch_cost: 0.1593, reader_cost: 0.03512, ips: 25.1148 samples/sec | ETA 00:00:11
2024-03-26 08:18:29 [INFO] [TRAIN] epoch: 1, iter: 40/100, loss: 0.1943, lr: 0.006409, batch_cost: 0.1571, reader_cost: 0.03307, ips: 25.4556 samples/sec | ETA 00:00:09
2024-03-26 08:18:30 [INFO] [TRAIN] epoch: 1, iter: 50/100, loss: 0.2356, lr: 0.005455, batch_cost: 0.1564, reader_cost: 0.03290, ips: 25.5746 samples/sec | ETA 00:00:07
2024-03-26 08:18:30 [INFO] Start evaluating (total_samples: 76, total_iters: 76)...
76/76 [==============================] - 3s 34ms/step - batch_cost: 0.0336 - reader cost: 3.1583e-04

I am training for 100 iterations and saving every 50 iterations. Once training reached iteration 50 and the evaluation ran, the process simply exited.

(paddle36) D:\BaiduNetdiskDownload\code\PaddleSeg-release-2.8>
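For anyone comparing their own run against this report, a minimal sketch of checking a captured log for an early stop: it reads the `iter: N/TOTAL` field of the last `[TRAIN]` line and flags the run if N is below TOTAL. The function names `last_progress` and `stopped_early` are hypothetical helpers, not part of PaddleSeg, and the regular expression only assumes the log format shown in this issue.

```python
import re

# Matches the "iter: 50/100" field of a [TRAIN] log line in the format
# shown in this issue. This pattern is an assumption based on that output.
ITER_RE = re.compile(r"\[TRAIN\].*?iter:\s*(\d+)/(\d+)")


def last_progress(log_text):
    """Return (last_iter, total_iters) from the final [TRAIN] entry, or None."""
    matches = ITER_RE.findall(log_text)
    if not matches:
        return None
    current, total = matches[-1]
    return int(current), int(total)


def stopped_early(log_text):
    """True if the last logged iteration is below the configured total."""
    progress = last_progress(log_text)
    if progress is None:
        return False  # no [TRAIN] lines found, nothing to conclude
    current, total = progress
    return current < total


# Log lines copied from the report above.
sample_log = """\
2024-03-26 08:18:29 [INFO] [TRAIN] epoch: 1, iter: 40/100, loss: 0.1943, lr: 0.006409, batch_cost: 0.1571, reader_cost: 0.03307, ips: 25.4556 samples/sec | ETA 00:00:09
2024-03-26 08:18:30 [INFO] [TRAIN] epoch: 1, iter: 50/100, loss: 0.2356, lr: 0.005455, batch_cost: 0.1564, reader_cost: 0.03290, ips: 25.5746 samples/sec | ETA 00:00:07
2024-03-26 08:18:30 [INFO] Start evaluating (total_samples: 76, total_iters: 76)...
"""

print(last_progress(sample_log))
print(stopped_early(sample_log))
```

This only confirms the symptom (the run ended at iteration 50 of 100); it says nothing about why the process exited.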

Environment

python tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 50 --do_eval --use_vdl --save_dir output

This is my training command. I am using the example and dataset from the official website.
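Because the process returns to the prompt without printing an error, one possible diagnostic step is to launch the same command from a small wrapper and inspect its exit code. The sketch below is an assumption about how one might investigate, not a fix confirmed in this thread; `run_and_report` and `describe_exit` are hypothetical helper names, and `TRAIN_CMD` simply repeats the command from the report.

```python
import subprocess
import sys


def describe_exit(code):
    """Format an exit code in decimal and as a 32-bit hex value."""
    return f"exit code {code} (0x{code & 0xFFFFFFFF:08X})"


def run_and_report(cmd):
    """Run cmd, let it print to the console as usual, and return its exit code."""
    completed = subprocess.run(cmd)
    code = completed.returncode
    if code == 0:
        print("process finished normally:", describe_exit(code))
    else:
        # A non-zero code means the process did not end cleanly. On Windows,
        # a value such as 0xC0000005 (access violation) indicates a crash
        # in native code rather than a Python exception.
        print("process ended abnormally:", describe_exit(code))
    return code


# The training command from the report, run with the current interpreter.
TRAIN_CMD = [
    sys.executable, "tools/train.py",
    "--config", "configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml",
    "--save_interval", "50",
    "--do_eval", "--use_vdl",
    "--save_dir", "output",
]

# From the PaddleSeg checkout directory, one would call:
# exit_code = run_and_report(TRAIN_CMD)
```

An exit code of 0 would suggest the script believed it had finished, while a non-zero code would point toward a crash during or right after evaluation.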

------------Environment Information-------------
platform: Windows-10-10.0.19041-SP0
Python: 3.6.13 |Anaconda, Inc.| (default, Mar 16 2021, 11:37:27) [MSC v.1916 64 bit (AMD64)]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29373293_0
cudnn: 8.1
GPUs used: 1
CUDA_VISIBLE_DEVICES: 0
GPU: ['GPU 0: NVIDIA GeForce']
PaddleSeg: 2.8.0
PaddlePaddle: 2.4.2
OpenCV: 4.5.5

This is my environment.

Bug description confirmation

Are you willing to submit a PR?

TachibanaYoshino commented 3 months ago

This problem appears to occur when running on Windows. It may not happen on Linux.

wjlhahaha commented 1 month ago

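Help, I have the same problem: training ends before it has finished and never reaches the next epoch. I am not using any early-stopping function either. Have you solved this problem?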

TingquanGao commented 1 day ago

Thanks for this issue. As it has been inactive for a long time, we will close it. If you have any questions, please feel free to reopen it or open a new issue, and we will follow up and resolve it.