ultralytics / ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Question about the resume parameter #2329

Status: Closed (7288Fzq closed 1 year ago)

7288Fzq commented 1 year ago

Question

Hello developers! I have a question about the resume parameter. Here is my scenario: I trained on 1000 photos for 100 epochs, the training finished successfully, and I got the best.pt and last.pt files. If I want to add new data to the current dataset, can I simply add the new images and labels to it and continue training with the resume parameter, or do I have to retrain on the combined dataset? Looking forward to your reply!

github-actions[bot] commented 1 year ago

👋 Hello @Fzq15915707288, thank you for your interest in YOLOv8 🚀! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Install

Pip install the ultralytics package including all requirements in a Python>=3.7 environment with PyTorch>=1.7.

pip install ultralytics

7288Fzq commented 1 year ago

My current approach is to overwrite the whole dataset and train only on the new data. I just learned from other issues that this makes the model forget the previous training results. So I want to know how to add new data and train on top of the existing results: do I need to use the resume parameter, and do I need to use the last.pt from the completed training for the new training?

7288Fzq commented 1 year ago

I also have a question about the difference between resume and pre-trained weights. If I got last.pt from a previous training on 500 photos, can I discard those 500 old photos, use only the new photos as the dataset for the next training, and use last.pt as the pre-trained weights to start a new round of training?

7288Fzq commented 1 year ago

I think my problem comes down to one question: how to add new data. There are two situations: 1. The labels have not changed, and only new photos have been added. 2. The labels have changed. In my opinion, the only solution in the second situation is retraining. I would like to know whether this understanding is correct, and what to do in the first situation.

glenn-jocher commented 1 year ago

Hi @Fzq15915707288, thank you for reaching out to us 🚀!

To add new data without changing the training labels, you can use the resume parameter with a last.pt file generated from the previous training. Set the resume parameter to the path of the last.pt file and add your new images to the training dataset. When you start training, the model will continue learning from where it left off and include the new data.
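As a rough illustration of the two ways to continue from last.pt, the sketch below separates resuming an interrupted run from starting a new run that uses last.pt as its initial weights. It assumes the Ultralytics Python API (`YOLO(...).train(...)`); the helper names `build_train_kwargs` and `continue_training`, the weights path, and `data.yaml` are hypothetical placeholders. Resuming is meant for runs that stopped early, so a run that already completed all of its epochs may have nothing left to resume; in that case a new run started from last.pt on the combined dataset is the likely alternative.

```python
def build_train_kwargs(data_yaml, epochs, interrupted):
    """Pick the arguments for train() (hypothetical helper, not part of Ultralytics).

    interrupted=True  -> continue a run that stopped before its last epoch
    interrupted=False -> start a new run, using the checkpoint only as initial weights
    """
    if interrupted:
        # Resuming reuses the settings stored with the checkpoint,
        # so no dataset or epoch count is passed here.
        return {"resume": True}
    # New run: point at the dataset YAML that now includes the added images.
    return {"data": str(data_yaml), "epochs": int(epochs)}


def continue_training(weights, data_yaml, epochs=50, interrupted=False):
    """Load a checkpoint and train further (requires the ultralytics package)."""
    # Imported lazily so build_train_kwargs can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # e.g. a last.pt produced by the earlier training
    return model.train(**build_train_kwargs(data_yaml, epochs, interrupted))
```

A call such as `continue_training("path/to/last.pt", "data.yaml", epochs=50)` would then start a new run from the earlier weights; both paths are placeholders to replace with your own.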

However, if the labels have changed, it is generally best to retrain the YOLOv8 model from scratch. Mixing data annotated under different label definitions can confuse the model and lower accuracy, because the classes it learned earlier no longer match the new annotations.
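One quick way to tell which situation applies is to scan the new label files for class IDs the earlier model never saw. The sketch below assumes YOLO-format label files (one object per line, starting with an integer class ID) and that you know the original number of classes; the function names are hypothetical. It only detects new class IDs, not classes that were renamed or reordered.

```python
from pathlib import Path


def class_ids_in_labels(label_dir):
    """Collect every class ID that appears in the .txt label files of a folder."""
    ids = set()
    for label_file in Path(label_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                ids.add(int(fields[0]))  # first field is the class ID
    return ids


def new_labels_fit_existing_classes(label_dir, num_classes):
    """Return True if every class ID lies in the range 0 .. num_classes - 1."""
    return all(0 <= class_id < num_classes
               for class_id in class_ids_in_labels(label_dir))
```

If the check returns False, the new data introduces classes the previous model was not trained on, which corresponds to the "labels have changed" case.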

In summary, if the training labels have not changed, you can use the resume parameter to add new data. If the labels have changed, it is recommended to retrain the model from the beginning to obtain the best result.

I hope this helps. Let me know if you have any other questions.

7288Fzq commented 1 year ago

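Hi @Fzq15915707288, thank you for contacting us 🚀!

To add new data without changing the training labels, you can use the resume parameter together with the last.pt file generated by the previous training. Set the resume parameter to the path of the last.pt file and add the new images to the training dataset. When you start training, the model will continue learning from where it stopped and include the new data.

However, if the labels have changed, it is usually best to retrain the YOLOv8 model from scratch. Because of the semantic gap between classes, adding new data with different labels may cause confusion and reduce accuracy.

In summary, if the training labels have not changed, you can use the resume parameter to add new data. If the labels have changed, it is recommended to retrain the model from scratch for the best results.

I hope this helps. If you have any other questions, please let me know.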

I understand! Thank you so much!

glenn-jocher commented 1 year ago

You're welcome, @7288Fzq! We're glad to help 🚀. To summarize, if you want to add new data without changing the training labels, you can use the resume parameter with the last.pt file. Otherwise, if the training labels have changed, it is generally best to retrain the model from scratch to avoid confusion and lower accuracy. Let us know if you have any further questions.