ultralytics / ultralytics

Ultralytics YOLO11 🚀
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Has anyone tried training a YOLO segmentation model on the Stanford2D3DS dataset? #10027

Closed Arno-Tu closed 4 months ago

Arno-Tu commented 5 months ago

Search before asking

Question

I'm currently training a segmentation model with YOLOv8 on a self-converted Stanford2D3DS dataset, but I'm running into some issues. Specifically, the seg_loss value stays around 3 throughout training, and its downward trend is very slow. I also tried training on a small subset of the data, but the results on the dataset differ greatly from the results on indoor-scene photos I took myself. If anyone can offer any inspiration, I would be very grateful.

Additional

No response

github-actions[bot] commented 5 months ago

👋 Hello @Arno-Tu, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a ๐Ÿ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 5 months ago

Hey there! 🌟 It's great that you're exploring the use of YOLO segment models with the Stanford2D3DS Dataset. Regarding the seg_loss issue you're facing, it might be related to a few factors such as the learning rate, data representation, or the complexity of your dataset vs. the training data.

Assuming you've already tried adjusting batch sizes, another area to look into could be the learning rate. Sometimes starting with a slightly higher learning rate and then reducing it as training progresses can help. For example:

# Example of adjusting the learning rate schedule
from ultralytics import YOLO

model = YOLO('yolov8n-seg.pt')  # or your own checkpoint
model.train(
    data='your_dataset.yaml',
    epochs=100,
    lr0=0.01,  # initial learning rate
    lrf=0.01,  # final learning rate as a fraction of lr0
)
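For intuition, the decay can be sketched numerically. The function below mirrors the linear lr0 → lr0 × lrf decay that recent ultralytics releases apply by default; this is a simplified sketch, and the exact schedule depends on your version and settings:

```python
def linear_lr(epoch: int, epochs: int, lr0: float = 0.01, lrf: float = 0.01) -> float:
    """Learning rate at a given epoch under a linear decay from lr0 down to lr0 * lrf."""
    frac = (1 - epoch / epochs) * (1.0 - lrf) + lrf  # 1.0 at epoch 0, lrf at the last epoch
    return lr0 * frac

# With the defaults above, the rate falls from 0.01 to 0.0001 over 100 epochs
schedule = [linear_lr(e, epochs=100) for e in (0, 50, 100)]
```

Plotting such a schedule against your seg_loss curve can help tell whether the loss plateau coincides with the learning rate dropping too early or too late.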

Additionally, ensuring your dataset is well-represented, with annotations closely matching the model's expectation, is crucial. Sometimes discrepancies between training data and real-world application scenarios can lead to such issues. You might want to revisit your dataset conversion process to ensure compatibility with YOLOv8's expectations.

Lastly, utilizing data augmentation techniques could also help the model generalize better from the training dataset to real-world photos. YOLOv8 supports various augmentation strategies that could be beneficial.

Feel free to share more details about your setup, and maybe we can pinpoint more specific suggestions. Keep experimenting, and good luck! 🚀

Arno-Tu commented 5 months ago


Hi Glenn, thanks for your response.

In fact, I've tried adjusting the learning rate from the default of 0.01 down to 0.001, switching the optimizer from SGD to Adam, and re-training on the entire dataset. Since the categories of Stanford2D3DS don't match the pre-training categories, I did not load the existing pre-trained weights in the subsequent runs, but trained from scratch.

However, seg_loss still shows no obvious downward trend. Below is the result of training on the full dataset; I'm sorry there are only 10 epochs, as a single run takes too long.

train_loss

In addition, I completed a 100-epoch run on a small dataset, but the downward trend in seg_loss is still not obvious.

train_loss_2

As for the dataset label conversion you mentioned, I thought about that too. Since the label format required by YOLO segmentation models differs from Stanford2D3DS's, I wrote a conversion myself. Specifically, I extracted each instance boundary from the Stanford2D3DS semantic images and converted it to a corresponding txt label file in the format class_idx point1_x point1_y .... Here's an example:

camera_0a529ee09fc04037831902b83073862b_office_4_frame_22_domain_rgb.txt

camera_0a529ee09fc04037831902b83073862b_office_4_frame_22_domain_rgb

If it's convenient, you can use the following script to recreate each instance's semantic image from the txt file I provided:

ImageRebuilt.txt
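As a cross-check of that format, the label line for a single instance can be produced along these lines (a pure-Python sketch; the polygon points would come from your boundary-extraction step, and all names here are illustrative):

```python
def yolo_seg_line(class_idx: int, polygon_px: list[tuple[float, float]],
                  img_w: int, img_h: int) -> str:
    """Format one instance as a YOLO segmentation label line:
    'class_idx x1 y1 x2 y2 ...' with coordinates normalized to [0, 1]."""
    coords = []
    for x, y in polygon_px:
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return f"{class_idx} " + " ".join(coords)

# e.g. a triangular boundary for class 3 in a 1080x1080 image
line = yolo_seg_line(3, [(108, 108), (540, 108), (324, 540)], 1080, 1080)
```

Each coordinate is normalized by the image width/height, which is what YOLO's segmentation loader expects; unnormalized pixel coordinates in the txt files would be one explanation for a stubbornly high seg_loss.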

I hope the information I've provided this time helps solve the problem without troubling you too much.

glenn-jocher commented 5 months ago

Hey there! Thanks for providing such detailed information about your process! 🌈 Adjusting the learning rate and exploring different optimizers are great steps. When the seg_loss trend doesn't show clear improvement, it's a challenging situation indeed.

Given that you've started training from scratch due to category inconsistencies and have already undertaken a range of adjustments, here are a couple of quick suggestions:

I appreciate the detailed setup you've shared, and it's clear you've invested a lot of effort into making this work. If after trying out these suggestions the problem still persists, sharing a snippet of your model's definition and the exact training command might provide further insight. We're here to assist and want to see your project succeed! 🚀

github-actions[bot] commented 4 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐