ultralytics / ultralytics

Ultralytics YOLO11 🚀
https://docs.ultralytics.com
GNU Affero General Public License v3.0
32.59k stars 6.27k forks

RTDETR training using OBB #13444

Open shreejalt opened 5 months ago

shreejalt commented 5 months ago

Question

Hi all, I want to train RTDETR on the COCO dataset using the OBB format. Can someone tell me how to do that? If it is not supported, which files would I need to change?

Thanks, Shreejal

Additional

No response

glenn-jocher commented 5 months ago

Hi Shreejal,

Thank you for reaching out! It's great to see your interest in training RTDETR with the COCO dataset using the Oriented Bounding Box (OBB) format. Currently, the Ultralytics repository supports OBB datasets like DOTA-v2 and DOTA8, but you can certainly adapt the COCO dataset for OBB training with some modifications.

Here's a step-by-step guide to help you get started:

  1. Convert COCO Annotations to YOLO OBB Format: You'll need to convert your COCO dataset annotations to the YOLO OBB format. The YOLO OBB format specifies bounding boxes by their four corner points with coordinates normalized between 0 and 1:

    class_index x1 y1 x2 y2 x3 y3 x4 y4

    You can create a script similar to the convert_dota_to_yolo_obb function to convert COCO annotations to this format. Here's a basic example to get you started:

    import json
    import os
    from collections import defaultdict
    
    def convert_coco_to_yolo_obb(coco_annotation_path, output_dir):
        with open(coco_annotation_path, 'r') as f:
            coco_data = json.load(f)
        os.makedirs(output_dir, exist_ok=True)
    
        # Group annotations by image once instead of rescanning the list per image
        anns_by_image = defaultdict(list)
        for ann in coco_data['annotations']:
            anns_by_image[ann['image_id']].append(ann)
    
        for image in coco_data['images']:
            w, h = image['width'], image['height']
            # Swap the image extension for .txt, whatever the extension is
            label_name = os.path.splitext(os.path.basename(image['file_name']))[0] + '.txt'
            with open(os.path.join(output_dir, label_name), 'w') as out_file:
                for ann in anns_by_image[image['id']]:
                    # COCO boxes are axis-aligned, so these corners carry no rotation
                    corners = convert_bbox_to_obb(ann['bbox'], w, h)
                    coords = ' '.join(f'{v:.6f}' for v in corners)
                    # NOTE: COCO category ids are not contiguous; remap them to 0..nc-1 before training
                    out_file.write(f"{ann['category_id']} {coords}\n")
    
    def convert_bbox_to_obb(bbox, img_w, img_h):
        # COCO bbox is [x_min, y_min, width, height] in pixels; return normalized corners
        x, y, bw, bh = bbox
        return (x / img_w, y / img_h, (x + bw) / img_w, y / img_h,
                (x + bw) / img_w, (y + bh) / img_h, x / img_w, (y + bh) / img_h)
    
    # Example usage
    convert_coco_to_yolo_obb('path/to/coco/annotations.json', 'path/to/output/dir')
  2. Prepare the Dataset Configuration: Create a YAML configuration file for your dataset similar to the DOTA dataset configuration. This file should include paths to your images and the converted OBB annotations.

    train: path/to/train/images
    val: path/to/val/images
    
    nc: 80  # Number of classes in COCO
    names: ['person', 'bicycle', 'car', ...]  # List of class names
  3. Train the Model: With your dataset prepared, you can start training with the following command. Note that it uses the obb task with a YOLOv8-OBB model config:

    yolo obb train data=path/to/your_dataset.yaml model=yolov8n-obb.yaml epochs=100 imgsz=640
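
Before training, it can help to sanity-check the converted label files. The sketch below is not part of Ultralytics; `parse_obb_label` is a hypothetical helper that assumes the format from step 1: a class index followed by four corner points, space-separated and normalized to the range 0 to 1.

```python
def parse_obb_label(line: str) -> tuple[int, list[tuple[float, float]]]:
    """Parse one 'class x1 y1 x2 y2 x3 y3 x4 y4' line and validate it."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    class_index = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if any(not 0.0 <= v <= 1.0 for v in coords):
        raise ValueError("coordinates must be normalized to [0, 1]")
    # Pair up (x, y) values into four corner points.
    corners = list(zip(coords[0::2], coords[1::2]))
    return class_index, corners


cls, corners = parse_obb_label("0 0.1 0.2 0.5 0.2 0.5 0.6 0.1 0.6")
print(cls, corners)
```

Running this over every line of every label file catches unnormalized pixel coordinates or missing fields before they surface as a confusing training error.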

If you encounter any issues or need further assistance, please ensure you are using the latest versions of torch and ultralytics. If the problem persists, providing a minimum reproducible example will help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. Happy training! 🚀

shreejalt commented 5 months ago

Hi @glenn-jocher I want to train RTDETR with OBB. Can I train that model? I don't want to train the yolov8 model.

Thanks

glenn-jocher commented 5 months ago

Hi @shreejalt,

Thank you for your question! Training RTDETR with Oriented Bounding Boxes (OBB) is indeed possible, although it requires some custom setup since the primary focus has been on YOLO models. Here's how you can proceed:

Steps to Train RTDETR with OBB

  1. Convert Your Dataset to YOLO OBB Format: First, ensure your dataset annotations are in the YOLO OBB format. This format specifies bounding boxes by their four corner points with coordinates normalized between 0 and 1:

    class_index x1 y1 x2 y2 x3 y3 x4 y4

    If you need to convert your dataset (e.g., COCO) to this format, you can use a script similar to the one below:

    import json
    import os
    from collections import defaultdict
    
    def convert_coco_to_yolo_obb(coco_annotation_path, output_dir):
        with open(coco_annotation_path, 'r') as f:
            coco_data = json.load(f)
        os.makedirs(output_dir, exist_ok=True)
    
        # Group annotations by image once instead of rescanning the list per image
        anns_by_image = defaultdict(list)
        for ann in coco_data['annotations']:
            anns_by_image[ann['image_id']].append(ann)
    
        for image in coco_data['images']:
            w, h = image['width'], image['height']
            # Swap the image extension for .txt, whatever the extension is
            label_name = os.path.splitext(os.path.basename(image['file_name']))[0] + '.txt'
            with open(os.path.join(output_dir, label_name), 'w') as out_file:
                for ann in anns_by_image[image['id']]:
                    # COCO boxes are axis-aligned, so these corners carry no rotation
                    corners = convert_bbox_to_obb(ann['bbox'], w, h)
                    coords = ' '.join(f'{v:.6f}' for v in corners)
                    # NOTE: COCO category ids are not contiguous; remap them to 0..nc-1 before training
                    out_file.write(f"{ann['category_id']} {coords}\n")
    
    def convert_bbox_to_obb(bbox, img_w, img_h):
        # COCO bbox is [x_min, y_min, width, height] in pixels; return normalized corners
        x, y, bw, bh = bbox
        return (x / img_w, y / img_h, (x + bw) / img_w, y / img_h,
                (x + bw) / img_w, (y + bh) / img_h, x / img_w, (y + bh) / img_h)
    
    # Example usage
    convert_coco_to_yolo_obb('path/to/coco/annotations.json', 'path/to/output/dir')
  2. Prepare the Dataset Configuration: Create a YAML configuration file for your dataset, similar to the DOTA dataset configuration. This file should include paths to your images and the converted OBB annotations.

    train: path/to/train/images
    val: path/to/val/images
    
    nc: 80  # Number of classes in COCO
    names: ['person', 'bicycle', 'car', ...]  # List of class names
  3. Train the RTDETR Model: With your dataset prepared, you can train the RTDETR model. Although the primary focus has been on YOLO models, you can adapt the training script for RTDETR. Here's an example command:

    yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr_model.yaml epochs=100 imgsz=640
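
One detail worth checking in the conversion step: COCO `category_id` values are 1-based and have gaps, while YOLO-style labels expect contiguous 0-based class indices that line up with `names` in the dataset YAML. A minimal sketch of the remapping, assuming the standard COCO `categories` list of `{id, name}` entries (`build_class_index` is a hypothetical helper, not an Ultralytics function):

```python
def build_class_index(categories: list[dict]) -> tuple[dict[int, int], list[str]]:
    """Map arbitrary COCO category ids to contiguous 0-based class indices."""
    ordered = sorted(categories, key=lambda c: c["id"])
    id_to_index = {cat["id"]: i for i, cat in enumerate(ordered)}
    names = [cat["name"] for cat in ordered]
    return id_to_index, names


# Tiny stand-in for coco_data["categories"], deliberately out of order
categories = [
    {"id": 1, "name": "person"},
    {"id": 3, "name": "car"},
    {"id": 2, "name": "bicycle"},
]
id_to_index, names = build_class_index(categories)
print(id_to_index)  # {1: 0, 2: 1, 3: 2}
print(names)        # ['person', 'bicycle', 'car']
```

The returned `names` list can be pasted into the dataset YAML, and `id_to_index[ann['category_id']]` is the value to write as the class index in each label line.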

Additional Tips

If you encounter any issues or need further assistance, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. Happy training! 🚀

shreejalt commented 5 months ago

Hi @glenn-jocher, don't I need to change anything in the rtdetr yaml file? I saw that you have added an OBB layer in the yolov8-obb.yaml file. I don't see that in the rtdetr-x.yaml file.

Do I need to change the yaml file for this?

Kayzwer commented 5 months ago

Currently there is no OBB head combined with a transformer.

glenn-jocher commented 5 months ago

Hi @Kayzwer,

Great question! You're correct that the yolov8-obb.yaml file includes specific configurations for handling Oriented Bounding Boxes (OBB), and these configurations are not present in the rtdetr-x.yaml file. To train RTDETR with OBB, you will need to modify the RTDETR configuration to include an OBB head.

Here's a step-by-step guide to help you make the necessary changes:

  1. Add OBB Layer to RTDETR Configuration: You'll need to modify the RTDETR YAML file to include an OBB head. This involves adding the appropriate layers and configurations to handle OBB. Below is an example of how you might modify the rtdetr-x.yaml file:

    # rtdetr-x-obb.yaml
    backbone:
     # Backbone configuration
     ...
    
    head:
     type: OBBHead
     num_classes: 80  # Adjust based on your dataset
     ...
    
    # Additional configurations
    ...

    Ensure that the OBBHead is correctly defined and integrated into the model architecture. You may need to refer to the implementation details of the OBB head used in yolov8-obb.yaml and adapt it for RTDETR.

  2. Update Dataset Configuration: Ensure your dataset configuration YAML file is correctly set up for OBB. This includes specifying the paths to your images and annotations in the YOLO OBB format.

  3. Train the Model: With the modified configuration file, you can now train the RTDETR model with OBB support:

    yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr-x-obb.yaml epochs=100 imgsz=640
  4. Verify and Debug: Ensure that your modifications are correctly implemented by running a few test epochs and checking the outputs. If you encounter any issues, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

If you need further assistance or run into any issues, feel free to reach out. We're here to help! Happy training! 🚀

shreejalt commented 5 months ago

Thanks @Kayzwer. So do I need to change the output of the RTDETRDecoder to accommodate OBB, or can I put OBBHead directly into the config file to do the same?

Thanks

glenn-jocher commented 5 months ago

Hi @shreejalt,

Thank you for your patience and for bringing up this important question! To integrate Oriented Bounding Boxes (OBB) with RTDETR, you'll indeed need to make some modifications to the model configuration and possibly the model architecture itself.

Steps to Integrate OBB with RTDETR

  1. Modify the RTDETR Configuration: You will need to add an OBB head to the RTDETR configuration. This involves updating the YAML configuration file to include the OBB head, similar to how it's done in yolov8-obb.yaml.

    Here's an example of how you might modify the rtdetr-x.yaml file to include an OBB head:

    # rtdetr-x-obb.yaml
    backbone:
     # Backbone configuration
     ...
    
    head:
     type: OBBHead
     num_classes: 80  # Adjust based on your dataset
     ...
    
    # Additional configurations
    ...
  2. Update the Model Architecture: Depending on the implementation of the RTDETRDecoder, you might need to adjust its output to be compatible with the OBB head. This could involve modifying the decoder to output the necessary features for the OBB head.

    If you have a custom RTDETRDecoder, you might need to ensure it outputs the correct format for the OBB head. Here's a simplified example:

    import torch.nn as nn
    
    class RTDETRDecoder(nn.Module):
       def __init__(self, *args, **kwargs):  # constructor arguments elided in the original
           super().__init__()
           # Initialize layers
           ...
    
       def forward(self, x):
           # Forward pass logic
           ...
           return features_for_obb_head
  3. Train the Model: With the modified configuration and model architecture, you can now train the RTDETR model with OBB support:

    yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr-x-obb.yaml epochs=100 imgsz=640

Verify and Debug

Ensure that your modifications are correctly implemented by running a few test epochs and checking the outputs. If you encounter any issues, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

If you have any further questions or run into any issues, feel free to reach out. We're here to help! Happy training! 🚀

shreejalt commented 5 months ago

Hi @glenn-jocher. Can you point out what exactly needs to change in RTDETR Decoder? If I change this line self.dec_bbox_head = nn.ModuleList([MLP(hd, hd, 4, num_layers=3) for _ in range(ndl)]) and change from 4 to 5, will it support your OBB Head implemented in the repo?

glenn-jocher commented 5 months ago

Hi @shreejalt,

Thank you for your detailed question! To integrate the OBB head with the RTDETR model, you will indeed need to modify the RTDETRDecoder to output the necessary features for the OBB head.

Modifying the RTDETR Decoder

The line you mentioned:

self.dec_bbox_head = nn.ModuleList([MLP(hd, hd, 4, num_layers=3) for _ in range(ndl)])

is responsible for predicting the bounding box coordinates. Changing the output dimension from 4 to 5 is a step in the right direction, but for OBB, you typically need to output more parameters to represent the oriented bounding box (e.g., 8 coordinates for the four corners).

Example Modification

Here's an example of how you might modify the decoder to output the necessary features for the OBB head:

self.dec_bbox_head = nn.ModuleList([MLP(hd, hd, 8, num_layers=3) for _ in range(ndl)])

This change ensures that the decoder outputs 8 values per bounding box, which can then be used by the OBB head to represent the four corner points.
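
To make the 5-versus-8 question concrete: a rotated box can be described either by five values (center, width, height, angle) or by its four corner points, and the two carry the same information. The sketch below converts the former to the latter. It assumes the angle is in radians, measured from the x-axis toward the y-axis; `xywhr_to_corners` is an illustrative helper, not the Ultralytics implementation.

```python
import math


def xywhr_to_corners(cx, cy, w, h, angle):
    """Return the four corners of a box centered at (cx, cy), rotated by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Half-extent vectors along the box's width and height axes.
    wx, wy = (w / 2) * cos_a, (w / 2) * sin_a
    hx, hy = -(h / 2) * sin_a, (h / 2) * cos_a
    return [
        (cx - wx - hx, cy - wy - hy),
        (cx + wx - hx, cy + wy - hy),
        (cx + wx + hx, cy + wy + hy),
        (cx - wx + hx, cy - wy + hy),
    ]


corners = xywhr_to_corners(0.5, 0.5, 0.4, 0.2, 0.0)
print([(round(x, 3), round(y, 3)) for x, y in corners])
```

Because eight free corner coordinates can describe arbitrary quadrilaterals rather than rectangles, a regression head that predicts corners directly has to learn the rectangle constraint on its own, whereas the five-value form guarantees it by construction.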

Integrating with OBB Head

After modifying the decoder, ensure that the OBB head is correctly integrated into the model. The OBB head should be able to process the 8-dimensional output from the decoder and apply the necessary transformations.

Training the Model

With these changes, you should be able to train the RTDETR model with OBB support. Make sure to test the modified model thoroughly to ensure that it works as expected.

If you encounter any issues or need further assistance, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. We're here to help! Happy training! 🚀

shreejalt commented 5 months ago

@glenn-jocher Thanks a lot for your responses. It means a lot.

I saw that OBB is a class in the heads.py file, and that RTDETRDecoder is a class in the same file. If I make the changes you mentioned, do I need to copy the forward pass code of OBB into RTDETRDecoder to make it run?

I am trying that right now, but I have not been able to get it working.

Thanks, Shreejal

glenn-jocher commented 5 months ago

Hi @shreejalt,

Thank you for your kind words and for your detailed follow-up! 😊

To integrate the OBB head with the RTDETRDecoder, you don't need to copy the entire forward pass code of the OBB class into the RTDETRDecoder. Instead, you should ensure that the RTDETRDecoder outputs the necessary features that the OBB head can process.

Here's a more detailed approach:

Modifying the RTDETR Decoder

  1. Adjust Output Dimensions: Modify the RTDETRDecoder to output 8 values per bounding box for the OBB head:

    self.dec_bbox_head = nn.ModuleList([MLP(hd, hd, 8, num_layers=3) for _ in range(ndl)])
  2. Integrate OBB Head: Ensure that the OBB head is correctly integrated into the model. You can do this by adding the OBB head as a separate module and passing the decoder's output to it.

    import torch.nn as nn
    
    class RTDETR(nn.Module):
       def __init__(self, *args, **kwargs):  # constructor arguments elided in the original
           super().__init__()
           self.backbone = ...  # Backbone configuration
           self.decoder = RTDETRDecoder(...)
           self.obb_head = OBBHead(num_classes=80)  # Adjust based on your dataset
    
       def forward(self, x):
           features = self.backbone(x)
           decoder_output = self.decoder(features)
           obb_output = self.obb_head(decoder_output)
           return obb_output

Training the Model

With these changes, you should be able to train the RTDETR model with OBB support:

yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr-x-obb.yaml epochs=100 imgsz=640

Testing and Debugging

Ensure that your modifications are correctly implemented by running a few test epochs and checking the outputs. If you encounter any issues, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. We're here to help! Happy training! 🚀

shreejalt commented 5 months ago

@glenn-jocher Thanks for your detailed response. The OBBHead you are referring to is the OBB class in the head.py file, correct? It also takes other arguments apart from nc.

Thanks,

glenn-jocher commented 5 months ago

Hi @shreejalt,

You're welcome! Yes, you are correct. The OBBHead I referred to is indeed the OBB class in the head.py file. This class is designed to handle Oriented Bounding Boxes and takes additional arguments apart from nc (number of classes).

To integrate the OBB head with the RTDETRDecoder, you can follow these steps:

  1. Initialize the OBB Head: Ensure you initialize the OBB head with the required arguments. For example:

    import torch.nn as nn
    from ultralytics.nn.modules.head import OBB, RTDETRDecoder
    
    class RTDETR(nn.Module):
       def __init__(self, *args, **kwargs):  # constructor arguments elided in the original
           super().__init__()
           self.backbone = ...  # Backbone configuration
           self.decoder = RTDETRDecoder(...)
           self.obb_head = OBB(nc=80, ne=1, ch=[256, 512, 1024])  # Adjust based on your dataset and model architecture
    
       def forward(self, x):
           features = self.backbone(x)
           decoder_output = self.decoder(features)
           obb_output = self.obb_head(decoder_output)
           return obb_output
  2. Modify the Decoder Output: Ensure the RTDETRDecoder outputs the necessary features for the OBB head. This might involve adjusting the output dimensions as discussed earlier.

  3. Train the Model: With these changes, you should be able to train the RTDETR model with OBB support:

    yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr-x-obb.yaml epochs=100 imgsz=640

If you encounter any issues, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. We're here to help! Happy training! 🚀

shreejalt commented 4 months ago

@glenn-jocher Thanks for your responses. It means a lot. You mentioned that the output of the OBBHead is 8x1, but in YOLOv8-OBB the output is x, y, w, h, and the rotation angle, which adds up to 5 values. So do I need to change it for RTDETR training?

glenn-jocher commented 4 months ago

Hi @shreejalt,

Thank you for your detailed observations and for your patience! You are correct that in YOLOv8-OBB, the output typically includes x, y, w, h, and the rotation angle, summing up to 5 parameters.

For RTDETR training with OBB, you should align the output dimensions accordingly. Here's how you can adjust the RTDETRDecoder:

  1. Adjust Output Dimensions: Modify the RTDETRDecoder to output 5 values per bounding box:

    self.dec_bbox_head = nn.ModuleList([MLP(hd, hd, 5, num_layers=3) for _ in range(ndl)])
  2. Integrate OBB Head: Ensure the OBB head processes these 5-dimensional outputs correctly. Here's an example integration:

    import torch.nn as nn
    from ultralytics.nn.modules.head import OBB, RTDETRDecoder
    
    class RTDETR(nn.Module):
       def __init__(self, *args, **kwargs):  # constructor arguments elided in the original
           super().__init__()
           self.backbone = ...  # Backbone configuration
           self.decoder = RTDETRDecoder(...)
           self.obb_head = OBB(nc=80, ne=1, ch=[256, 512, 1024])  # Adjust based on your dataset and model architecture
    
       def forward(self, x):
           features = self.backbone(x)
           decoder_output = self.decoder(features)
           obb_output = self.obb_head(decoder_output)
           return obb_output
  3. Train the Model: With these changes, you should be able to train the RTDETR model with OBB support:

    yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr-x-obb.yaml epochs=100 imgsz=640
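
Since the labels on disk use four corner points while a 5-value head regresses center, size, and angle, the targets need converting at some point. The following is a rough sketch of that conversion, assuming the corners are listed in order around the box and form a true rectangle; `corners_to_xywhr` is a hypothetical helper, not the routine Ultralytics uses internally.

```python
import math


def corners_to_xywhr(corners):
    """Recover (cx, cy, w, h, angle) from four corners listed in order around the box."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = corners
    # The center of a rectangle is the mean of its corners.
    cx = (x1 + x2 + x3 + x4) / 4
    cy = (y1 + y2 + y3 + y4) / 4
    w = math.hypot(x2 - x1, y2 - y1)  # length of edge p1 -> p2
    h = math.hypot(x3 - x2, y3 - y2)  # length of edge p2 -> p3
    angle = math.atan2(y2 - y1, x2 - x1)  # direction of edge p1 -> p2, in radians
    return cx, cy, w, h, angle


box = corners_to_xywhr([(0.3, 0.4), (0.7, 0.4), (0.7, 0.6), (0.3, 0.6)])
print([round(v, 3) for v in box])
```

Note that the result depends on which corner comes first: the same rectangle can be written as (w, h, angle) or as (h, w, angle + pi/2), so a training pipeline needs one consistent angle convention for targets and predictions.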

If you encounter any issues, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. We're here to help! Happy training! 🚀

shreejalt commented 4 months ago

@glenn-jocher Thanks for your help. I have one last question. After I change dec_bbox_head to output 5 values, can I add OBBHead directly after RTDETRDecoder, the way it is done in yolov8-obb.yaml? I don't need to write a separate RTDETR class for this, correct?

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 12

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2f, [1024]] # 21 (P5/32-large)

  - [[15, 18, 21], 1, OBB, [nc, 1]] # OBB(P3, P4, P5)

In the above file, the OBB head is specified directly in the cfg file. So, after I change dec_bbox_head to output 5 values, can I add a new line after the RTDETRDecoder head?

If you can help me, it would be very much appreciated.

Thanks a lot for your responses again.

glenn-jocher commented 4 months ago

Hi @shreejalt,

Thank you for your detailed follow-up! 😊

Yes, you are on the right track. After modifying the dec_bbox_head to output 5 values, you can indeed integrate the OBB head directly in your configuration file, similar to how it's done in yolov8-obb.yaml.

Integration Steps

  1. Modify dec_bbox_head: Ensure your RTDETRDecoder outputs 5 values per bounding box:

    self.dec_bbox_head = nn.ModuleList([MLP(hd, hd, 5, num_layers=3) for _ in range(ndl)])
  2. Update Configuration File: Add the OBB head directly in your configuration file after the RTDETRDecoder head. Here's an example snippet:

    # rtdetr-x-obb.yaml
    backbone:
     # Backbone configuration
     ...
    
    head:
     type: RTDETRDecoder
     ...
    
    # Add OBB head
    - [[15, 18, 21], 1, OBB, [nc, 1]]  # OBB(P3, P4, P5)
  3. Train the Model: With these changes, you should be able to train the RTDETR model with OBB support:

    yolo detect train data=path/to/your_dataset.yaml model=path/to/rtdetr-x-obb.yaml epochs=100 imgsz=640

Example Configuration

Here's a more complete example of how your configuration might look:

# rtdetr-x-obb.yaml
backbone:
  # Backbone configuration
  ...

head:
  type: RTDETRDecoder
  nc: 80
  ch: [256, 512, 1024]
  hd: 256
  nq: 300
  ndp: 4
  nh: 8
  ndl: 6
  d_ffn: 1024
  dropout: 0.0
  act: nn.ReLU()
  eval_idx: -1
  nd: 100
  label_noise_ratio: 0.5
  box_noise_scale: 1.0
  learnt_init_query: False

# Add OBB head
- [[15, 18, 21], 1, OBB, [nc, 1]]  # OBB(P3, P4, P5)

Final Notes

By following these steps, you should be able to integrate the OBB head seamlessly with the RTDETR model. If you encounter any issues or need further assistance, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.

Feel free to reach out with any additional questions. We're here to help! Happy training! 🚀

shreejalt commented 4 months ago

Hi @glenn-jocher, both the OBB and RTDETRDecoder classes take nc as an argument. Don't I need to remove nc from either the RTDETR decoder or OBBHead? I ask because this configuration file is giving an error.

Also, the RTDETRDecoder and OBB heads use different losses. I am not sure what changes I need to make to the loss function in RTDETR for OBB training.

glenn-jocher commented 4 months ago

Hi @shreejalt,

Thank you for your thoughtful questions! 😊

Addressing nc in RTDETRDecoder and OBBHead

You are correct that both RTDETRDecoder and OBBHead have nc (number of classes). You don't need to remove nc from either class, but ensure they are consistent and correctly integrated in your configuration.

Integrating Loss Functions

For the loss functions, you will need to ensure that both the RTDETRDecoder and OBB head losses are correctly calculated and combined. Here's a concise approach:

  1. Modify the Forward Pass: Ensure the forward pass of your model returns outputs compatible with both loss functions.

  2. Combine Losses: In your training loop, calculate both losses and combine them. Here's a simplified example:

    def compute_loss(pred, targets):
       loss_rtdetr = rtdetr_loss(pred['rtdetr'], targets)
       loss_obb = obb_loss(pred['obb'], targets)
       total_loss = loss_rtdetr + loss_obb
       return total_loss
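
If the two terms are combined, they will usually need relative weights, since their magnitudes can differ a lot. A minimal sketch of a weighted sum follows; the loss names, values, and weights are made up for illustration, and `combine_losses` is a hypothetical helper rather than an Ultralytics API.

```python
def combine_losses(losses: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of named loss terms; terms without a weight default to 1.0."""
    return sum(weights.get(name, 1.0) * value for name, value in losses.items())


# Hypothetical per-batch loss values and weights, for illustration only
losses = {"rtdetr": 1.2, "obb": 0.8}
weights = {"rtdetr": 0.5, "obb": 1.0}
print(combine_losses(losses, weights))
```

Setting a weight to 0.0 switches that term off entirely, which is one simple way to experiment with training on a single loss without restructuring the code.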

Example Configuration

Ensure your configuration file correctly integrates both heads:

head:
  type: RTDETRDecoder
  nc: 80
  ...

# Add OBB head
- [[15, 18, 21], 1, OBB, [nc, 1]]  # OBB(P3, P4, P5)

Reproducible Example

If you encounter specific errors, providing a minimum reproducible example can help us diagnose the issue more effectively. You can refer to our minimum reproducible example guide for more details.

Feel free to update to the latest versions of the packages to ensure compatibility. If you have any further questions, we're here to help! 🚀

shreejalt commented 4 months ago

@glenn-jocher Thanks for your reply.

I am confused about why I need to include the RTDETR loss, since it is designed for axis-aligned bounding boxes. Can't I just use the OBB loss for training?

glenn-jocher commented 4 months ago

Hi @shreejalt,

Great question! You can indeed focus on using the OBB loss for training if your primary objective is to optimize for oriented bounding boxes. Here's a concise approach:

  1. Modify the Forward Pass: Ensure your model's forward pass outputs are compatible with the OBB loss function.

  2. Use OBB Loss: In your training loop, compute only the OBB loss:

    def compute_loss(pred, targets):
       loss_obb = obb_loss(pred, targets)
       return loss_obb
  3. Configuration: Ensure your configuration file integrates the OBB head correctly:

    head:
     type: RTDETRDecoder
     nc: 80
     ...
    
    # Add OBB head
    - [[15, 18, 21], 1, OBB, [nc, 1]]  # OBB(P3, P4, P5)

By focusing on the OBB loss, you can streamline the training process for oriented bounding boxes. If you encounter any issues, please provide a minimum reproducible example to help us investigate further. You can refer to our minimum reproducible example guide for more details.
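
For intuition on why a loss built around axis-aligned boxes is a loose fit for rotated objects, the sketch below computes the axis-aligned box that encloses a rotated one and compares areas. It assumes the (cx, cy, w, h, angle) parametrization with the angle in radians; `enclosing_aabb` is an illustrative helper and not part of Ultralytics.

```python
import math


def enclosing_aabb(cx, cy, w, h, angle):
    """Axis-aligned box (x_min, y_min, x_max, y_max) that encloses a rotated box."""
    cos_a, sin_a = abs(math.cos(angle)), abs(math.sin(angle))
    # Project the rotated width and height onto the image axes.
    half_w = (w * cos_a + h * sin_a) / 2
    half_h = (w * sin_a + h * cos_a) / 2
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


x0, y0, x1, y1 = enclosing_aabb(0.5, 0.5, 0.4, 0.1, math.pi / 4)
aabb_area = (x1 - x0) * (y1 - y0)
print(f"rotated area={0.4 * 0.1:.3f}, enclosing AABB area={aabb_area:.3f}")
# prints: rotated area=0.040, enclosing AABB area=0.125
```

For this thin box at 45 degrees the enclosing axis-aligned box is roughly three times larger than the object, so an overlap measure computed on axis-aligned boxes says little about how well the orientation was predicted.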

Feel free to reach out with any additional questions. We're here to help! 🚀

zxjzzz commented 3 months ago

Hi, how do you do it?

glenn-jocher commented 3 months ago

Hi @zxjzzz,

You can indeed focus solely on the OBB loss for training if your primary goal is to optimize for oriented bounding boxes. This approach simplifies the process and ensures that the model is specifically tuned for OBB. Ensure your model's forward pass outputs are compatible with the OBB loss function, and use the OBB loss in your training loop. If you encounter any issues, please verify reproducibility with the latest package versions. Feel free to reach out with further questions.

github-actions[bot] commented 4 days ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐