ultralytics / hub

Ultralytics HUB tutorials and support
https://hub.ultralytics.com
GNU Affero General Public License v3.0

Thermal facial expressions #755

Open Mohammad96yahia opened 4 months ago

Mohammad96yahia commented 4 months ago

Search before asking

Question

Hi,

I have almost 1600 thermal images in .bmp format. I labeled them with LabelImg into 7 classes [happy, sad, disgust, angry, surprised, fear, normal], and each image has a corresponding XML file. Can I use Ultralytics HUB to train YOLOv8 for this task?

thank you in advance

Additional

No response

github-actions[bot] commented 4 months ago

👋 Hello @Mohammad96yahia, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more.

If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

pderrenger commented 4 months ago

@Mohammad96yahia hi there,

Thank you for reaching out and for providing details about your project. Yes, you can absolutely use the Ultralytics HUB to train YOLOv8 on your thermal images for facial expression recognition.

To get started, you'll need to convert your labeled data from XML format to the YOLO format, which consists of text files with the same name as your images but with a .txt extension. Each line in these text files should contain the class index and the bounding box coordinates normalized to the image dimensions.

Here's a brief outline of the steps you can follow:

  1. Convert XML to YOLO Format: You can use a script to convert your XML annotations to YOLO format. Here's an example script to help you get started:

    import os
    import xml.etree.ElementTree as ET

    def convert(size, box):
        # Normalize (xmin, xmax, ymin, ymax) pixel coordinates to the YOLO
        # (x_center, y_center, width, height) format, all scaled to 0..1.
        # Note: the "- 1" offset found in legacy VOC conversion scripts
        # (written for 1-based coordinates) is intentionally omitted here.
        dw = 1.0 / size[0]
        dh = 1.0 / size[1]
        x = (box[0] + box[1]) / 2.0
        y = (box[2] + box[3]) / 2.0
        w = box[1] - box[0]
        h = box[3] - box[2]
        return (x * dw, y * dh, w * dw, h * dh)

    def convert_annotation(image_id, classes):
        with open(f'Annotations/{image_id}.xml') as in_file, \
             open(f'labels/{image_id}.txt', 'w') as out_file:
            root = ET.parse(in_file).getroot()
            size = root.find('size')
            w = int(size.find('width').text)
            h = int(size.find('height').text)

            for obj in root.iter('object'):
                difficult = obj.find('difficult')  # tag may be absent in some XML files
                cls = obj.find('name').text
                if cls not in classes or (difficult is not None and int(difficult.text) == 1):
                    continue
                cls_id = classes.index(cls)
                xmlbox = obj.find('bndbox')
                b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                     float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
                bb = convert((w, h), b)
                out_file.write(str(cls_id) + " " + " ".join(str(a) for a in bb) + '\n')

    if __name__ == "__main__":
        classes = ["happy", "sad", "disgust", "angry", "surprised", "fear", "normal"]
        os.makedirs('labels/', exist_ok=True)
        for filename in os.listdir('Annotations'):
            image_id = os.path.splitext(filename)[0]
            convert_annotation(image_id, classes)
  2. Upload Data to Ultralytics HUB: Once your data is in the correct format, you can upload your images and labels to the Ultralytics HUB. You can follow the Ultralytics HUB documentation for detailed steps on how to upload and manage your datasets.

  3. Train YOLOv8: After uploading your data, you can configure and start training your YOLOv8 model directly from the HUB interface. The HUB provides an intuitive UI to set your training parameters and monitor the training process.
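As a quick sanity check after step 1, you can denormalize a converted label back to pixel coordinates and compare it against the original XML. A minimal sketch (the label line and the 304x230 image size below are hypothetical; substitute your own values):

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line back to (cls, xmin, ymin, xmax, ymax) in pixels."""
    cls, x, y, w, h = line.split()
    x, y = float(x) * img_w, float(y) * img_h      # box center in pixels
    w, h = float(w) * img_w, float(h) * img_h      # box size in pixels
    return int(cls), x - w / 2, y - h / 2, x + w / 2, y + h / 2

# Example: a face box on a hypothetical 304x230 thermal image
cls, xmin, ymin, xmax, ymax = yolo_to_pixels("5 0.4852 0.3978 0.5888 0.7870", 304, 230)
print(cls, round(xmin), round(ymin), round(xmax), round(ymax))
# → 5 58 1 237 182
```

If the recovered corners do not match the values in the XML (up to rounding), the conversion has an offset or axis-order bug.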

Please ensure you are using the latest versions of the Ultralytics packages to avoid any compatibility issues. If you encounter any problems or need further assistance, feel free to provide more details or a reproducible example, as outlined in our minimum reproducible example guide.

Best of luck with your project, and feel free to reach out if you have any more questions! 😊

Mohammad96yahia commented 4 months ago

Hi @pderrenger,

thank you, I'm happy to hear this, but I have two questions:

1- Based on the documentation, I need to upload the data in two folders, one with the images and the other with the labels in YOLO format, and in the images folder I don't need to separate the images into subfolders according to their labels; instead, all images should be together in one folder. Is that correct?

2- Could you please check whether my XML-to-YOLO conversion is correct? This is an example XML file:

    <annotation>
        <folder>Neuer Ordner</folder>
        <filename>2104U16133755.bmp</filename>
        <path>C:\Users\MohammadYahia\Desktop\ll\Neuer Ordner\2104U16133755.bmp</path>
        <source>
            <database>Unknown</database>
        </source>
        <size>
            <width>304</width>
            <height>230</height>
            <depth>3</depth>
        </size>
        <segmented>0</segmented>
        <object>
            <name>sadness</name>
            <pose>Unspecified</pose>
            <truncated>1</truncated>
            <difficult>0</difficult>
            <bndbox>
                <xmin>58</xmin>
                <ymin>1</ymin>
                <xmax>237</xmax>
                <ymax>182</ymax>
            </bndbox>
        </object>
    </annotation>


And this is the YOLO label for this XML file: 5 0.4819078947368421 0.3934782608695652 0.5888157894736842 0.7869565217391304

Thank you in advance

pderrenger commented 4 months ago

Hi @Mohammad96yahia,

Thank you for your questions! I'm glad to assist you further. 😊

  1. Data Upload Structure: You are correct. When uploading your data to the Ultralytics HUB, you should place all your images in one directory and all your labels in another. There is no need to separate the images into subdirectories based on their labels. This structure helps streamline the dataset management process.

  2. XML to YOLO Format Conversion: Let's verify your XML to YOLO format conversion. Based on the XML example you provided, here is a breakdown of the conversion process:

    • Image Dimensions: Width = 304, Height = 230
    • Bounding Box Coordinates: xmin = 58, ymin = 1, xmax = 237, ymax = 182

    The YOLO format requires normalized coordinates (between 0 and 1) and the center of the bounding box along with its width and height. Here's the conversion formula and the resulting values:

    • Center X: (xmin + xmax) / 2 / width = (58 + 237) / 2 / 304 ≈ 0.4852
    • Center Y: (ymin + ymax) / 2 / height = (1 + 182) / 2 / 230 ≈ 0.3978
    • Width: (xmax - xmin) / width = (237 - 58) / 304 ≈ 0.5888
    • Height: (ymax - ymin) / height = (182 - 1) / 230 ≈ 0.7870

    Your provided YOLO format:

    5 0.4819078947368421 0.3934782608695652 0.5888157894736842 0.7869565217391304

    It looks like there is a slight discrepancy in the center coordinates. Your values (0.4819, 0.3935) are exactly what you get if 1 is subtracted from the pixel center before normalizing, i.e. the `- 1` offset found in legacy VOC conversion scripts written for 1-based coordinates; for 0-based annotations that offset should be dropped. Also note that your XML uses the class name "sadness" while your class list contains "sad": the conversion script will skip objects whose name is not in the class list, so make sure the names match and that index 5 corresponds to the intended class. Here is the corrected line for the XML above:

    5 0.4852 0.3978 0.5888 0.7870
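For completeness, the arithmetic above can be checked in a few lines of Python:

```python
# Bounding box from the XML example and the image dimensions
xmin, ymin, xmax, ymax = 58, 1, 237, 182
W, H = 304, 230

x_c = (xmin + xmax) / 2 / W  # normalized center x
y_c = (ymin + ymax) / 2 / H  # normalized center y
w = (xmax - xmin) / W        # normalized width
h = (ymax - ymin) / H        # normalized height

print(round(x_c, 4), round(y_c, 4), round(w, 4), round(h, 4))
# → 0.4852 0.3978 0.5888 0.787
```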

Please make sure you are using the latest versions of the Ultralytics packages. Keeping your software up to date ensures you benefit from the latest features and bug fixes.

If you have any further questions or need additional assistance, feel free to ask. We're here to help!

Mohammad96yahia commented 4 months ago

Hi,

I uploaded the data to Roboflow because it makes labeling the data for use on Ultralytics HUB easier and more accurate. However, when I went to upload the data to Ultralytics HUB, I was surprised that for classification I must organize the images so that all images with the same label are in one folder, so I ended up with 7 separate folders for the 7 labels [happy, sad, disgust, angry, surprised, fear, normal], without the XML files or the YOLO-format labels. If I instead want to train the model for object detection, I must upload all the images in one folder and all the labels in another.

I trained the model for classification and got good results, but I think I can do better: over 100 or 200 epochs the validation loss stays between 1.700 and 1.890 while the training loss reaches 0.092. Any suggestions from you as an expert would be appreciated. I should mention that before uploading to the HUB, I preprocessed the data on the Roboflow website (Auto-Orient: Applied; Resize: Fit within 300x300; Grayscale: Applied) and applied augmentation (90° Rotate: Clockwise and Counter-Clockwise; Exposure: between -12% and +12%), bringing the total number of images to 2687.

For the training, I used YOLOv8m-CLS (Epochs: 300, Image Size: 205, Patience: 100) which gave me the best result compared to others.

Finally, I want to cite your amazing work in my report, specifically in the methodology chapter, but I don't know how. Could you provide me with any papers or literature to cite, and material explaining how classification and object detection work, ideally with mathematical equations and a scientific explanation?

Thank you in advance, and sorry for the long message

pderrenger commented 4 months ago

Hi @Mohammad96yahia,

Thank you for your detailed message and for sharing your experience with using Ultralytics HUB and Roboflow. I'm glad to hear that you achieved good results with your classification model! 😊

Data Upload and Training

For classification tasks, it's indeed necessary to organize your images into separate directories for each class. This structure helps the model understand the different categories during training. For object detection, as you mentioned, all images should be in one directory, and the corresponding labels should be in another.
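If your images currently sit in one flat folder, a small helper along these lines can compute the per-class layout described above. This is only an illustration: the `dataset/train` root and the filename-to-label mapping are assumptions, not part of any HUB API.

```python
import os
import shutil

CLASSES = ["happy", "sad", "disgust", "angry", "surprised", "fear", "normal"]

def class_folder_layout(labels_by_file, root="dataset/train"):
    """Map each image filename to its per-class destination path,
    e.g. dataset/train/happy/img1.bmp."""
    layout = {}
    for filename, label in labels_by_file.items():
        if label not in CLASSES:
            raise ValueError(f"unknown class: {label}")
        layout[filename] = os.path.join(root, label, filename)
    return layout

def apply_layout(layout, src_dir="images"):
    """Copy files from the flat source folder into the per-class folders."""
    for filename, dest in layout.items():
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy(os.path.join(src_dir, filename), dest)

print(class_folder_layout({"2104U16133755.bmp": "sad"}))
```

Repeat the same layout for a validation split (e.g. `dataset/val`) before zipping and uploading.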

Improving Model Performance

Given your current results, here are a few suggestions to potentially improve your model's performance:

  1. Learning Rate Adjustment: Sometimes, tweaking the learning rate can help the model converge better. You might want to experiment with different learning rates.

  2. Data Augmentation: While you've already applied some augmentations, consider adding more variations such as flipping, cropping, or color jittering. This can help the model generalize better.

  3. Batch Size: Experiment with different batch sizes. Sometimes, a smaller or larger batch size can impact the training dynamics and lead to better performance.

  4. Early Stopping and Checkpoints: Use early stopping to prevent overfitting. Additionally, saving model checkpoints can help you revert to the best-performing model during training.
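To illustrate point 4: patience-based early stopping boils down to tracking the best validation loss and stopping once it has not improved for a set number of epochs. The HUB trainer handles this for you via its patience setting; the sketch below is only to show the logic.

```python
def train_with_early_stopping(val_losses, patience=5):
    """Return the epoch index of the best (lowest) validation loss, stopping
    once the loss has not improved for `patience` consecutive epochs."""
    best_loss, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0  # save a checkpoint here
        else:
            wait += 1
            if wait >= patience:
                break  # no improvement for `patience` epochs: stop
    return best_epoch

# Validation loss plateaus after epoch 3, so training stops early
print(train_with_early_stopping([1.9, 1.8, 1.75, 1.70, 1.71, 1.72, 1.73, 1.74, 1.75], patience=3))
# → 3
```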

Citing Ultralytics in Your Work

We appreciate your intention to mention Ultralytics in your reports! You can cite our work using the following paper:

@article{glenn2023yolov8,
  title={YOLOv8: A State-of-the-Art Object Detection and Classification Model},
  author={Glenn Jocher and others},
  journal={Ultralytics},
  year={2023}
}

For a detailed explanation of how YOLO models work, including mathematical equations and scientific explanations, you can refer to the original YOLO papers:

  • You Only Look Once: Unified, Real-Time Object Detection (Redmon et al., 2016)
  • YOLO9000: Better, Faster, Stronger (Redmon and Farhadi, 2017)
  • YOLOv3: An Incremental Improvement (Redmon and Farhadi, 2018)

These papers provide a comprehensive overview of the architecture, training process, and underlying mathematics of YOLO models.

Conclusion

I hope these suggestions help you further improve your model's performance. If you have any more questions or need further assistance, feel free to ask. We're here to help!

Best of luck with your project, and thank you for your kind words about our work. The YOLO community and the Ultralytics team appreciate your support! 🚀

github-actions[bot] commented 3 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the Ultralytics HUB Docs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

glenn-jocher commented 3 months ago

Hi @Mohammad96yahia,

Thank you for your detailed message and for sharing your experience with Ultralytics HUB! It's great to hear that you've achieved good results with your classification model. Let's address your points one by one:

  1. Data Organization for Classification and Object Detection:

    • For classification tasks, you are correct that images should be organized into separate folders based on their labels. This structure helps the model understand the categories during training.
    • For object detection tasks, all images should be in one folder, and the corresponding labels should be in another folder, formatted in YOLO format.
  2. XML to YOLO Format Conversion:

    • Your XML example and the corresponding YOLO format look correct. To ensure accuracy, you can use tools like xml_to_yolo.py scripts available online or in repositories to automate and verify the conversion process.
  3. Training Suggestions:

    • Given that your validation loss is still relatively high, you might consider the following:
      • Learning Rate Adjustment: Experiment with different learning rates. Sometimes, a smaller learning rate can help the model converge better.
      • Data Augmentation: You've already applied some augmentations, which is great. You might want to explore additional augmentations like flipping, cropping, or color jittering.
      • Early Stopping and Checkpoints: Use early stopping to prevent overfitting and save the best model based on validation loss.
  4. Citing Ultralytics in Your Reports:

    • We appreciate your intention to mention Ultralytics in your reports! You can refer to our YOLOv5 paper and the Ultralytics GitHub repository for detailed methodology and citations. These resources provide comprehensive explanations and mathematical formulations of the YOLO architecture.

If you have any further questions or need additional assistance, feel free to ask. We're here to help!