ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

YOLO works, but how do I extract the data? #12447

Closed: DylDevs closed this issue 11 months ago

DylDevs commented 11 months ago

Search before asking

Question

I have a custom-trained YOLOv6 model for detecting vehicles in a game. I need to know how to get this data from the model: the bounding box coordinates, the object class, the confidence score, and, if possible, the distance to each detected vehicle.

Thanks in advance.

Additional

No response

github-actions[bot] commented 11 months ago

👋 Hello @DylDevs, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt dependencies installed, including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU (Google Colab, Kaggle)
Google Cloud Deep Learning VM (see the GCP Quickstart Guide)
Amazon Deep Learning AMI (see the AWS Quickstart Guide)
Docker Image (see the Docker Quickstart Guide)

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
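
For example, a minimal inference sketch (the yolov8n.pt checkpoint and image path below are placeholders):

from ultralytics import YOLO

model = YOLO("yolov8n.pt")     # load a pretrained detection model (placeholder checkpoint)
results = model("image.jpg")   # run inference; returns a list of Results, one per image
annotated = results[0].plot()  # BGR numpy array with boxes and labels drawn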

glenn-jocher commented 11 months ago

@DylDevs hi there! 😊

Great to hear that YOLOv5 is working for you! To extract the data you're looking for, you can refer to our documentation at https://docs.ultralytics.com/yolov5/ for guidance on accessing the bounding box coordinates, object class, and confidence scores. As for distance estimation, YOLOv5 does not directly output object distance, but you can explore methods like depth estimation or using stereo vision to infer distance from the object. Good luck with your project!

DylDevs commented 11 months ago

Sorry if I'm blind or something, but I've been looking for a while and I can't find anything regarding those topics.

glenn-jocher commented 11 months ago

@DylDevs No worries at all! You can access the bounding box coordinates, object class, and confidence scores from the inference results using the returned detections. As for distance estimation, while it's not directly provided by YOLOv5, you can explore computer vision techniques like monocular depth estimation or stereo vision for inferring object distance based on the bounding box size and other visual cues. Let me know if you need more information!
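
As a concrete illustration of the bounding-box-size approach, here is a rough pinhole-camera sketch; the vehicle height and focal length below are assumptions you would need to calibrate for your game:

# Rough monocular distance estimate from bounding box height (pinhole model):
# distance = real_height * focal_length_px / box_height_px
REAL_VEHICLE_HEIGHT_M = 1.5  # assumed average vehicle height in meters
FOCAL_LENGTH_PX = 700.0      # assumed focal length in pixels (calibrate for your setup)

def estimate_distance_m(box_height_px: float) -> float:
    # Taller boxes mean closer vehicles; this ignores camera pitch, lens distortion, etc.
    return REAL_VEHICLE_HEIGHT_M * FOCAL_LENGTH_PX / box_height_px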

DylDevs commented 11 months ago

Well, I'm not exactly sure what you mean. If I print the results I get this output; is this where I get the info somehow?

boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: None
names: {0: 'car', 1: 'truck', 2: 'bus'}
orig_img: array([[[ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0],
        ...,
        [ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0]],

       [[ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0],
        ...,
        [ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0]],

       [[ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0],
        ...,
        [ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0]],

       ...,

       [[ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[ 24,  12,   0],
        [ 24,  12,   0],
        [ 24,  12,   0],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]]], dtype=uint8)
orig_shape: (233, 548)
path: 'image0.jpg'
probs: None
save_dir: None
speed: {'preprocess': 2.9969215393066406, 'inference': 48.00105094909668, 'postprocess': 0.9989738464355469}]

glenn-jocher commented 11 months ago

@DylDevs The output you shared is the detection results. To access the information you need, use the boxes and names attributes: boxes contains the bounding box coordinates, and names maps class indices to class names. The confidence score for each detection is stored on the boxes object as well (probs is only populated for classification tasks, which is why it shows None in your output).

If you need specific guidance on how to extract this information, you can refer to the documentation or feel free to ask for further assistance.
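
For reference, a minimal sketch of reading these attributes with the ultralytics package (note that model(...) returns a list of Results, so the first image's results are at index 0):

result = results[0]          # model() returns a list of Results, one per image
names = result.names         # dict mapping class index to class name
xyxy = result.boxes.xyxy     # (N, 4) tensor of box corners
conf = result.boxes.conf     # (N,) tensor of confidence scores
cls = result.boxes.cls       # (N,) tensor of class indices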

DylDevs commented 11 months ago

Well, I am having trouble extracting the data. If I use this to get the data:

names = results[3]
boxes = results[0]
confidence = results[7]

It gives me this error:

    names = results[3]
            ~~~~~~~^^^
IndexError: list index out of range

But if you look at my prior message, that index is in fact in range.

After that, I did some research and came up with this:

results_ordered = OrderedDict(sorted(results[0].items(), key=lambda t: t[0]))
names = list(results_ordered).index("names")
boxes = list(results_ordered).index("boxes")
confidence = list(results_ordered).index("probs")

Now this is what I get:

    raise AttributeError(f"'{name}' object has no attribute '{attr}'. See valid attributes below.\n{self.__doc__}")
AttributeError: 'Results' object has no attribute 'items'. See valid attributes below.

    A class for storing and manipulating inference results.

    Args:
        orig_img (numpy.ndarray): The original image as a numpy array.
        path (str): The path to the image file.
        names (dict): A dictionary of class names.
        boxes (torch.tensor, optional): A 2D tensor of bounding box coordinates for each detection.
        masks (torch.tensor, optional): A 3D tensor of detection masks, where each mask is a binary image.
        probs (torch.tensor, optional): A 1D tensor of probabilities of each class for classification task.
        keypoints (List[List[float]], optional): A list of detected keypoints for each object.

    Attributes:
        orig_img (numpy.ndarray): The original image as a numpy array.
        orig_shape (tuple): The original image shape in (height, width) format.
        boxes (Boxes, optional): A Boxes object containing the detection bounding boxes.
        masks (Masks, optional): A Masks object containing the detection masks.
        probs (Probs, optional): A Probs object containing probabilities of each class for classification task.
        keypoints (Keypoints, optional): A Keypoints object containing detected keypoints for each object.
        speed (dict): A dictionary of preprocess, inference, and postprocess speeds in milliseconds per image.
        names (dict): A dictionary of class names.
        path (str): The path to the image file.
        _keys (tuple): A tuple of attribute names for non-empty attributes.

glenn-jocher commented 11 months ago

@DylDevs I see, it seems there might be some confusion in accessing the detection results. To access the required information, you can directly access the attributes of the results object. Here's an example of how you can access the data you need:

# Accessing bounding box coordinates, class names, and confidence scores
names = results.names
boxes = results.boxes.tensor
confidence = results.pred[0][:, 4]  # 0th index for the first image, and 4th index for confidence score

Hope this helps! Let me know if you have further questions.

DylDevs commented 11 months ago

Now I get this:

    names = results.names
            ^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'names'

DylDevs commented 11 months ago

After doing research, I have it down to this, which works:

    names = results[0].names
    base_boxes = results[0].boxes.xyxy.cpu()
    boxes = str(base_boxes).replace("tensor", "")

The only problem I have now is that I don't know what each of the numbers corresponds to:

([[ 92.4750, 134.5188, 311.4937, 233.0000]])

glenn-jocher commented 11 months ago

@DylDevs Glad to hear that you've made progress! The output you're seeing represents the coordinates of the bounding box in the format (x_min, y_min, x_max, y_max). These values correspond to the top-left and bottom-right coordinates of the bounding box in the image. If you need further assistance in interpreting or using this information, feel free to ask!
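
Putting the thread together, a minimal extraction loop might look like this (a sketch against the ultralytics Results API; calling .tolist() on the tensors avoids string hacks like str(...).replace("tensor", "")):

for box in results[0].boxes:
    x_min, y_min, x_max, y_max = box.xyxy[0].tolist()  # top-left and bottom-right corners
    class_name = results[0].names[int(box.cls)]        # e.g. 'car', 'truck', 'bus'
    confidence = float(box.conf)                       # detection confidence
    print(f"{class_name} {confidence:.2f}: ({x_min:.1f}, {y_min:.1f}) to ({x_max:.1f}, {y_max:.1f})")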

DylDevs commented 11 months ago

Thanks for all the help, I was able to extract all of the data that I needed.

glenn-jocher commented 11 months ago

@DylDevs you're welcome! Glad to hear that you were able to extract the necessary data. If you have any more questions or need further assistance in the future, feel free to ask. Good luck with your project!