ultralytics / ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

MOTA and MOTP calculation #8252

Closed Shuaib11-Github closed 5 months ago

Shuaib11-Github commented 7 months ago

Question

For a single frame, my tracking output looks like this:

Object IDs: tensor([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23.])
Bounding Boxes: tensor([[1746.5852, 532.9240, 1919.6624, 752.6738],
        [ 671.1702, 433.9525, 746.4346, 495.2292],
        [ 631.0151, 388.8817, 681.9429, 427.3249],
        [ 750.9922, 409.3558, 809.0884, 456.6566],
        [1128.0063, 430.8089, 1202.2113, 484.1173],
        [1794.8990, 446.9329, 1918.4865, 507.2599],
        [1163.6973, 349.8423, 1202.3220, 381.4899],
        [1065.2070, 413.3709, 1131.9819, 464.2481],
        [ 852.6096, 404.4958, 911.0833, 446.3598],
        [1215.1777, 381.8876, 1265.1753, 421.5816],
        [1368.8850, 399.9973, 1424.5498, 440.3634],
        [ 957.9658, 404.4436, 1029.3635, 465.9520],
        [1518.6152, 334.2286, 1562.8381, 368.8765],
        [1609.2129, 333.2341, 1690.6794, 460.3948],
        [1027.0447, 340.4250, 1060.4711, 368.0705],
        [ 609.4783, 363.4427, 644.3919, 386.9360],
        [1131.8442, 349.2545, 1160.5519, 374.2233],
        [ 972.8117, 342.3387, 1005.2701, 366.1255],
        [ 908.2639, 302.0847, 956.2980, 336.6221],
        [1579.6724, 410.7565, 1613.2292, 459.0648],
        [1202.4021, 317.7359, 1228.8746, 340.8892],
        [ 761.5058, 364.0521, 793.8737, 388.1438],
        [ 842.4190, 361.6234, 886.5723, 391.3627]])

The ground truth file has one row per object, with the columns frame_number, object_id, and bounding box, for example:

41 34710 952.0 0.0 79.0 14.0

I have 300 such frames, for which I need to calculate the IoU matrix and the MOTA and MOTP metrics. I am having difficulty doing this and don't know how to calculate MOTA and MOTP for the above. Here is my code:

from ultralytics import YOLO
import motmetrics as mm
import os
import numpy as np

def parse_ground_truth(file_path):
    ground_truth = {}
    with open(file_path, 'r') as file:
        for line in file:
            parts = line.strip().split()
            frame_number, obj_id, x_min, y_min, x_max, y_max = int(parts[0]), int(parts[1]), float(parts[2]), float(parts[3]), float(parts[4]), float(parts[5])
            bbox = (x_min, y_min, x_max, y_max)

            if frame_number not in ground_truth:
                ground_truth[frame_number] = []
            ground_truth[frame_number].append((obj_id, bbox))
    return ground_truth

# Load the YOLO model
model = YOLO('/home/rmarri/speed-estimation/bdd-yolo-finetune/runs/detect/train24/weights/best.pt')

# Directories
video_dir = "/home/rmarri/speed-estimation/bdd-yolo-finetune/60fpsVidoes"  # Update this path
ground_truth_dir = "/home/rmarri/speed-estimation/bdd-yolo-finetune/60fpsvideolabels"  # Update this path

# Initialize MOT metrics
mh = mm.metrics.create()
print("MOT initialized")

def calculate_iou(boxA, boxB):
    # Calculate the coordinates of the intersection rectangle
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])

    # Calculate the area of intersection
    interArea = max(0, xB - xA) * max(0, yB - yA)

    # Calculate the area of both bounding boxes
    boxAArea = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
    boxBArea = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])

    # Calculate the area of union
    unionArea = boxAArea + boxBArea - interArea

    # Compute IoU
    iou = interArea / float(unionArea + 1e-6)  # Adding epsilon to avoid division by zero

    return iou

for video_file in os.listdir(video_dir):
    if not video_file.endswith(".MOV"):
        continue  # Skip non-video files

    video_path = os.path.join(video_dir, video_file)
    ground_truth_path = os.path.join(ground_truth_dir, video_file.replace(".MOV", ".txt"))

    if not os.path.exists(ground_truth_path):
        print(f"Skipping {video_file}: ground truth file not found.")
        continue

    print(f"Processing {video_file}...")
    results = model.track(source=video_path, show=True, tracker='botsort.yaml')

    ground_truth_data = parse_ground_truth(ground_truth_path)
    acc = mm.MOTAccumulator(auto_id=True)

    for frame_number, result in enumerate(results, start=1):  # Adjust starting frame number as necessary
        if frame_number not in ground_truth_data:
            continue  # Skip this frame if no ground truth data

        # Detection data
        det_boxes = result.boxes.xyxy.cpu().numpy()  # Assuming this is the correct way to access your detection bounding boxes
        det_ids = result.boxes.id  # Assuming this is the correct way to access your detection IDs
        det_ids_np = np.array(det_ids)
        # Ground truth data for the current frame
        gt_ids, gt_boxes = zip(*ground_truth_data[frame_number])
        gt_boxes_np = np.array(gt_boxes)
        print('GT IDs:', gt_ids)
        print('DET IDs:', det_ids_np)
        print('GT Boxes:', gt_boxes_np)
        print('DET Boxes:', det_boxes)

        # # Example usage
        # gt_boxes = np.array([[961.94, 0, 80.45, 14.258]])
        # det_boxes = np.array([[0, 0, 1386.8, 1066.7], [18.699, 453.94, 73.065, 504.86], [75.047, 454.59, 134.77, 506.24]])

        iou_matrix = np.zeros((len(det_boxes), len(gt_boxes)))

        for i, det_box in enumerate(det_boxes):
            for j, gt_box in enumerate(gt_boxes):
                iou_matrix[i, j] = calculate_iou(det_box, gt_box)

        print("IoU Matrix:\n", iou_matrix)

        # Calculate IoU matrix
        # iou_matrix = mm.distances.iou_matrix(gt_boxes_np, det_boxes, max_iou=0.5)  # Adjust max_iou as needed
        # print("IOU_Matrix", iou_matrix)

        # Update accumulator with frame's data
        acc.update(
            gt_ids,
            det_ids,
            iou_matrix
        )

# Compute metrics after processing all frames
mh = mm.metrics.create()
summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
print(summary)

Additional

My ground truth data has the columns frame_no, object_id, and bounding box. It looks like this:

41 1 952.0 0.0 79.0 14.0
42 1 954.7936639291984 0.0 83.82589528209907 14.867768595041323
44 1 961.9401867331467 0.0 80.44983453823076 14.258238809086198
50 1 969.075503083659 0.0 83.46678106945829 14.803607257496001
51 1 966.6347111485035 0.0 92.04233446195347 16.424031424653666
52 1 965.5744302572999 0.0 94.57271250223188 16.980587070598872
53 1 964.526253119986 0.0 98.34157491788552 17.83726246673933

This is the output I am getting:

GT IDs: (334710,) # groundtruth
DET IDs: [ 1] # detection
GT Boxes: [[ 952 0 79 14]]
DET Boxes: [[ 0 2.9919 1295.9 1076.3]]
IoU Matrix: [[ 0]]
GT IDs: (334710,)
DET IDs: [ 1]
GT Boxes: [[ 954.79 0 83.826 14.868]]
DET Boxes: [[ 0 2.0868 1276.7 1073.6]]
IoU Matrix: [[ 0]]
GT IDs: (334710,)
DET IDs: None
GT Boxes: [[ 961.94 0 80.45 14.258]]
DET Boxes: [[ 0 0 1386.8 1066.7] [ 18.699 453.94 73.065 504.86] [ 75.047 454.59 134.77 506.24]]
IoU Matrix: [[ 0] [ -0] [ -0]]
Traceback (most recent call last):
  File "botsort2.py", line 108, in <module>
    acc.update(
  File "/home/rmarri/anaconda3/envs/yoloft/lib/python3.8/site-packages/motmetrics/mot.py", line 181, in update
    dists = np.atleast_2d(dists).astype(float).reshape(oids.shape[0], hids.shape[0]).copy()
IndexError: tuple index out of range

I also don't know how to get the frame number for the tracking output, and for some frames the object IDs are None. I want to calculate the MOTA and MOTP metrics between the ground truth and the tracking output.
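One possible reading of the zero IoU values above: the ground truth rows look like (x, y, width, height) rather than corner coordinates, since the third value (79.0) is smaller than the first (952.0), while `result.boxes.xyxy` yields corners. That is an assumption about the label format; the thread does not confirm it. A minimal sketch of converting before computing IoU, where `xywh_to_xyxy` is a hypothetical helper and the numbers are the frame 41 values printed above:

```python
def xywh_to_xyxy(box):
    """Convert (x_min, y_min, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def calculate_iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


gt_raw = (952.0, 0.0, 79.0, 14.0)    # ground truth row, ASSUMED to be x, y, w, h
det = (0.0, 2.9919, 1295.9, 1076.3)  # tracker box in xyxy

iou_as_is = calculate_iou(det, gt_raw)                    # GT misread as xyxy
iou_converted = calculate_iou(det, xywh_to_xyxy(gt_raw))  # GT converted first
print(iou_as_is, iou_converted)
```

Even after conversion the overlap stays tiny here, because this detection covers almost the whole frame while the ground truth box is small, so the format fix alone would not turn this pair into a match.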

Shuaib11-Github commented 7 months ago

Hello Glenn, can you provide a simple code example showing how to do this? I need to calculate the MOTA and MOTP metrics, but for that the ground truth and the tracking output need to be in the same format.

My ground truth file is a .txt file with frame number, object ID, and bounding box.

The tracking output has object IDs of None for some frames; for the others it has detected objects, with the bounding boxes as tensors, one per detected object. How can we obtain the frame number for the tracking output?

For example, in the 300th frame it detected 23 objects; the Object IDs and Bounding Boxes tensors are the ones shown in my question above. Some other frames have bounding boxes but no object IDs.

I have ground truth data for 300 frames with frame_no, object_id, and bounding box. It looks like this in the .txt file:

41 1 952.0 0.0 79.0 14.0
42 1 954.7936639291984 0.0 83.82589528209907 14.867768595041323
44 1 961.9401867331467 0.0 80.44983453823076 14.258238809086198
50 1 969.075503083659 0.0 83.46678106945829 14.803607257496001
51 1 966.6347111485035 0.0 92.04233446195347 16.424031424653666
52 1 965.5744302572999 0.0 94.57271250223188 16.980587070598872
53 1 964.526253119986 0.0 98.34157491788552 17.83726246673933
....

On Sat, 17 Feb 2024, 1:04 am Glenn Jocher, @.***> wrote:

@Shuaib11-Github hello! It looks like you're on the right track with your tracking and validation efforts. For calculating MOTA and MOTP, you'll need to ensure that your detections and ground truth annotations are correctly paired by frame and object ID. Here's a simplified approach to help you:

  1. Make sure your ground truth and detection data are synchronized by frame number.
  2. Use the motmetrics library to create an accumulator and update it with the IoU matrix for each frame.
  3. After processing all frames, compute the MOTA and MOTP metrics using the accumulator.

Here's a snippet to guide you:

# Initialize MOT metrics accumulator
acc = mm.MOTAccumulator(auto_id=True)

# Loop through each frame
for frame_number in range(start_frame, end_frame):
    # Get ground truth and detection data for the current frame
    gt_data = ground_truth_data.get(frame_number, [])
    det_data = detection_data.get(frame_number, [])

    # Extract IDs and bounding boxes
    gt_ids, gt_boxes = zip(*gt_data) if gt_data else ([], [])
    det_ids, det_boxes = zip(*det_data) if det_data else ([], [])

    # Calculate IoU matrix
    iou_matrix = mm.distances.iou_matrix(gt_boxes, det_boxes, max_iou=0.5)

    # Update the accumulator
    acc.update(
        gt_ids,
        det_ids,
        iou_matrix
    )

# Compute MOTA and MOTP
summary = mh.compute(acc, metrics=['mota', 'motp'], name='acc')
print(summary)

For the IndexError you're encountering, it seems there might be a mismatch in the dimensions of your IDs and bounding boxes. Double-check that the length of gt_ids matches the number of rows in your iou_matrix and that the length of det_ids matches the number of columns.

Regarding the frame number, if you're using model.track(), the frame number is typically the index of the frame in the video or stream. If the object IDs are None for a frame, the tracker assigned no track IDs in that frame, even though boxes may still be present (as in your third printed frame). You can handle this by checking that det_ids is not None before updating the accumulator.

I hope this helps! If you have further questions or run into more issues, feel free to reach out. Good luck with your tracking project! 👍🏼
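The IndexError in the traceback is consistent with `det_ids` being `None` for that frame: `np.asarray(None)` has shape `()`, so indexing `shape[0]` fails. A minimal sketch of normalizing the per-frame tracker output so that a frame without track IDs yields empty lists; `tracks_for_frame` is a hypothetical helper, and plain lists stand in for the tensors (with Ultralytics one would presumably pass `result.boxes.id` and `result.boxes.xyxy` after moving them to the CPU):

```python
def tracks_for_frame(ids, boxes_xyxy):
    """Return (ids, boxes) as plain lists; both empty when no IDs were assigned.

    `ids` may be None even when `boxes_xyxy` is non-empty, as in the
    "DET IDs: None" frame printed in the question.
    """
    if ids is None:
        return [], []
    return [int(i) for i in ids], [tuple(float(v) for v in box) for box in boxes_xyxy]


# A frame with one tracked object, and a frame with boxes but no track IDs.
frame_a = tracks_for_frame([1.0], [[0.0, 2.9919, 1295.9, 1076.3]])
frame_b = tracks_for_frame(None, [[0.0, 0.0, 1386.8, 1066.7]])
print(frame_a)
print(frame_b)
```

Dropping boxes that have no ID is a design choice: those detections then count neither as matches nor as false positives, and every ground truth object in that frame is scored as a miss.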


Shuaib11-Github commented 7 months ago

In some of the frames no object IDs are detected. How do I get rid of these?

Also, the bounding boxes are tensors.

For example, in the 300th frame it detected 23 objects; the Object IDs and Bounding Boxes tensors are the ones shown in my question above.

I have ground truth data for 300 frames with frame_no, object_id, and bounding box. It looks like this in the .txt file:

41 1 952.0 0.0 79.0 14.0
42 1 954.7936639291984 0.0 83.82589528209907 14.867768595041323
44 1 961.9401867331467 0.0 80.44983453823076 14.258238809086198
50 1 969.075503083659 0.0 83.46678106945829 14.803607257496001
51 1 966.6347111485035 0.0 92.04233446195347 16.424031424653666
52 1 965.5744302572999 0.0 94.57271250223188 16.980587070598872
53 1 964.526253119986 0.0 98.34157491788552 17.83726246673933
....

Some frames of the ground truth data have multiple rows, like below:

341, 1, 1412.3, 0.0, 14.0, 93.2
341, 10, 0.0, 0.0, 14.0, 93.2
341, 13, 512.5, 55.6, 212.3, 1312.8
341, 21, 221.9, 80.4, 70.3, 832.4

How do I interpret the above? Everything should be in the same format, and when multiple objects are detected there should be multiple rows, one for each object. For example, the tracking output for the 300th frame above has 23 different objects, so it should have 23 rows, each with one detected object and its bounding box.

Also, how do I map the frame numbers of the ground truth to those of the tracking output, so that the objects detected in a frame can be compared with the ground truth for that frame?

If the ground truth has 3 objects for a particular frame, then the tracking output should have 3 detected objects to compare them with.

Can you help me with code for these?

On Sat, 17 Feb 2024, 10:14 am Glenn Jocher, @.***> wrote:

@Shuaib11-Github to calculate MOTA (Multiple Object Tracking Accuracy) and MOTP (Multiple Object Tracking Precision), you need to compare the tracking results with the ground truth data frame by frame. The MOTA metric evaluates the overall tracking performance, including misses, false positives, and mismatches, while MOTP measures the precision of the object localization.

Here's a step-by-step guide to calculate MOTA and MOTP using the motmetrics library:

  1. Parse the ground truth data and store it in a dictionary with frame numbers as keys and lists of (object_id, bbox) tuples as values.
  2. Run the YOLO model to track objects in the video and store the results.
  3. For each frame, create an IoU (Intersection over Union) matrix that compares each detected bounding box with each ground truth bounding box.
  4. Update the MOTAccumulator with the ground truth IDs, detected IDs, and the IoU matrix for each frame.
  5. After processing all frames, compute the MOTA and MOTP metrics using the MOTAccumulator.

Here's a modified version of your code that should help you calculate MOTA and MOTP:

from ultralytics import YOLO
import motmetrics as mm
import os
import numpy as np

# ... [Your existing code for parse_ground_truth and calculate_iou functions] ...

# Load the YOLO model
model = YOLO('/path/to/your/model.pt')

# Directories
video_dir = "/path/to/your/video/dir"
ground_truth_dir = "/path/to/your/ground_truth/dir"

# Initialize MOT metrics
acc = mm.MOTAccumulator(auto_id=True)

# Process each video
for video_file in os.listdir(video_dir):
    if not video_file.endswith(".MOV"):
        continue  # Skip non-video files

    video_path = os.path.join(video_dir, video_file)
    ground_truth_path = os.path.join(ground_truth_dir, video_file.replace(".MOV", ".txt"))

    if not os.path.exists(ground_truth_path):
        continue  # Skip if ground truth file not found

    ground_truth_data = parse_ground_truth(ground_truth_path)

    # Run tracking
    results = model.track(source=video_path, show=True, tracker='botsort.yaml')

    for frame_number, result in enumerate(results, start=1):
        if frame_number not in ground_truth_data:
            continue  # Skip if no ground truth for frame

        # Get ground truth and detection data for the current frame
        gt_ids, gt_boxes = zip(*ground_truth_data[frame_number])
        if result.boxes.id is None:
            # No track IDs were assigned in this frame
            det_ids, det_boxes = [], []
        else:
            det_ids = result.boxes.id.cpu().numpy()
            det_boxes = result.boxes.xyxy.cpu().numpy()

        # Distance matrix (1 - IoU): rows are ground truth, columns are detections
        dist_matrix = np.zeros((len(gt_boxes), len(det_boxes)), dtype=np.float32)
        for g, gt in enumerate(gt_boxes):
            for d, det in enumerate(det_boxes):
                dist_matrix[g, d] = 1 - calculate_iou(gt, det)

        # Update the accumulator
        acc.update(
            gt_ids,
            det_ids,
            dist_matrix
        )

# Compute MOTA and MOTP
summary = mm.metrics.create().compute(acc, metrics=['mota', 'motp'], name='acc')
print(summary)

Please note the following:

  • Ensure that the calculate_iou function is correctly implemented.
  • The track method should return results with frame numbers, object IDs, and bounding boxes.
  • The MOTAccumulator is updated with the ground truth IDs, detected IDs, and the IoU matrix for each frame.
  • After processing all frames, the compute method calculates the MOTA and MOTP metrics.

Make sure to adjust the paths and model file according to your setup. The code assumes that the track method returns results in the same order as the frames in the video. If the tracking results do not include frame numbers, you may need to modify the code to handle this.
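As far as the motmetrics documentation describes it, `MOTAccumulator.update` treats its third argument as a distance matrix (lower is better) with one row per ground truth object and one column per hypothesis, and NaN marks pairs that must not be matched. A minimal sketch of building such a matrix from corner-format boxes, assuming a 0.5 IoU gate; `iou_xyxy` and `distance_matrix` are hypothetical names, and nested lists are used so the example runs without NumPy:

```python
import math


def iou_xyxy(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def distance_matrix(gt_boxes, det_boxes, min_iou=0.5):
    """Rows are ground truth boxes, columns are detections.

    Each entry is 1 - IoU, or NaN when the IoU is below `min_iou`.
    """
    rows = []
    for gt in gt_boxes:
        row = []
        for det in det_boxes:
            iou = iou_xyxy(gt, det)
            row.append(1.0 - iou if iou >= min_iou else math.nan)
        rows.append(row)
    return rows


gt_boxes = [(0, 0, 10, 10)]
det_boxes = [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 10, 20)]
dists = distance_matrix(gt_boxes, det_boxes)
print(dists)
```

With one ground truth box and three detections the result has shape 1 x 3, which is the orientation expected alongside `gt_ids` of length 1 and `det_ids` of length 3.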


Shuaib11-Github commented 7 months ago

Will this work if there is a frame in the ground truth for which tracking detected nothing? For example, frame 44 is in my ground truth but has no tracking output. Will the code you provided work for that? If not, can you suggest how to deal with it? Thanks for the help.

On Sat, 17 Feb 2024, 7:03 pm Glenn Jocher, @.***> wrote:

@Shuaib11-Github to calculate MOTA (Multiple Object Tracking Accuracy) and MOTP (Multiple Object Tracking Precision), you need to compare the tracking results with the ground truth data frame by frame. The MOTA metric evaluates the overall tracking performance, considering false positives, false negatives, and identity switches, while MOTP measures the precision of the object localization.

Here's a step-by-step guide to calculate MOTA and MOTP using your tracking results and ground truth data:

  1. Parse Ground Truth Data: You've already implemented the parse_ground_truth function to read the ground truth data from a file and organize it by frame number.
  2. Run Tracking: You're using the model.track method to perform tracking on the video. Ensure that the tracking results include frame numbers, object IDs, and bounding boxes.
  3. Calculate IoU Matrix: For each frame, calculate the Intersection over Union (IoU) between each detected bounding box and each ground truth bounding box. You've implemented the calculate_iou function for this purpose.
  4. Update MOT Metrics: Use the mm.MOTAccumulator to accumulate the tracking results and ground truth data. For each frame, update the accumulator with the ground truth IDs, detected IDs, and the IoU matrix.
  5. Compute MOTA and MOTP: After processing all frames, compute the MOTA and MOTP metrics using the mm.metrics.create and mh.compute functions.

Here's a modified version of your code that includes the necessary changes:

import motmetrics as mm
import numpy as np
from ultralytics import YOLO

# Load the YOLO model
model = YOLO('path/to/your/model.pt')

# Initialize MOT metrics
mh = mm.metrics.create()
acc = mm.MOTAccumulator(auto_id=True)

# Your ground truth parsing and tracking code here...

# For each frame in the video
for frame_number in range(1, total_frames + 1):
    # Get tracking results for the current frame
    track_results = ...  # Retrieve tracking results for the current frame

    # Get ground truth data for the current frame
    gt_data = ground_truth_data.get(frame_number, [])

    # Extract ground truth IDs and boxes
    gt_ids, gt_boxes = zip(*gt_data) if gt_data else ([], [])

    # Extract tracking IDs and boxes (boxes.id is None when no IDs were assigned)
    if track_results.boxes.id is None:
        track_ids, track_boxes = [], []
    else:
        track_ids = track_results.boxes.id.cpu().numpy()
        track_boxes = track_results.boxes.xyxy.cpu().numpy()
        # mm.distances.iou_matrix expects (x, y, width, height) rows, so convert
        # the tracker's xyxy boxes; gt_boxes must be in that format as well
        track_boxes[:, 2:] -= track_boxes[:, :2]

    # Calculate IoU matrix
    iou_matrix = mm.distances.iou_matrix(gt_boxes, track_boxes, max_iou=0.5)

    # Update the accumulator
    acc.update(
        gt_ids,
        track_ids,
        iou_matrix
    )

# Compute MOTA and MOTP
summary = mh.compute(acc, metrics=['mota', 'motp'], name='acc')
print(summary)

Please note that you need to replace the ... with the actual code to retrieve tracking results for the current frame. Also, ensure that total_frames is set to the total number of frames in the video.

Regarding the IndexError you're encountering, it seems that there's a mismatch in the dimensions of the IDs and the IoU matrix. Make sure that the number of rows in the IoU matrix corresponds to the number of ground truth IDs and the number of columns corresponds to the number of detected IDs.

Lastly, to include frame numbers in the tracking results, you may need to modify the tracking code to output frame numbers alongside the object IDs and bounding boxes. If the tracking results do not include frame numbers, you might need to synchronize the tracking results with the video frames manually.
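On the question of frames such as 44 that exist in the ground truth but not in the tracking output: one way to handle them is to iterate over the union of frame numbers from both sources, so that a frame missing on either side contributes an empty list instead of being skipped. A minimal sketch, with `group_by_frame` and `frames_to_evaluate` as hypothetical helpers and rows shaped like the (frame, object_id, box) samples in this thread:

```python
from collections import defaultdict


def group_by_frame(rows):
    """Group (frame, obj_id, box) rows into {frame: [(obj_id, box), ...]}."""
    grouped = defaultdict(list)
    for frame, obj_id, box in rows:
        grouped[frame].append((obj_id, box))
    return dict(grouped)


def frames_to_evaluate(gt_by_frame, trk_by_frame):
    """Yield (frame, gt_ids, trk_ids) for every frame present in either source."""
    for frame in sorted(set(gt_by_frame) | set(trk_by_frame)):
        gt_ids = [obj_id for obj_id, _ in gt_by_frame.get(frame, [])]
        trk_ids = [obj_id for obj_id, _ in trk_by_frame.get(frame, [])]
        yield frame, gt_ids, trk_ids


gt_rows = [(41, 1, (952.0, 0.0, 79.0, 14.0)),
           (44, 1, (961.94, 0.0, 80.45, 14.26))]
trk_rows = [(41, 1, (0.0, 2.99, 1295.93, 1076.29)),
            (50, 1, (0.0, 0.0, 1292.53, 1056.47))]

schedule = list(frames_to_evaluate(group_by_frame(gt_rows), group_by_frame(trk_rows)))
print(schedule)
```

Frame 44 then reaches the accumulator with ground truth IDs and no hypotheses (scored as misses), and frame 50 with hypotheses and no ground truth (scored as false positives). Whether the first video frame should be numbered 0 or 1 depends on how the labels were produced, which the thread does not state.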


Shuaib11-Github commented 7 months ago

This is a sample of the tracking data:

41 1 0.0 2.991943359375 1295.934814453125 1076.286376953125
42 1 0.0 2.0867919921875 1276.67138671875 1073.5755615234375
50 1 0.0 0.0 1292.53076171875 1056.46826171875
51 1 0.0 2.98565673828125 1244.7637939453125 1070.510986328125
53 1 0.0 0.72564697265625 1303.447509765625 1056.89599609375
59 1 0.0 1.16705322265625 1381.504638671875 1073.988525390625
66 1 0.066650390625 1.24310302734375 1389.87744140625 1074.4794921875
66 2 1051.4501953125 0.0 1136.028564453125 24.507951736450195
70 1 0.0 2.107666015625 1404.94970703125 1079.3447265625
71 1 0.195556640625 0.25830078125 1392.21337890625 1077.80224609375

And this is the ground truth data:

41 1 952.0 0.0 79.0 14.0
42 1 954.7936639291984 0.0 83.82589528209907 14.867768595041323
44 1 961.9401867331467 0.0 80.44983453823076 14.258238809086198
50 1 969.075503083659 0.0 83.46678106945829 14.803607257496001
51 1 966.6347111485035 0.0 92.04233446195347 16.424031424653666
52 1 965.5744302572999 0.0 94.57271250223188 16.980587070598872
53 1 964.526253119986 0.0 98.34157491788552 17.83726246673933
54 1 965.7501778364752 0.0 102.20444662465452 18.800184061594656
55 1 969.1053137744819 0.0 102.91859502372226 19.153099817791407
56 1 970.5560915076899 0.0 105.29096060241983 19.91337068162944
57 1 972.0155565259095 0.0 108.09013071205162 20.83284596868358
58 1 973.2535988744531 0.0 110.86650354923741 21.8132671569829
59 1 976.0770694778238 0.0 107.74601715697568 21.544701522440825
65 2 1043.6441831683169 0.0 95.11163366336633 23.8

The tracking is not detecting the objects properly (the bounding box values are completely off compared to the ground truth), and some frames have no detections at all. How do I deal with this?

On Mon, Feb 19, 2024 at 5:30 AM Glenn Jocher @.***> wrote:

@Shuaib11-Github to calculate MOTA (Multiple Object Tracking Accuracy) and MOTP (Multiple Object Tracking Precision), you need to compare the ground truth data with the tracking results. The IoU (Intersection over Union) matrix is used to determine the overlap between predicted bounding boxes and ground truth bounding boxes. Based on this overlap, you can determine matches, mismatches, and missed detections, which are then used to calculate MOTA and MOTP.

Here's a step-by-step guide to calculate MOTA and MOTP:

  1. Parse the ground truth data and organize it by frame number.
  2. For each frame, get the predicted bounding boxes and object IDs from the tracking results.
  3. Calculate the IoU matrix between the predicted bounding boxes and the ground truth bounding boxes.
  4. Use the IoU matrix to determine matches (correct detections), mismatches (identity switches), and missed detections (false negatives).
  5. Accumulate these counts over all frames to calculate MOTA and MOTP.
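To make the last step concrete, here is the arithmetic behind the two metrics as commonly defined (CLEAR MOT): MOTA is 1 minus the sum of misses, false positives, and identity switches divided by the total number of ground truth objects, and MOTP is the summed distance of matched pairs divided by the number of matches. The function names and the example counts below are illustrative only, not values from this thread:

```python
def mota(num_misses, num_false_positives, num_switches, num_gt_objects):
    """Multiple Object Tracking Accuracy from counts summed over all frames."""
    if num_gt_objects == 0:
        return float("nan")
    errors = num_misses + num_false_positives + num_switches
    return 1.0 - errors / num_gt_objects


def motp(total_match_distance, num_matches):
    """Average distance over matched pairs; NaN when nothing was matched."""
    if num_matches == 0:
        return float("nan")
    return total_match_distance / num_matches


good = mota(num_misses=2, num_false_positives=1, num_switches=1, num_gt_objects=10)
bad = mota(num_misses=3, num_false_positives=4, num_switches=0, num_gt_objects=3)
no_matches = motp(total_match_distance=0.0, num_matches=0)
print(good, bad, no_matches)
```

MOTA has no lower bound: once the errors outnumber the ground truth objects it goes negative. When the distance is 1 - IoU, a smaller MOTP means tighter localization, and a run without any matches yields NaN.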

Your current code seems to be on the right track, but there are a few issues that need to be addressed:

  • The ground truth parsing function assumes a specific format for the ground truth file. Make sure the format matches your actual ground truth data.
  • The IoU calculation function seems correct, but you need to ensure that the bounding box coordinates are in the correct format (either xyxy or xywh).
  • The tracking results need to be organized by frame number, which is currently missing from your code.
  • The accumulator update function requires the correct number of detections and ground truth objects for each frame. Make sure the lengths of gt_ids and det_ids match the dimensions of the iou_matrix.
  • Handle cases where det_ids is None by skipping the update for that frame or by providing an empty list of detections.

Here's a modified version of your code that addresses some of these issues:

... [rest of your code] ...

for frame_number in range(1, max(ground_truth_data.keys()) + 1):
    if frame_number not in ground_truth_data:
        continue  # Skip this frame if no ground truth data

    # Get the tracking results for the current frame
    # Note: You need to modify your tracking code to provide results by frame number
    tracking_results = get_tracking_results_for_frame(frame_number)  # Implement this function

    if tracking_results is None:
        det_ids = []
        det_boxes = []
    else:
        det_ids = tracking_results['object_ids'].tolist()
        det_boxes = tracking_results['bounding_boxes'].tolist()

    # Ground truth data for the current frame
    gt_ids, gt_boxes = zip(*ground_truth_data[frame_number])

    # Distance matrix (1 - IoU): rows are ground truth, columns are detections
    dist_matrix = np.zeros((len(gt_boxes), len(det_boxes)))
    for i, gt_box in enumerate(gt_boxes):
        for j, det_box in enumerate(det_boxes):
            dist_matrix[i, j] = 1 - calculate_iou(gt_box, det_box)

    # Update accumulator with frame's data
    acc.update(
        gt_ids,
        det_ids,
        dist_matrix
    )

# Compute metrics after processing all frames
summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
strsummary = mm.io.render_summary(
    summary,
    formatters=mh.formatters,
    namemap=mm.io.motchallenge_metric_names
)
print(strsummary)

Please note that you need to implement the get_tracking_results_for_frame function to retrieve the tracking results for each frame number. This function should return a dictionary with object_ids and bounding_boxes for the detections in that frame.

Additionally, you need to handle cases where there are no detections (det_ids is None) by providing an empty list of detections to the accumulator update function.

Lastly, the mm.metrics.create() function should be called outside the loop, and the accumulator should be updated within the loop for each frame. After processing all frames, you can compute the summary of metrics using mh.compute().


glenn-jocher commented 7 months ago

@Shuaib11-Github hello! It seems like you're facing a couple of challenges with tracking and ground truth alignment. Let's tackle them one by one.

  1. Frames with No Object IDs: If a frame from the tracking data has no object IDs, it means no objects were detected in that frame. For MOTA and MOTP calculations, you can treat these as missed detections. Simply pass an empty list for detections when updating the accumulator for that frame.

  2. Tensor Bounding Boxes: To work with bounding boxes in tensors, you can convert them to a list of tuples or a NumPy array, which might be easier to handle for IoU calculations and comparisons with ground truth data.

  3. Ground Truth and Tracking Format: To ensure both ground truth and tracking data are in the same format, you can standardize the representation of bounding boxes (e.g., as (x_min, y_min, x_max, y_max) tuples) and ensure that each detected object has a corresponding row in your data structure.

  4. Mapping Frame Numbers: The frame number in tracking should correspond to the frame number in the ground truth. If your tracking results don't include frame numbers, you might need to infer them based on the order of the frames processed or modify the tracking output to include frame numbers.

  5. Handling Multiple Objects: If there are multiple objects in a frame, both the ground truth and tracking data should reflect this by having multiple entries (rows) for that frame, one for each object.

Here's a simplified code snippet to help you align the formats and update the accumulator:

# Assuming `tracking_results` is a list of tuples (frame_number, object_id, bbox)
# and `ground_truth_data` is a dictionary {frame_number: [(object_id, bbox), ...]}

for frame_number in range(1, max_frame_number + 1):
    gt_data = ground_truth_data.get(frame_number, [])
    track_data = [data for data in tracking_results if data[0] == frame_number]

    gt_ids, gt_boxes = zip(*gt_data) if gt_data else ([], [])
    track_ids, track_boxes = zip(*[(obj_id, bbox) for _, obj_id, bbox in track_data]) if track_data else ([], [])

    # Convert tensors to lists or numpy arrays if needed
    # track_boxes = [bbox.numpy().tolist() for bbox in track_boxes]

    # Distance matrix (1 - IoU): rows are ground truth, columns are tracks
    dist_matrix = np.zeros((len(gt_boxes), len(track_boxes)))
    for i, gt_box in enumerate(gt_boxes):
        for j, track_box in enumerate(track_boxes):
            dist_matrix[i, j] = 1 - calculate_iou(gt_box, track_box)

    acc.update(gt_ids, track_ids, dist_matrix)

# After all frames are processed
summary = mh.compute(acc, metrics=['mota', 'motp'], name='acc')
print(summary)

Remember to replace max_frame_number with the actual number of frames. This code assumes that tracking_results is already sorted by frame number. If not, you'll need to sort it beforehand.

For frames with no detections, the track_data will be empty, and the code will correctly handle this by passing empty lists to the accumulator.

I hope this helps! If you have further questions or need more assistance, feel free to ask. Good luck with your tracking project! 😊👍

Shuaib11-Github commented 7 months ago

This is the ground truth sample data

0 1 658.0 241.0 64.0 45.0
0 2 1055.0 257.0 71.0 35.0
0 3 451.0 222.0 35.0 28.0
0 4 489.0 225.0 61.0 52.0
1 1 657.9257819033365 239.26446280991738 67.61951057349208 47.603305785123965
1 2 1054.1518392577077 257.0 70.9607842945021 35.0
1 3 452.7159293373337 222.0 35.03921570549788 28.0
1 4 490.29184896618744 225.0 61.01960785274895 52.0
2 1 662.7821964696187 240.4863153583319 66.11612333261506 46.531466033110476
2 2 1056.3491143214103 256.220444646873 74.06453146990663 36.55911070625409
2 3 456.91152617461074 222.77955535312705 35.03808801913509 28.0
2 4 494.46351441076035 226.55911070625407 60.13815948443718 51.220444646872956

This is sample detection data

0 1 658.685302734375 242.7551727294922 720.6555786132812 285.99871826171875
0 2 452.51470947265625 221.4606170654297 485.09136962890625 249.34950256347656
0 3 1054.148681640625 256.66436767578125 1127.8323974609375 294.7427673339844
0 4 488.4786376953125 229.06192016601562 549.235107421875 277.170654296875
0 5 935.2161865234375 228.68618774414062 962.6964721679688 249.05557250976562
0 6 1078.255859375 210.36068725585938 1102.8758544921875 230.90438842773438
1 1 661.2457275390625 242.74179077148438 722.9550170898438 286.27813720703125
1 2 454.080810546875 220.898681640625 487.691650390625 248.28411865234375
1 3 1054.4012451171875 256.61627197265625 1128.2806396484375 295.074951171875
1 4 490.3866271972656 226.88665771484375 552.0838623046875 277.290283203125
1 5 936.8258056640625 228.39578247070312 965.3074951171875 248.2382354736328
1 6 1080.010986328125 209.59768676757812 1104.0716552734375 230.4517059326172

This is the output I am getting

GT IDs: [13 27 31 34 36 37 32 38 42 41]
DET IDs: [ 1 2 3 4 5 6 7 8 9 10]
GT Boxes:
[[ 1515 235.87 88.914 49.172]
 [ 1159.4 237.03 138.48 94.7]
 [ 892.17 209.87 124.97 98.816]
 [ 1100.2 207.86 38.014 28.074]
 [ 819.38 195.87 47.781 39.065]
 [ 1234.3 219.63 25.317 20.305]
 [ 706.07 186.88 47.187 38.684]
 [ 791.14 190.13 39.562 35.403]
 [ 765.24 189.89 31.48 28.785]
 [ 559.12 182.93 32.314 21.933]]
DET Boxes:
[[ 1159.7 237.84 1297.2 332.96]
 [ 893.85 211.31 1015.1 308.16]
 [ 708.43 187.3 754.12 225.42]
 [ 820.68 197.89 866.89 235.89]
 [ 790.3 188.71 832.51 225.82]
 [ 559.18 183.18 589.93 204.91]
 [ 1513 236.48 1589.7 284.97]
 [ 1100.1 207.91 1137.6 235.61]
 [ 766.1 188.33 796.53 217.45]
 [ 1163.9 189.08 1192.6 210.6]]
IOU_Matrix
[[ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]
 [ nan nan nan nan nan nan nan nan nan nan]]

     idf1  idp  idr  recall  precision  num_unique_objects  ...  num_fragmentations      mota  motp  num_transfer  num_ascend  num_migrate
acc   0.0  0.0  0.0     0.0        0.0                  43  ...                   0 -1.321092   NaN             0           0            0

[1 rows x 18 columns]

Why is the iou_matrix filled with NaN, and why is MOTP NaN? Can you help me figure this out?

Here is my complete code. Please go through it and suggest the changes needed to make it work.

from ultralytics import YOLO
import motmetrics as mm
import os
import numpy as np

def parse_ground_truth(file_path):
    ground_truth = {}
    with open(file_path, 'r') as file:
        for line in file:
            parts = line.strip().split()
            frame_number, obj_id, x_min, y_min, x_max, y_max = int(parts[0]), int(parts[1]), float(parts[2]), float(parts[3]), float(parts[4]), float(parts[5])
            bbox = (x_min, y_min, x_max, y_max)

            if frame_number not in ground_truth:
                ground_truth[frame_number] = []
            ground_truth[frame_number].append((obj_id, bbox))
    return ground_truth

# Load the YOLO model
model = YOLO('/home/rmarri/speed-estimation/bdd-yolo-finetune/runs/detect/train24/weights/best.pt')

# Directories
video_dir = "/path/to/video_dir"  # Update this path
ground_truth_dir = "/path/to/ground_truth.txt"  # Update this path
tracking_path_dir = "/path/to/detection_results.txt"

# Initialize MOT metrics
mh = mm.metrics.create()
print("MOT initialized")

for video_file in os.listdir(video_dir):
    if not video_file.endswith(".MOV"):
        continue  # Skip non-video files

    video_path = os.path.join(video_dir, video_file)
    # ground_truth_path = os.path.join(ground_truth_dir, video_file.replace(".MOV", ".txt"))
    ground_truth_path = os.path.join(ground_truth_dir)
    tracking_path = os.path.join(tracking_path_dir)

    if not os.path.exists(ground_truth_path):
        print(f"Skipping {video_file}: ground truth file not found.")
        continue

    print(f"Processing {video_file}...")
    results = model.track(source=video_path, show=True, tracker='botsort.yaml')

    ground_truth_data = parse_ground_truth(ground_truth_path)
    acc = mm.MOTAccumulator(auto_id=True)

    for frame_number, result in enumerate(results, start=0):  # Adjust starting frame number as necessary
        if frame_number not in ground_truth_data:
            continue  # Skip this frame if no ground truth data

        # Detection data
        det_boxes = result.boxes.xyxy.cpu().numpy()  # Assuming this is the correct way to access your detection bounding boxes
        det_ids = result.boxes.id  # Assuming this is the correct way to access your detection IDs
        det_ids_np = np.array(det_ids)
        det_ids = det_ids_np.flatten()

        # Ground truth data for the current frame
        gt_ids, gt_boxes = zip(*ground_truth_data[frame_number])
        gt_boxes_np = np.array(gt_boxes)
        gt_ids = np.array(gt_ids).flatten()

        print('GT IDs:', gt_ids)
        print('DET IDs:', det_ids)
        print('GT Boxes:', gt_boxes_np)
        print('DET Boxes:', det_boxes)

        # Calculate IoU matrix
        iou_matrix = mm.distances.iou_matrix(gt_boxes, det_boxes, max_iou=0.5)  # Adjust max_iou as needed
        print("IOU_Matrix", iou_matrix)

        # Update accumulator with frame's data
        acc.update(
            gt_ids,
            det_ids,
            iou_matrix
        )

# Compute metrics after processing all frames
mh = mm.metrics.create()
summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
print(summary)


glenn-jocher commented 7 months ago

@Shuaib11-Github hey there! It looks like you're encountering NaN values in your IoU matrix. mm.distances.iou_matrix returns 1 - IoU and sets every pair whose distance exceeds max_iou to NaN, so an all-NaN matrix means that no predicted box overlaps any ground truth box closely enough. That usually points to a mismatch in the bounding box formats.

To address this, ensure that both ground truth and detection bounding boxes are in the same format. Note that mm.distances.iou_matrix reads boxes as (x, y, width, height); if you compute IoU yourself, (x_min, y_min, x_max, y_max) works just as well as long as both sides use it. Also, verify that the coordinates are within the image dimensions and that the ground truth and detection data are correctly aligned frame-wise.
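To see how a format mismatch alone can produce NaN, the sketch below mimics an IoU-distance computation on (x, y, w, h) boxes with a `max_iou` cutoff, using plain NumPy. `iou_distance_xywh` is a hypothetical stand-in written for illustration, not the library function, and the two boxes are the first ground-truth and detection rows posted above (rounded), assuming the ground truth is x, y, w, h and the tracker output is x1, y1, x2, y2.

```python
import numpy as np


def iou_distance_xywh(gt_boxes, det_boxes, max_iou=0.5):
    """1 - IoU for boxes given as (x, y, w, h); entries above max_iou become NaN."""
    gt = np.asarray(gt_boxes, dtype=float).reshape(-1, 4)
    det = np.asarray(det_boxes, dtype=float).reshape(-1, 4)
    dist = np.empty((len(gt), len(det)))
    for i, (gx, gy, gw, gh) in enumerate(gt):
        for j, (dx, dy, dw, dh) in enumerate(det):
            iw = max(0.0, min(gx + gw, dx + dw) - max(gx, dx))
            ih = max(0.0, min(gy + gh, dy + dh) - max(gy, dy))
            inter = iw * ih
            union = gw * gh + dw * dh - inter
            dist[i, j] = 1.0 - inter / union if union > 0 else 1.0
    dist[dist > max_iou] = np.nan  # NaN means "do not pair"
    return dist


gt_xywh = [[658.0, 241.0, 64.0, 45.0]]                    # ground truth: x, y, w, h
det_xyxy = np.array([[658.69, 242.76, 720.66, 286.00]])   # tracker: x1, y1, x2, y2

# Reading xyxy detections as xywh inflates their size, so the distance exceeds max_iou
mismatched = iou_distance_xywh(gt_xywh, det_xyxy)
print(mismatched)

# Convert detections to xywh first: w = x2 - x1, h = y2 - y1
det_xywh = det_xyxy.copy()
det_xywh[:, 2:] -= det_xyxy[:, :2]
aligned = iou_distance_xywh(gt_xywh, det_xywh)
print(aligned)
```

The first call yields NaN and the second a small distance, even though both describe the same pair of boxes.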

For frames where no objects are detected, do not skip the frame: the ground truth objects in it are misses and should count against MOTA. Pass the ground truth IDs together with an empty list of detections instead.

Here's a quick check you can add before updating the accumulator to handle empty detections:

if len(det_ids) > 0:
    acc.update(gt_ids, det_ids, iou_matrix)
else:
    # No detections: every ground truth object in this frame is a miss
    acc.update(gt_ids, [], np.empty((len(gt_ids), 0)))

For the MOTP being NaN, it's likely due to the NaN values in the IoU matrix. Once you resolve the IoU matrix issue, the MOTP should be calculated correctly.
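For reference, the CLEAR MOT definitions behind these two numbers can be written as follows, where FN, FP, IDSW and GT are the misses, false positives, identity switches and ground truth objects in frame t, d is the distance of match i in frame t, and c is the number of matches in frame t (the notation is chosen here for illustration):

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t},
\qquad
\mathrm{MOTP} = \frac{\sum_{t,i} d_{t,i}}{\sum_t c_t}
```

When nothing is matched, every ground truth box is a miss and every detection is a false positive, so the errors outnumber the ground truth objects and MOTA goes negative, while the MOTP denominator is zero, which gives NaN. In py-motmetrics MOTP is the average distance of the matches, so with an IoU distance a lower value is better.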

If you need further assistance, feel free to share more details or reach out again. Happy coding! 😄👨‍💻

Shuaib11-Github commented 7 months ago

For the code below I am getting the output shown after it.

from ultralytics import YOLO
import motmetrics as mm
import numpy as np
import os
from scipy.optimize import linear_sum_assignment

# Function to parse ground truth
def parse_ground_truth(file_path):
    ground_truth = {}
    with open(file_path, 'r') as file:
        for line in file:
            parts = line.strip().split()
            frame_number, obj_id, x_min, y_min, x_max, y_max = int(parts[0]), int(parts[1]), float(parts[2]), float(parts[3]), float(parts[4]), float(parts[5])
            bbox = [x_min, y_min, x_max, y_max]  # Use list for consistency
            if frame_number not in ground_truth:
                ground_truth[frame_number] = []
            ground_truth[frame_number].append((obj_id, bbox))
    return ground_truth

# Function to calculate IoU
def calculate_iou(det_box, gt_box):
    xA = max(det_box[0], gt_box[0])
    yA = max(det_box[1], gt_box[1])
    xB = min(det_box[2], gt_box[2])
    yB = min(det_box[3], gt_box[3])
    interArea = max(0, xB - xA) * max(0, yB - yA)
    boxAArea = (det_box[2] - det_box[0]) * (det_box[3] - det_box[1])
    boxBArea = (gt_box[2] - gt_box[0]) * (gt_box[3] - gt_box[1])
    iou = interArea / float(boxAArea + boxBArea - interArea)
    return iou

# Function to calculate IoU matrix
def calculate_iou_matrix(detections, ground_truths):
    iou_matrix = np.zeros((len(detections), len(ground_truths)))
    for i, det in enumerate(detections):
        for j, gt in enumerate(ground_truths):
            iou_matrix[i, j] = calculate_iou(det, gt)
    return iou_matrix

# Initialize the YOLO model
model = YOLO('/home/rmarri/speed-estimation/bdd-yolo-finetune/runs/detect/train24/weights/best.pt')

video_dir = "/home/rmarri/speed-estimation/bdd-yolo-finetune/60fpsVidoes"
ground_truth_dir = "/home/rmarri/speed-estimation/bdd-yolo-finetune/60fpsVidoes/ground_truth_label.txt"
tracking_path_dir = "/home/rmarri/speed-estimation/bdd-yolo-finetune/60fpsVidoes/detection_results.txt"

# Initialize MOT metrics
mh = mm.metrics.create()
print("MOT initialized")

# Process videos
for video_file in os.listdir(video_dir):
    if not video_file.endswith(".MOV"):
        continue  # Skip non-video files

    video_path = os.path.join(video_dir, video_file)
    ground_truth_path = ground_truth_dir  # Assuming one ground truth file for simplicity

    if not os.path.exists(ground_truth_path):
        print(f"Skipping {video_file}: ground truth file not found.")
        continue

    results = model.track(source=video_path, show=True, stream=True)
            iou_matrix = calculate_iou_matrix(det_boxes, gt_boxes)

            acc.update(
                gt_ids,
                det_ids,
                iou_matrix
            )

# Compute metrics after processing all frames
summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
print(summary)

# Calculate IoU matrix
# iou_matrix = calculate_iou_matrix(det_boxes, gt_boxes)  # Adjust max_iou as needed
# print("IOU_Matrix", iou_matrix)
# Update accumulator with frame's data
# acc.update(
#     gt_ids,
#     det_ids,
#     iou_matrix
# )
# Compute metrics after processing all frames
# mh = mm.metrics.create()
# summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
# print(summary)

MOT initialized WARNING ⚠️ Environment does not support cv2.imshow() or PIL Image.show()

Processing IMG_4451_3SEC.MOV...

WARNING ⚠️ inference results will accumulate in RAM unless stream=True is passed, causing potential out-of-memory errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example: results = model(source=..., stream=True) # generator of Results objects for r in results: boxes = r.boxes # Boxes object for bbox outputs masks = r.masks # Masks object for segment masks outputs probs = r.probs # Class probabilities for classification outputs

qt.qpa.xcb: could not connect to display qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/rmarri/anaconda3/envs/yoloft/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found. This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, minimal, minimalegl, offscreen, vnc, webgl.

Aborted (core dumped)

When I calculate the iou_matrix separately it prints values, but the code above gives this warning and does not print anything. Can you suggest the changes to be made?


glenn-jocher commented 7 months ago

@Shuaib11-Github hey there! It seems like you're running into a display issue with cv2.imshow() and a warning about accumulating inference results in RAM. Here's a quick fix:

  1. To avoid the display issue, set show=False in the model.track() call when running in a headless environment, or use a virtual display if you're running on a server without a physical display.

  2. For the warning about accumulating inference results, you're already using stream=True, which is good. This should prevent out-of-memory errors by processing one frame at a time.

  3. If you're getting NaN values in your IoU matrix, make sure the bounding box formats are consistent and correctly calculated. If there are no detections for a frame, still update the accumulator with the ground truth IDs and an empty list of detections, so that the missed objects count against MOTA.

Here's a code snippet to handle no detections:

if det_ids is not None and len(det_ids) > 0:
    acc.update(gt_ids, det_ids, iou_matrix)
else:
    # No detections: every ground truth object in this frame is a miss
    acc.update(gt_ids, [], np.empty((len(gt_ids), 0)))
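The `det_ids is not None` check matters because `result.boxes.id` is `None` when the tracker has not assigned any IDs in a frame, and `np.array(None).flatten()` produces a one-element object array rather than an empty one. A small sketch of a normalizing helper (`ids_to_array` is a hypothetical name, not an Ultralytics function):

```python
import numpy as np


def ids_to_array(ids):
    """Return track IDs as a 1-D int array; None (no tracks) becomes an empty array."""
    if ids is None:
        return np.empty(0, dtype=int)
    if hasattr(ids, "cpu"):  # e.g. a torch.Tensor
        ids = ids.cpu().numpy()
    return np.asarray(ids).astype(int).ravel()


print(ids_to_array(None))             # []
print(ids_to_array([1.0, 2.0, 3.0]))  # [1 2 3]
```

With this in place, `len(det_ids) == 0` reliably identifies frames without tracked objects.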

For the Qt platform plugin error, it's likely unrelated to your IoU matrix calculation. It's a common issue when running OpenCV in a headless environment. You can try setting the environment variable QT_QPA_PLATFORM to offscreen to bypass this error.
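A minimal sketch of that workaround, assuming the variable is set at the very top of the script, before OpenCV's Qt backend gets loaded. The tracking call is left commented out because it needs your weights and video; the paths in it are placeholders.

```python
import os

# Set before importing cv2 / ultralytics so the Qt plugin picks it up
os.environ["QT_QPA_PLATFORM"] = "offscreen"

# Then track without opening a window:
# from ultralytics import YOLO
# model = YOLO("best.pt")  # placeholder weights path
# results = model.track(source="video.MOV", show=False, stream=True, tracker="botsort.yaml")

print(os.environ["QT_QPA_PLATFORM"])  # offscreen
```

Exporting the variable in the shell before launching Python has the same effect.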

Hope this helps! If you need more assistance, feel free to reach out. 😊👨‍💻

Shuaib11-Github commented 7 months ago

Can BoT-SORT and ByteTrack have the same metric values for MOTA, MOTP, precision, recall, etc.?

And how should I handle ground truth bounding boxes that are in x, y, w, h format?

Please explain with simple code.


glenn-jocher commented 7 months ago

Hey @Shuaib11-Github! 😊

BoT-SORT and ByteTrack could potentially have similar metric values, but it's not guaranteed as they use different tracking algorithms. The metrics depend on the specific scenarios and data they're applied to.

For bounding boxes in x,y,w,h format, you can convert them to x_min, y_min, x_max, y_max format with a simple function like this:

def convert_bbox_format(bbox):
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

Just apply this function to your ground truth bounding boxes before calculating metrics.
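As a sketch of where the conversion could go, the parser below applies it while reading the ground truth rows, assuming each row is `frame id x y w h` as in the sample data posted earlier. `parse_ground_truth_lines` is a hypothetical variant that takes an iterable of lines so it can be tried without a file.

```python
def convert_bbox_format(bbox):
    """(x, y, w, h) -> [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def parse_ground_truth_lines(lines):
    """Parse 'frame id x y w h' rows into {frame: [(id, xyxy_box), ...]}."""
    ground_truth = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 6:
            continue  # skip blank or malformed rows
        frame, obj_id = int(parts[0]), int(parts[1])
        box = convert_bbox_format([float(v) for v in parts[2:6]])
        ground_truth.setdefault(frame, []).append((obj_id, box))
    return ground_truth


sample = [
    "0 1 658.0 241.0 64.0 45.0",
    "0 2 1055.0 257.0 71.0 35.0",
]
gt = parse_ground_truth_lines(sample)
print(gt[0][0])  # (1, [658.0, 241.0, 722.0, 286.0])
```

To read from disk, pass an open file object, since iterating over it yields lines.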

Hope this helps, and if you have any more questions, I'm here to help! 🚀

Shuaib11-Github commented 7 months ago

I am getting an iou_matrix that is all 0, together with idf1=1, idp=1, idr=1, recall=1, precision=1, MOTA=0.992679, MOTP=0, num_unique_objects=19, etc. Can you tell me what's wrong here?


glenn-jocher commented 7 months ago

@Shuaib11-Github hey there! If you're seeing an IoU matrix filled with zeros but getting high ID metrics and MOTA, it might indicate that your detections are being matched with ground truth objects, but the bounding box overlaps are not being calculated correctly. This could be due to a format mismatch or an error in the IoU calculation function.

Double-check that your ground truth and detection bounding boxes are in the same format before the IoU calculation. Also, ensure that the IoU function is correctly implemented. If the issue persists, feel free to share a snippet of your IoU calculation code, and I'll be glad to take a closer look! 😄👍
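One thing worth checking here: `MOTAccumulator.update` treats the matrix as distances, with one row per ground truth object and one column per hypothesis, and NaN for pairs that must not be matched. A raw IoU matrix of zeros therefore reads as "every pair overlaps perfectly", which would explain near-perfect ID metrics together with a MOTP of 0. A minimal sketch of the conversion, assuming an IoU matrix laid out as detections by ground truths (as in the `calculate_iou_matrix(detections, ground_truths)` helper posted earlier in this thread) and an illustrative 0.5 threshold; `iou_to_distance` is a hypothetical helper name:

```python
import numpy as np


def iou_to_distance(iou_det_by_gt, min_iou=0.5):
    """Turn a (num_det, num_gt) IoU matrix into a (num_gt, num_det) distance matrix."""
    iou = np.asarray(iou_det_by_gt, dtype=float)
    dist = 1.0 - iou.T               # rows: ground truth, columns: detections
    dist[iou.T < min_iou] = np.nan   # too little overlap: forbid the pairing
    return dist


iou = np.array([[0.9, 0.0],
                [0.1, 0.7],
                [0.0, 0.0]])         # 3 detections x 2 ground truth boxes
dist = iou_to_distance(iou)
print(dist.shape)  # (2, 3)
print(dist)
# acc.update(gt_ids, det_ids, dist)  # with acc = mm.MOTAccumulator(auto_id=True)
```

The third detection overlaps nothing, so its column is all NaN and it would be counted as a false positive.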

Shuaib11-Github commented 7 months ago

Here is my code. I am attaching the ground truth and detections for 180 frames, along with the 3-second video, for your reference. Please look into all of it and guide me.

from ultralytics import YOLO
import motmetrics as mm
import numpy as np
import os
from scipy.optimize import linear_sum_assignment

import torch
import numpy as np
import pandas as pd
from numpy import genfromtxt
from torchvision.ops import box_iou
from pathlib import Path

# Load the tracking and ground truth data
tracking_data = genfromtxt('/path/to/detection_results_4104.txt', delimiter=' ')
ground_truth_data = genfromtxt('/path/to/ground_truth_4104.txt', delimiter=' ')

# Function to parse ground truth
def parse_ground_truth(file_path):
    ground_truth = {}
    with open(file_path, 'r') as file:
        for line in file:
            parts = line.strip().split()
            frame_number, obj_id, x_min, y_min, x_max, y_max = int(parts[0]), int(parts[1]), float(parts[2]), float(parts[3]), float(parts[4]), float(parts[5])
            bbox = [x_min, y_min, x_max, y_max]  # Use list for consistency
            if frame_number not in ground_truth:
                ground_truth[frame_number] = []
            ground_truth[frame_number].append((obj_id, bbox))
    return ground_truth

def convert_bbox_format(bbox):
    x, y, w, h = bbox
    return [x, y, x+w, y+h]

# Function to calculate IoU
def calculate_iou(det_box, gt_box):
    xA = max(det_box[0], gt_box[0])
    yA = max(det_box[1], gt_box[1])
    xB = min(det_box[2], gt_box[2])
    yB = min(det_box[3], gt_box[3])
    interArea = max(0, xB - xA) * max(0, yB - yA)
    boxAArea = (det_box[2] - det_box[0]) * (det_box[3] - det_box[1])
    boxBArea = (gt_box[2] - gt_box[0]) * (gt_box[3] - gt_box[1])
    iou = interArea / float(boxAArea + boxBArea - interArea)
    return iou

# Function to calculate IoU matrix
def calculate_iou_matrix(detections, ground_truths):
    iou_matrix = np.zeros((len(detections), len(ground_truths)))
    for i, det in enumerate(detections):
        for j, gt in enumerate(ground_truths):
            iou_matrix[i, j] = calculate_iou(det, gt)
    return iou_matrix

# Initialize the YOLO model
model = YOLO('/home/rmarri/speed-estimation/bdd-yolo-finetune/runs/detect/train24/weights/best.pt')

video_dir = "/path/to/60fpsVidoes/4097"
ground_truth_dir = "/path/to/60fpsVidoes/4097/ground_truth.txt"
tracking_path_dir = "/path/to/60fpsVidoes/4097/detection_results_4097.txt"

# Initialize MOT metrics
mh = mm.metrics.create()
print("MOT initialized")

acc = mm.MOTAccumulator(auto_id=True)

# Process videos
for video_file in os.listdir(video_dir):
    if not video_file.endswith(".MOV"):
        continue  # Skip non-video files

    video_path = os.path.join(video_dir, video_file)
    ground_truth_path = ground_truth_dir  # Assuming one ground truth file for simplicity

    if not os.path.exists(ground_truth_path):
        print(f"Skipping {video_file}: ground truth file not found.")
        continue

    # results = model.track(source=video_path, show=True, stream=True)
    ground_truth_data = parse_ground_truth(ground_truth_path)
    acc = mm.MOTAccumulator(auto_id=True)

    for video_file in os.listdir(video_dir):
        if not video_file.endswith(".MOV"):
            continue  # Skip non-video files

        video_path = os.path.join(video_dir, video_file)
        # ground_truth_path = os.path.join(ground_truth_dir, video_file.replace(".MOV", ".txt"))
        ground_truth_path = os.path.join(ground_truth_dir)
        tracking_path = os.path.join(tracking_path_dir)

        if not os.path.exists(ground_truth_path):
            print(f"Skipping {video_file}: ground truth file not found.")
            continue

        print(f"Processing {video_file}...")
        results = model.track(source=video_path, show=True, stream=True, tracker="botsort.yaml")

        ground_truth_data = parse_ground_truth(ground_truth_path)
        acc = mm.MOTAccumulator(auto_id=True)

        for frame_number, result in enumerate(results, start=0):  # Adjust starting frame number as necessary
            if frame_number not in ground_truth_data:
                continue  # Skip this frame if no ground truth data

            # Detection data
            det_boxes = result.boxes.xyxy.cpu().numpy()  # Assuming this is the correct way to access your detection bounding boxes
            det_ids = result.boxes.id  # Assuming this is the correct way to access your detection IDs
            det_ids_np = np.array(det_ids)
            det_ids = det_ids_np.flatten()

            # Ground truth data for the current frame
            gt_ids, gt_boxes = zip(*ground_truth_data[frame_number])
            gt_boxes_np = np.array(gt_boxes)
            gt_ids = np.array(gt_ids).flatten()

            print('GT IDs:', gt_ids)
            print('DET IDs:', det_ids)
            print('GT Boxes:', gt_boxes_np)
            print('DET Boxes:', det_boxes)

            iou_matrix = calculate_iou_matrix(det_boxes, gt_boxes)

            print(iou_matrix)

            acc.update(
                gt_ids,
                det_ids,
                iou_matrix
            )

# Compute metrics after processing all frames
summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
print(summary)

I will attach ground truth and detection file and also video file in another mail. Please go through it and help me with where the problem is


Shuaib11-Github commented 7 months ago

Here is My code. I am attaching ground truth and detection of 180 frames and video of 3 sec along for your reference. Please Look into all and guide me.

from ultralytics import YOLO import motmetrics as mm import numpy as np import os from scipy.optimize import linear_sum_assignment

import torch import numpy as np import pandas as pd from numpy import genfromtxt from torchvision.ops import box_iou from pathlib import Path

Load the tracking and ground truth data

tracking_data = genfromtxt('/path/to/detection_results_4104.txt', delimiter=' ') ground_truth_data = genfromtxt('/path/to/ground_truth_4104.txt', delimiter=' ')

Function to parse ground truth

def parse_ground_truth(file_path): ground_truth = {} with open(file_path, 'r') as file: for line in file: parts = line.strip().split() frame_number, obj_id, x_min, y_min, x_max, y_max = int(parts[0 ]), int(parts[1]), float(parts[2]), float(parts[3]), float(parts[4]), float( parts[5]) bbox = [x_min, y_min, x_max, y_max] # Use list for consistency if frame_number not in ground_truth: ground_truth[frame_number] = [] ground_truth[frame_number].append((obj_id, bbox)) return ground_truth

def convert_bbox_format(bbox): x, y, w, h = bbox return [x, y, x+w, y+h]

Function to calculate IoU

def calculate_iou(det_box, gt_box): xA = max(det_box[0], gt_box[0]) yA = max(det_box[1], gt_box[1]) xB = min(det_box[2], gt_box[2]) yB = min(det_box[3], gt_box[3]) interArea = max(0, xB - xA) max(0, yB - yA) boxAArea = (det_box[2] - det_box[0]) (det_box[3] - det_box[1]) boxBArea = (gt_box[2] - gt_box[0]) * (gt_box[3] - gt_box[1]) iou = interArea / float(boxAArea + boxBArea - interArea) return iou

Function to calculate IoU matrix

def calculate_iou_matrix(detections, ground_truths): iou_matrix = np.zeros((len(detections), len(ground_truths))) for i, det in enumerate(detections): for j, gt in enumerate(ground_truths): iou_matrix[i, j] = calculate_iou(det, gt) return iou_matrix

Initialize the YOLO model

model = YOLO( '/home/rmarri/speed-estimation/bdd-yolo-finetune/runs/detect/train24/weights/ best.pt')

video_dir = "/path/to/60fpsVidoes/4097" ground_truth_dir = /path/to/60fpsVidoes/4097/ground_truth.txt" tracking_path_dir = "/path/to/60fpsVidoes/4097/detection_results_4097.txt"

Initialize MOT metrics

mh = mm.metrics.create() print("MOT initialized")

acc = mm.MOTAccumulator(auto_id=True)

# Process videos
for video_file in os.listdir(video_dir):
    if not video_file.endswith(".MOV"):
        continue  # Skip non-video files

    video_path = os.path.join(video_dir, video_file)
    # ground_truth_path = os.path.join(ground_truth_dir, video_file.replace(".MOV", ".txt"))
    ground_truth_path = os.path.join(ground_truth_dir)  # Assuming one ground truth file for simplicity
    tracking_path = os.path.join(tracking_path_dir)

    if not os.path.exists(ground_truth_path):
        print(f"Skipping {video_file}: ground truth file not found.")
        continue

    print(f"Processing {video_file}...")
    results = model.track(source=video_path, show=True, stream=True, tracker="botsort.yaml")

    ground_truth_data = parse_ground_truth(ground_truth_path)
    acc = mm.MOTAccumulator(auto_id=True)

    for frame_number, result in enumerate(results, start=0):  # Adjust starting frame number as necessary
        if frame_number not in ground_truth_data:
            continue  # Skip this frame if no ground truth data

        # Detection data
        det_boxes = result.boxes.xyxy.cpu().numpy()  # Assuming this is the correct way to access your detection bounding boxes
        det_ids = result.boxes.id  # Assuming this is the correct way to access your detection IDs
        det_ids_np = np.array(det_ids)
        det_ids = det_ids_np.flatten()

        # Ground truth data for the current frame
        gt_ids, gt_boxes = zip(*ground_truth_data[frame_number])
        gt_boxes_np = np.array(gt_boxes)
        gt_ids = np.array(gt_ids).flatten()

        print('GT IDs:', gt_ids)
        print('DET IDs:', det_ids)
        print('GT Boxes:', gt_boxes_np)
        print('DET Boxes:', det_boxes)

        iou_matrix = calculate_iou_matrix(det_boxes, gt_boxes)

        print(iou_matrix)

        acc.update(
            gt_ids,
            det_ids,
            iou_matrix
        )

# Compute metrics after processing all frames
summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name='acc')
print(summary)
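For reference, the two headline numbers in that summary can be sketched by hand. This is a minimal sketch of the usual CLEAR MOT definitions, not the `motmetrics` implementation: MOTA is one minus the ratio of misses, false positives and ID switches to the total number of ground truth objects, and MOTP is the mean distance over matched pairs. The function names and the counts below are hypothetical and chosen only for illustration.

```python
def mota(num_misses, num_false_positives, num_switches, num_gt_objects):
    """MOTA = 1 - (FN + FP + IDSW) / total ground truth objects over all frames."""
    if num_gt_objects == 0:
        raise ValueError("MOTA is undefined without ground truth objects")
    errors = num_misses + num_false_positives + num_switches
    return 1.0 - errors / num_gt_objects


def motp(match_distances):
    """MOTP as the mean distance over all matched pairs."""
    if not match_distances:
        return float("nan")
    return sum(match_distances) / len(match_distances)


# Hypothetical counts: 11 ground truth objects, 7 of them missed, no other errors.
example_mota = mota(num_misses=7, num_false_positives=0, num_switches=0, num_gt_objects=11)

# Hypothetical distances (1 - IoU) of three matched pairs.
example_motp = motp([0.05, 0.03, 0.02])

print(f"MOTA: {example_mota:.3f}, MOTP: {example_motp:.3f}")
```

When the distance is 1 - IoU, a lower MOTP means tighter boxes, so a MOTP of exactly 0 together with an all-zero matrix is a hint that the matrix is being read as distances rather than overlaps.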


On Fri, Feb 23, 2024 at 5:01 AM Glenn Jocher @.***> wrote:

@Shuaib11-Github https://github.com/Shuaib11-Github hey there! If you're seeing an IoU matrix filled with zeros but getting high ID metrics and MOTA, it might indicate that your detections are being matched with ground truth objects, but the bounding box overlaps are not being calculated correctly. This could be due to a format mismatch or an error in the IoU calculation function.

Double-check that your ground truth and detection bounding boxes are in the same format before the IoU calculation. Also, ensure that the IoU function is correctly implemented. If the issue persists, feel free to share a snippet of your IoU calculation code, and I'll be glad to take a closer look! 😄👍




glenn-jocher commented 7 months ago

@Shuaib11-Github hey there! It looks like you're experiencing some issues with your IoU matrix calculations and the resulting tracking metrics. If all the IoU values are zero, it suggests that the detections are not overlapping with the ground truth bounding boxes at all, which is quite unusual if you're getting high ID metrics and MOTA.

Here are a few things to check:

  1. Ensure that the ground truth and detection bounding boxes are in the correct format and coordinate system before calculating IoU.
  2. Verify that the IoU calculation function is working correctly by testing it with known overlapping and non-overlapping boxes.
  3. Check if the detections and ground truth data are correctly synchronized frame-wise.

If you're still facing issues, please share a snippet of your ground truth and detection files, and I'll be happy to take a closer look to help you troubleshoot the problem. Keep up the great work! 😊🔍

Shuaib11-Github commented 6 months ago

Hi, I have attached my ground truth and detection files, along with a Drive link to the 3-second video for reference. Could you please go through them and check where the problem lies and how to fix it?

https://drive.google.com/file/d/1DfL9LJaEiAtPyMXvA62UMc5CA8i7YSRi/view?usp=sharing




glenn-jocher commented 6 months ago

@Shuaib11-Github hey there! 🌟 It looks like you're encountering some issues with your IoU matrix calculations and the resulting tracking metrics, showing all IoU values as zero and perfect scores for ID metrics, which seems unusual.

Given the complexity of the issue and without direct access to the ground truth, detection files, and the video, it's a bit challenging to pinpoint the exact problem. However, here are a few general tips that might help:

  1. Verify Bounding Box Formats: Ensure both your ground truth and detection bounding boxes are in the same format (e.g., [x_min, y_min, x_max, y_max]) before calculating IoU. Use the convert_bbox_format function you've provided on all bounding boxes if needed.

  2. Check IoU Calculation: Test your calculate_iou function with some hardcoded bounding boxes that you know should overlap to ensure it's working correctly.

  3. Frame Synchronization: Make sure the frame numbers in your detection results match those in your ground truth data. Misalignment here could cause mismatches in the IoU calculations.

  4. Detection File Format: Double-check the format of your detection file. It should match the expected format in your code, which seems to be [frame_number, obj_id, x_min, y_min, x_max, y_max].

  5. Debugging Prints: Add print statements before the IoU calculation to verify that the bounding boxes being compared are what you expect.

If you're still stuck, consider sharing a small snippet of your ground truth and detection files directly here. That way, I can provide more targeted advice. Keep up the great work, and don't hesitate to reach out for more help! 😊
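Tips 1 and 3 can be checked mechanically before any IoU is computed. The sketch below assumes, without confirmation from the files, that the ground truth rows might be stored as [x, y, width, height]; the row used is the sample quoted in the original question, and the helper names and the frame dictionaries are hypothetical.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def looks_like_xyxy(box):
    """A valid corner-format box has x_max > x_min and y_max > y_min."""
    x_min, y_min, x_max, y_max = box
    return x_max > x_min and y_max > y_min


def shared_frames(gt_by_frame, det_by_frame):
    """Frame numbers present in both the ground truth and the detections."""
    return sorted(set(gt_by_frame) & set(det_by_frame))


# Sample ground truth row from the question: 41 34710 952.0 0.0 79.0 14.0
gt_row = [952.0, 0.0, 79.0, 14.0]
print(looks_like_xyxy(gt_row))   # False: x_max (79) is smaller than x_min (952)
print(xywh_to_xyxy(gt_row))      # [952.0, 0.0, 1031.0, 14.0]

# Hypothetical frame indices: ground truth starts at 41, enumerate() starts at 0.
print(shared_frames({41: [], 42: []}, {0: [], 1: []}))  # [] means nothing is compared
```

If `looks_like_xyxy` fails for most ground truth rows, the file is probably not in corner format, and every IoU against corner-format detections will come out as zero.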

Shuaib11-Github commented 6 months ago

Here is my detection file, my ground truth file, and a Drive link to the video.

https://drive.google.com/file/d/1DfL9LJaEiAtPyMXvA62UMc5CA8i7YSRi/view?usp=sharing




glenn-jocher commented 6 months ago

@Shuaib11-Github hey there! 🌟 Thanks for sharing your ground truth, detection files, and the video link. I've taken a look at the information you provided.

Based on what you've shared, it seems like there might be a mismatch or an issue with how the IoU calculations are being performed, leading to all IoU values being zero. This could be due to several reasons, such as format mismatches between your detection and ground truth data or incorrect bounding box coordinates.

Here's a quick tip: Ensure that both your detection and ground truth bounding boxes are in the same format (e.g., [x_min, y_min, x_max, y_max]) and that the coordinates accurately reflect the positions and sizes of the objects in the frames. You might also want to manually verify a few bounding boxes from both files to ensure they're correctly aligned with the objects in your video.

If you're still facing issues, a simple code snippet to calculate IoU for a pair of bounding boxes in the [x_min, y_min, x_max, y_max] format is:

def calculate_iou(boxA, boxB):
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])
    interArea = max(0, xB - xA) * max(0, yB - yA)
    boxAArea = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
    boxBArea = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])
    iou = interArea / float(boxAArea + boxBArea - interArea)
    return iou

Try using this function to calculate IoU for a few pairs of bounding boxes manually and see if the results make sense. If the IoU values are still incorrect, there might be an issue with the data or how it's being processed.

I hope this helps! If you have any more details or specific questions, feel free to share. Keep up the great work! 😊
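A manual test along those lines could look like the sketch below. It repeats the function so the snippet runs on its own and adds one assumption that is not in the snippet above: a guard that returns 0.0 when the union is not positive, which avoids a division by zero on degenerate boxes. The test boxes are made up so the expected values are easy to verify by hand.

```python
def calculate_iou(boxA, boxB):
    """IoU of two [x_min, y_min, x_max, y_max] boxes; 0.0 when the union is empty."""
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])
    interArea = max(0, xB - xA) * max(0, yB - yA)
    boxAArea = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
    boxBArea = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])
    union = boxAArea + boxBArea - interArea
    return interArea / union if union > 0 else 0.0


# Identical boxes: intersection equals union.
identical = calculate_iou([0, 0, 10, 10], [0, 0, 10, 10])

# Second box shifted right by half a width: intersection 50, union 150.
half_shift = calculate_iou([0, 0, 10, 10], [5, 0, 15, 10])

# Boxes that do not touch at all.
disjoint = calculate_iou([0, 0, 10, 10], [20, 20, 30, 30])

print(identical, half_shift, disjoint)
```

If these three cases give 1, one third and 0, the function itself is fine and the zeros are more likely to come from the data that is passed in.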

Shuaib11-Github commented 6 months ago

Have you gone through the video? And did you run the code to get the detection text file? The problem I am getting is that the detections are completely off from the ground truth: there are a lot more objects in the ground truth, and the object IDs also differ (for example, the ground truth object_id is 2 where the detection object_id is 3). Doesn't it calculate the iou_matrix with respect to each box, so that we get weird values because we are calculating the iou_matrix of different objects?

Please go through the video. I have fine-tuned the model and run it. Could you please look into it?


glenn-jocher commented 6 months ago

Hey @Shuaib11-Github! 👋 Thanks for the update and for sharing the video along with your fine-tuned model's results. It's quite common for object IDs between ground truth and detection to differ, especially when the model predicts more objects than present in the ground truth or vice versa. This discrepancy can indeed lead to unexpected IoU matrix values if you're trying to match objects by their IDs directly.

For calculating the IoU matrix correctly, it's crucial to match the detected objects with the ground truth based on their spatial overlap rather than their IDs. This way, you can evaluate how well your model is detecting the actual objects, regardless of the ID differences. Here's a simplified approach:

  1. Ignore IDs for IoU calculation: Focus on the bounding boxes' spatial overlap between detections and ground truth.
  2. Use a matching algorithm: After calculating the IoU matrix, you can use algorithms like the Hungarian method to find the best match between detected objects and ground truth, based on IoU scores.

Here's a quick example of how you might approach this:

from scipy.optimize import linear_sum_assignment

# Assuming iou_matrix is already calculated
row_ind, col_ind = linear_sum_assignment(-iou_matrix)  # Note the negative sign for maximization

# Now, row_ind and col_ind provide the indices of matched detections and ground truth objects

This method ensures that you're evaluating the detection performance based on spatial accuracy, which is more aligned with how object detection models are typically assessed.

I hope this helps clarify things a bit! If you have further questions or need more assistance, feel free to reach out. Keep up the great work! 😄
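One thing the snippet above leaves open is that an assignment pairs rows and columns even when their IoU is zero, so matches are usually filtered by a minimum overlap. The sketch below illustrates that filtering step with a simple greedy matcher instead of the Hungarian method, purely to stay dependency-free. `greedy_match`, the `min_iou` threshold of 0.5 and the IoU values are all assumptions made for illustration.

```python
def greedy_match(iou_matrix, min_iou=0.5):
    """Pair detections (rows) with ground truths (columns), highest IoU first.

    Pairs below min_iou are never matched, so a detection that overlaps
    nothing stays unmatched instead of being forced onto a ground truth.
    """
    candidates = [
        (iou, det_idx, gt_idx)
        for det_idx, row in enumerate(iou_matrix)
        for gt_idx, iou in enumerate(row)
        if iou >= min_iou
    ]
    candidates.sort(reverse=True)

    matches, used_dets, used_gts = [], set(), set()
    for iou, det_idx, gt_idx in candidates:
        if det_idx in used_dets or gt_idx in used_gts:
            continue
        matches.append((det_idx, gt_idx))
        used_dets.add(det_idx)
        used_gts.add(gt_idx)
    return matches


# Hypothetical IoU values: 2 detections (rows) against 3 ground truths (columns).
iou_matrix = [
    [0.0, 0.93, 0.0],
    [0.95, 0.0, 0.0],
]
matches = greedy_match(iou_matrix)
print(matches)  # pairs of (detection index, ground truth index)
```

Ground truth column 2 has no partner here and would count as a miss. The same threshold check can be applied to the `row_ind, col_ind` pairs returned by `linear_sum_assignment`.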

Shuaib11-Github commented 6 months ago

But I am getting all the iou_scores as zeros.


Shuaib11-Github commented 6 months ago

Because I have only 4 detected objects and 11 ground truth objects, the iou_matrix is zero for all elements. Can you please look into it and suggest how to deal with it? There are objects that match between ground truth and detection, but I am still getting an iou_matrix of all 0's.

gt_ids for all frames: [1.0, 2.0, 4.0, 6.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

gt_boxes for all frames:
[array([ 1484, 463, 1888, 634]),
 array([ 1063, 441, 1229, 554]),
 array([ 1376, 456, 1532, 535]),
 array([ 1764, 447, 1920, 569]),
 array([ 1630, 443, 1780, 502]),
 array([ 956, 444, 1012, 482]),
 array([ 1282, 448, 1330, 466]),
 array([ 1008, 447, 1078, 495]),
 array([ 1218, 449, 1250, 465]),
 array([ 901, 445, 935, 470]),
 array([ 866, 442, 894, 462])]

[[ 1058.2 439.05 1224.6 553.94]
 [ 1485.5 460.11 1883.6 631.36]
 [ 1374.8 456.8 1530.1 534.92]
 [ 1825.5 447.97 1920 565.79]]

0 [ 1058.2 439.05 1224.6 553.94]
0 [ 1484 463 1888 634]
1 [ 1063 441 1229 554]
2 [ 1376 456 1532 535]
3 [ 1764 447 1920 569]
4 [ 1630 443 1780 502]


glenn-jocher commented 6 months ago

Hey @Shuaib11-Github! 🌟 If you're getting all zeros in your IoU matrix despite having matching objects in ground truth and detections, it might be due to the format or scale of the bounding boxes being compared. Here are a couple of things to check:

  1. Coordinate Scale: Ensure both ground truth and detection coordinates are in the same scale relative to the image dimensions. Sometimes, detections might be normalized (0 to 1) while ground truths are in pixel coordinates.

  2. Bounding Box Format: Double-check that both are using the same bounding box format. The IoU calculation expects [x_min, y_min, x_max, y_max].

  3. Manual Verification: Try manually calculating IoU for a pair of matching bounding boxes to ensure your IoU function works as expected.

Here's a quick snippet for manual IoU calculation for verification:

def manual_iou_check():
    gt_box = [1063, 441, 1229, 554]  # Ground truth box from your output
    det_box = [1058.2, 439.05, 1224.6, 553.94]  # Detection box that overlaps it
    iou = calculate_iou(gt_box, det_box)
    print(f"Manual IoU check: {iou}")

manual_iou_check()

If the manual check returns a reasonable IoU but your matrix is still zeros, the issue might be in how the matrix is being populated. Ensure the loop correctly iterates over all detection and ground truth pairs.

Keep up the great work, and let me know how it goes! 😄
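One more detail may be worth checking when populating the matrix. As far as I know, `motmetrics` reads the third argument of `MOTAccumulator.update` as a distance matrix with one row per ground truth object and one column per detection, where smaller is better and NaN marks pairs that must not be matched. Under that assumption, passing raw IoU values would make non-overlapping boxes (IoU 0) look like perfect matches. The sketch below builds such a matrix in plain Python from boxes posted earlier in this thread. `iou_distance_matrix` and the `max_distance` cutoff of 0.5 are assumptions for illustration.

```python
import math


def calculate_iou(box_a, box_b):
    """IoU of two [x_min, y_min, x_max, y_max] boxes; 0.0 when the union is empty."""
    x_a = max(box_a[0], box_b[0])
    y_a = max(box_a[1], box_b[1])
    x_b = min(box_a[2], box_b[2])
    y_b = min(box_a[3], box_b[3])
    inter = max(0, x_b - x_a) * max(0, y_b - y_a)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_distance_matrix(gt_boxes, det_boxes, max_distance=0.5):
    """Rows are ground truths, columns are detections; entries are 1 - IoU or NaN."""
    matrix = []
    for gt in gt_boxes:
        row = []
        for det in det_boxes:
            distance = 1.0 - calculate_iou(gt, det)
            # NaN tells the accumulator that this pair is not a valid match.
            row.append(distance if distance <= max_distance else math.nan)
        matrix.append(row)
    return matrix


# First two ground truth boxes and first two detections from the output above.
gt_boxes = [[1484, 463, 1888, 634], [1063, 441, 1229, 554]]
det_boxes = [[1058.2, 439.05, 1224.6, 553.94], [1485.5, 460.11, 1883.6, 631.36]]

distances = iou_distance_matrix(gt_boxes, det_boxes)
for row in distances:
    print(row)
```

The result has shape (number of ground truths, number of detections) and could be passed to `acc.update(gt_ids, det_ids, distances)`. `motmetrics` also ships `mm.distances.iou_matrix`, which, as far as I recall, expects boxes as [x, y, width, height] rather than corner coordinates, so check its documentation before swapping it in.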

github-actions[bot] commented 5 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐