Open saurabh-git-dev opened 1 month ago
@saurabh-git-dev Have you found a solution yet?
@tan199954 Not yet. Here is more information about the issues: https://community.hailo.ai/t/obb-model-quantization-poor-benchmark/5317 https://community.hailo.ai/t/yolob8n-obb-rotated-nms/5048
@saurabh-git-dev With help from ChatGPT, I got my code working:
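import math

import numpy as np

REGRESSION_LENGTH = 15  # DFL reg_max: each box side is a distribution over reg_max + 1 bins
STRIDES = [8, 16, 32]  # must be listed in the same order as the scales in `endnodes`
names = ['plane', 'ship', 'storage tank', 'baseball diamond', 'tennis court', 'basketball court',
         'ground track field', 'harbor', 'bridge', 'large vehicle', 'small vehicle',
         'helicopter', 'roundabout', 'soccer ball field', 'swimming pool']


def softmax(x):
    # Subtract the per-row maximum so np.exp cannot overflow; the result is unchanged.
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)


def sigmoid(x):
    return 1 / (1 + np.exp(-x))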


def _yolov8_obb_decoding(raw_boxes, angles, strides, image_dims, reg_max):
    boxes = None
    for box_distribute, stride, angle in zip(raw_boxes, strides, angles):
        # create grid of cell centres (in pixels) for this scale
        shape = [int(x / stride) for x in image_dims]
        grid_x = np.arange(shape[1]) + 0.5
        grid_y = np.arange(shape[0]) + 0.5
        grid_x, grid_y = np.meshgrid(grid_x, grid_y)
        ct_row = grid_y.flatten() * stride
        ct_col = grid_x.flatten() * stride
        center = np.stack((ct_col, ct_row), axis=1)

        # box distribution to distance: expected bin index per side, in grid cells
        reg_range = np.arange(reg_max + 1)
        box_distribute = np.reshape(
            box_distribute, (-1, box_distribute.shape[1] * box_distribute.shape[2], 4, reg_max + 1)
        )
        box_distance = softmax(box_distribute)
        box_distance = box_distance * np.reshape(reg_range, (1, 1, 1, -1))
        box_distance = np.sum(box_distance, axis=-1)

        # left/top and right/bottom distances, measured in the box's rotated frame
        lt = box_distance[..., :2]
        rb = box_distance[..., 2:]
        cos = np.cos(angle)
        sin = np.sin(angle)
        # rotate the centre offset back into image axes
        xf, yf = np.split((rb - lt) / 2, 2, axis=-1)
        x = xf * cos - yf * sin
        y = xf * sin + yf * cos
        xy = np.concatenate([x, y], axis=-1)
        # (dx, dy, w, h) in cells -> pixels, then shift by the cell centre
        xywh_box = np.concatenate([xy, lt + rb], axis=-1) * stride
        xywh_box[..., :2] += np.expand_dims(center, axis=0)
        boxes = xywh_box if boxes is None else np.concatenate([boxes, xywh_box], axis=1)
    return boxes


def generate_yolo_predictions(endnodes):
    """
    endnodes is a list of 9 tensors, ordered to match STRIDES = [8, 16, 32]
    (for a 640x640 input the grids are 80x80, 40x40 and 20x20):
        endnodes[0]: bbox output with shape (BS, 80, 80, 64)
        endnodes[1]: scores output with shape (BS, 80, 80, 15)
        endnodes[2]: angles output with shape (BS, 80, 80, 1)
        endnodes[3]: bbox output with shape (BS, 40, 40, 64)
        endnodes[4]: scores output with shape (BS, 40, 40, 15)
        endnodes[5]: angles output with shape (BS, 40, 40, 1)
        endnodes[6]: bbox output with shape (BS, 20, 20, 64)
        endnodes[7]: scores output with shape (BS, 20, 20, 15)
        endnodes[8]: angles output with shape (BS, 20, 20, 1)
    Returns:
        numpy.ndarray: A concatenated array of shape (BS, total_predictions, 4 + num_classes + 1) where:
        - `total_predictions` is the sum of predictions across all scales (80x80, 40x40, 20x20).
        - Each prediction contains, in this order:
            - `4` values for the bounding box in the format [x, y, w, h].
            - `num_classes` values for the confidence scores for each class.
            - `1` value representing the angle of rotation in radians.
    """
    image_dims = (640, 640)
    raw_boxes = endnodes[:7:3]
    # flatten each angle map to (BS, H * W, 1) and map it to [-pi/4, 3*pi/4)
    angles = [np.reshape(s, (-1, s.shape[1] * s.shape[2], 1)) for s in endnodes[2::3]]
    angles = [(sigmoid(x) - 0.25) * math.pi for x in angles]
    decoded_boxes = _yolov8_obb_decoding(raw_boxes, angles, STRIDES, image_dims, REGRESSION_LENGTH)
    # scores are used as delivered by the network outputs; no activation is applied here
    scores = [np.reshape(s, (-1, s.shape[1] * s.shape[2], len(names))) for s in endnodes[1:8:3]]
    scores = np.concatenate(scores, axis=1)
    angles = np.concatenate(angles, axis=1)
    return np.concatenate([decoded_boxes, scores, angles], axis=2)
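
For anyone unsure what the "box distribution to distance" step in `_yolov8_obb_decoding` does, here is a minimal sketch for one side of one anchor. The helper name `dfl_expectation` and the logit values are made up for illustration; only the softmax-then-expectation logic mirrors the code above.

```python
import numpy as np

REG_MAX = 15  # same value as REGRESSION_LENGTH in the code above


def dfl_expectation(logits):
    """Turn (..., REG_MAX + 1) bin logits into an expected distance in grid cells."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # keep np.exp from overflowing
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    bins = np.arange(logits.shape[-1])  # bin i stands for a distance of i cells
    return (probs * bins).sum(axis=-1)


# Hypothetical logits, strongly peaked on bin 3, so the distance is close to 3 cells.
logits = np.full(REG_MAX + 1, -10.0)
logits[3] = 10.0
distance_cells = dfl_expectation(logits)

stride = 8  # pixels per grid cell at the finest scale
print(distance_cells, distance_cells * stride)  # distance in cells, then in pixels
```

A flat distribution (all logits equal) gives the midpoint of the bin range, 7.5 cells, which is why untrained or badly quantized outputs tend to produce large, similar-looking boxes.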
@tan199954 Are you able to post-process the output and see rotated detections?
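
To check rotated detections visually, each box has to be turned into four corner points that a drawing routine can connect. A minimal sketch, assuming each row has already been rearranged to `[cx, cy, w, h, angle]` with the angle in radians (the function name `xywhr_to_corners` is hypothetical, not part of any library mentioned in this thread):

```python
import numpy as np


def xywhr_to_corners(boxes):
    """boxes: (N, 5) array of [cx, cy, w, h, angle_rad] -> (N, 4, 2) corner points."""
    boxes = np.asarray(boxes, dtype=np.float64)
    cx, cy, w, h, theta = boxes.T
    cos, sin = np.cos(theta), np.sin(theta)
    center = np.stack([cx, cy], axis=-1)
    # Half-extent vectors along the box's own width and height axes.
    v_w = np.stack([0.5 * w * cos, 0.5 * w * sin], axis=-1)
    v_h = np.stack([-0.5 * h * sin, 0.5 * h * cos], axis=-1)
    return np.stack(
        [center + v_w + v_h, center - v_w + v_h, center - v_w - v_h, center + v_w - v_h],
        axis=1,
    )


# One axis-aligned box and the same box rotated by 90 degrees.
corners = xywhr_to_corners([[10, 20, 4, 2, 0.0], [10, 20, 4, 2, np.pi / 2]])
print(corners.shape)  # (2, 4, 2)
```

The resulting `(4, 2)` polygons can be passed to whatever polygon-drawing function your image library provides.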
I think you also need to implement the Rotated NMS.
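
For reference, a rotated NMS does not have to be large. The sketch below is a NumPy-only approximation that replaces exact polygon IoU with a Gaussian overlap score (ProbIoU-style: each box is modelled as a 2D Gaussian and similarity is 1 minus the Hellinger distance). The function names, the `[cx, cy, w, h, angle]` row layout and the 0.45 threshold are assumptions for illustration, and boxes are assumed to have positive width and height.

```python
import numpy as np


def obb_covariance(boxes):
    """boxes: (N, 5) [cx, cy, w, h, angle_rad] -> (N, 2, 2) covariance of each box."""
    w, h, theta = boxes[:, 2], boxes[:, 3], boxes[:, 4]
    cos, sin = np.cos(theta), np.sin(theta)
    rot = np.stack([np.stack([cos, -sin], -1), np.stack([sin, cos], -1)], -2)
    diag = np.zeros((len(boxes), 2, 2))
    diag[:, 0, 0] = w ** 2 / 12  # variance of a uniform distribution of width w
    diag[:, 1, 1] = h ** 2 / 12
    return rot @ diag @ np.swapaxes(rot, -1, -2)


def prob_iou(box, others, eps=1e-7):
    """Gaussian overlap between one box (5,) and M boxes (M, 5); returns (M,)."""
    cov1 = obb_covariance(box[None])[0]
    cov2 = obb_covariance(others)
    sigma = (cov1 + cov2) / 2 + eps * np.eye(2)
    diff = others[:, :2] - box[:2]
    maha = np.einsum('mi,mij,mj->m', diff, np.linalg.inv(sigma), diff)
    dets = np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2))
    # Bhattacharyya distance between the two Gaussians.
    bd = maha / 8 + 0.5 * np.log(np.linalg.det(sigma) / (dets + eps) + eps)
    bd = np.clip(bd, eps, 100.0)
    return 1 - np.sqrt(1 - np.exp(-bd) + eps)  # 1 - Hellinger distance


def rotated_nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        if rest.size == 0:
            break
        order = rest[prob_iou(boxes[i], boxes[rest]) < iou_thres]
    return keep


boxes = np.array([[50, 50, 40, 20, 0.3], [51, 50, 40, 20, 0.3], [200, 200, 40, 20, -0.5]])
scores = np.array([0.9, 0.8, 0.7])
keep = rotated_nms(boxes, scores)
print(keep)  # -> [0, 2]
```

Because the overlap is a Gaussian approximation, its values differ from geometric IoU, so the threshold may need tuning against your own data.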
@saurabh-git-dev I'm using the non_max_suppression function from ultralytics with torch on the CPU. I convert the output of the generate_yolo_predictions function to a torch tensor and then transpose it to (batch_size, num_classes + 5, num_boxes).
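
The layout change described above can be sketched as follows. The NumPy part runs on its own with dummy data standing in for the output of `generate_yolo_predictions`; the commented-out ultralytics call is an untested assumption about the import path and keyword arguments, so check it against the version you have installed.

```python
import numpy as np

NUM_CLASSES = 15  # len(names) in the code above


def to_channels_first(preds):
    """(BS, num_boxes, 4 + nc + 1) -> (BS, 4 + nc + 1, num_boxes)."""
    return np.ascontiguousarray(np.transpose(preds, (0, 2, 1)))


# Dummy predictions: 80*80 + 40*40 + 20*20 = 8400 boxes for a 640x640 input.
preds = np.random.rand(1, 8400, 4 + NUM_CLASSES + 1).astype(np.float32)
chw = to_channels_first(preds)
print(chw.shape)  # (1, 20, 8400)

# Assumed usage with PyTorch and ultralytics installed (not verified here):
# import torch
# from ultralytics.utils.ops import non_max_suppression
# dets = non_max_suppression(torch.from_numpy(chw), conf_thres=0.25, iou_thres=0.45,
#                            nc=NUM_CLASSES, rotated=True)
```

Keeping the angle as the last channel matters here, since the box, class scores and angle are told apart only by their position along that axis.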
Is there any plan to support any of the latest YOLO OBB models in the near future? Writing the post-processing is not easy for everyone, so I can't move forward without it.