leggedrobotics / darknet_ros

YOLO ROS: Real-Time Object Detection for ROS
BSD 3-Clause "New" or "Revised" License
2.19k stars 1.18k forks source link

YOLO Bounding Box coordinates variables clarification #360

Open TeingQiHao opened 2 years ago

TeingQiHao commented 2 years ago

Dear all,

I am trying to getting bounding box coordinates of the YOLO detected boxes.

I am currently making my own minor changes on YoloObjectDetector.cpp and YoloObjectDetector.hpp to get the information.

I identified the bounding box coordinates to the following function in YoloObjectDetector.cpp: ... void* YoloObjectDetector::detectInThread() { running_ = 1; float nms = .4;

layer l = net->layers[net->n - 1]; float X = buffLetter[(buffIndex + 2) % 3].data; float prediction = networkpredict(net, X);

rememberNetwork(net); detection* dets = 0; int nboxes = 0; dets = avgPredictions(net, &nboxes);

if (nms > 0) do_nms_obj(dets, nboxes, l.classes, nms);

if (enableConsoleOutput) { printf("\033[2J"); printf("\033[1;1H"); printf("\nFPS:%.1f\n", fps); printf("Objects:\n\n"); } image display = buff[(buffIndex + 2) % 3]; drawdetections(display, dets, nboxes, demoThresh, demoNames, demoAlphabet, demoClasses_);

// extract the bounding boxes and send them to ROS int i, j; int count = 0; for (i = 0; i < nboxes; ++i) { float xmin = (dets[i].bbox.x - dets[i].bbox.w / 2.) 0.95; float xmax = (dets[i].bbox.x + dets[i].bbox.w / 2.)1.05; float ymin = (dets[i].bbox.y - dets[i].bbox.h / 2.)0.95; float ymax = (dets[i].bbox.y + dets[i].bbox.h / 2.)1.05;

if (xmin < 0) xmin = 0;
if (ymin < 0) ymin = 0;
if (xmax > 1) xmax = 1;
if (ymax > 1) ymax = 1;

//modified: convert to pixel coordinates int x1, y1, x2, y2, bboxMid; x1 = xmin 1280; y1 = ymin 720; x2 = xmax 1280; y2 = ymax 720;

//modified: print boxes - 10-11-2021 //printf("extract the bounding boxes and send them to ROS"); //printf("%f %f %f %f\n", xmin, xmax, ymin, ymax); std_msgs::Float32MultiArray coords; coords.data.resize(4); coords.data.push_back(xmin); coords.data.push_back(xmax); coords.data.push_back(ymin); coords.data.push_back(ymax); detectedboundingboxPublisher.publish(coords);

...

As the these variables are less than 1, I inferred that these are coordinates of the boxes in terms of ratio of the image frame.

When printing out, I realised that xmin, xmax, ymin and ymax can be negative.

Does anyone know if I have identified the right variables or my understanding of them is incorrect?

Thanks in advance!