HankerSia commented 6 years ago

Sorry to disturb you again. When I tested the single_image_test node, I found that its recognition results differ a lot from the darknet yolo test example, even though both use the same cfg file and weights file. The darknet yolo results seem much better. The details are as follows.

single_image_test:

    rosrun darknet_ros single_image_test
    init done
    image:/home/robot/Pictures/car_assemble/00090.jpg
    image width:1920,image height:1080
    Loading network...
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
       13 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
       14 conv     55  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x  55
       15 detection
    mask_scale: Using default '1.000000'
    Loading weights from /home/robot/catkin_ws/src/darknet_ros/weights/tiny-yolo-voc_final.weights...Done!
    FPS:0.0
    FPS: 6.62461e-10
    # Objects: 5
    0.288215 0.627939 0.0235804 0.05999
    hand 0.0721917
    0.32779 0.676095 0.0217603 0.059309
    screw_short 0.10264
    0.489685 0.682697 0.0360985 0.095257
    hand 0.0526488
    0.489685 0.682697 0.0360985 0.095257
    screw_driver 0.0519082
    0.345631 0.724738 0.0234073 0.0420611
    screw_short 0.062739

darknet yolo:

    ./darknet detector test cfg/voc.data cfg/tiny-yolo-voc.cfg results/tiny-yolo-voc_final.weights data/car_assemble/00090.jpg
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
       13 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
       14 conv     55  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x  55
       15 detection
    mask_scale: Using default '1.000000'
    Loading weights from results/tiny-yolo-voc_final.weights...Done!
    data/car_assemble/00090.jpg: Predicted in 0.002944 seconds.
    hand: 69%
    shell_1: 75%
    hand: 72%
    screw_short: 40%
    screw_long: 74%
    base: 89%

When I display the results with OpenCV's imshow function, the bounding boxes from darknet yolo are also much more precise than those from the single_image_test node. Can you give me some advice about this problem? Thank you very much!

pgigioli commented 6 years ago

Can you test single_image_test and darknet detector on the default data/dog.jpg image with the tiny-yolo-voc.weights? Let's compare the results of both of those to what I get.

HankerSia commented 6 years ago

I tested it again as you suggested. The results are as follows.

single_image_test:

    rosrun darknet_ros single_image_test /home/robot/catkin_ws/src/darknet_ros/data/dog.jpg
    init done
    image:/home/robot/catkin_ws/src/darknet_ros/data/dog.jpg
    image width:768,image height:576
    Loading network...
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
       13 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
       14 conv    125  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 125
       15 detection
    mask_scale: Using default '1.000000'
    Loading weights from /home/robot/catkin_ws/src/darknet_ros/weights/tiny-yolo-voc.weights...Done!
    FPS:0.0
    FPS: 6.62432e-10
    # Objects: 4
    0.649594 0.202339 0.156341 0.146046
    car 0.409677
    0.738646 0.225125 0.296852 0.199614
    car 0.436707
    0.284169 0.680926 0.317719 0.555516
    dog 0.753152
    0.430711 0.584029 0.606548 0.598445
    bicycle 0.47687

darknet yolo:

    ./darknet detector test cfg/voc_backup.data cfg/tiny-yolo-voc_backup.cfg tiny-yolo-voc.weights data/dog.jpg
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
       13 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
       14 conv    125  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 125
       15 detection
    mask_scale: Using default '1.000000'
    Loading weights from tiny-yolo-voc.weights...Done!
    data/dog.jpg: Predicted in 0.005586 seconds.
    car: 34%
    car: 55%
    dog: 78%
    bicycle: 35%

The two results seem similar.

HankerSia commented 6 years ago

But I have tested this many times: when I use my own cfg and weights files, the results are really different.

HankerSia commented 6 years ago

The cfg file used by the darknet program is as follows:

[net] batch=64 subdivisions=8 width=416 height=416 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation = 1.5 exposure = 1.5 hue=.1

learning_rate=0.001 max_batches = 80200 policy=steps steps=-1,100,20000,40000 scales=.1,10,.1,.1

[convolutional] batch_normalize=1 filters=16 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=1

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

###########

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=55

# filters = num * (classes + coords + 1) = 5 * (6 + 4 + 1) = 55

activation=linear

[region] anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52 bias_match=1 classes=6 coords=4 num=5 softmax=1 jitter=.2 rescore=1

object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1

absolute=1 thresh = .6 random=1
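The filter count in the comment above can be checked with a few lines of code. This is only a sketch of that arithmetic; the function name `region_filters` is made up for illustration and is not part of darknet or darknet_ros.

```cpp
#include <stdexcept>

// Number of filters the last convolutional layer needs in front of a
// [region] layer: every one of the `num` anchors predicts `coords` box
// values, one objectness score, and `classes` class scores.
// (Hypothetical helper, not a darknet function.)
int region_filters(int num, int classes, int coords) {
    if (num <= 0 || classes <= 0 || coords <= 0) {
        throw std::invalid_argument("num, classes and coords must be positive");
    }
    return num * (classes + coords + 1);
}
```

With the values from this cfg (num=5, classes=6, coords=4) it gives 55, and with the 20 VOC classes it gives 125, which matches the last conv layer in the two layer listings earlier in the thread.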

The cfg file used by single_image_test is as follows:

[net] batch=64 subdivisions=8 width=416 height=416 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation = 1.5 exposure = 1.5 hue=.1

learning_rate=0.001 max_batches = 80200 policy=steps steps=-1,100,20000,40000 scales=.1,10,.1,.1

[convolutional] batch_normalize=1 filters=16 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=1

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

###########

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=55

# filters = num * (classes + coords + 1) = 5 * (6 + 4 + 1) = 55

activation=linear

[region] anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52 bias_match=1 classes=6 coords=4 num=5 softmax=1 jitter=.2 rescore=1

object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1

absolute=1 thresh = .6 random=1

HankerSia commented 6 years ago

Is the problem caused by different image preprocessing? My image is a 1920x1080 RGB color image. Or do the two programs actually use different test functions in my situation?

pgigioli commented 6 years ago

Hmm, it's strange that the results are different. And with dog.jpg, neither of those results looks correct. I get: # Objects: 1, 0.356723 0.5 0.642134 1, dog 0.914943

I would try either running it with the Docker image, just to double check, or redownloading the yolo weights and cfg files.

HankerSia commented 6 years ago

I will try the solution you suggested later. OK, I have modified the source code, and the resulting prediction image shows that the predicted boxes are not precise, like this:

[image: 1103_14_41_23]

P.S. I removed the part of the image that shows my partner.

The source code I modified is as follows:

    #include "yolo_ros.h"
    #include <ros/ros.h>
    #include <image_transport/image_transport.h>
    #include <cv_bridge/cv_bridge.h>
    #include <sensor_msgs/image_encodings.h>
    #include <sensor_msgs/Image.h>
    #include <geometry_msgs/Point.h>
    #include <vector>
    #include <iostream>
    // The names of three headers were lost when this code was posted. The
    // standard headers below are the ones this file needs (std::string, floor,
    // printf/snprintf, time/localtime); they are not necessarily the originals.
    #include <string>
    #include <cmath>
    #include <cstdio>
    #include <ctime>
    #include <std_msgs/Int8.h>
    #include <darknet_ros/bbox_array.h>
    #include <darknet_ros/bbox.h>
    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>

    using namespace cv;

    extern "C" {
    #include "box.h"
    }

    // initialize YOLO functions that are called in this script
    PredBox *run_yolo();
    void load_net(char *cfgfile, char *weightfile, float thresh, float hier); // run_yolo.cpp, row:205
    int get_obj_count();

    // define demo_yolo inputs
    char cfg[] = "/home/robot/catkin_ws/src/darknet_ros/cfg/tiny-yolo-voc.cfg";
    char weights[] = "/home/robot/catkin_ws/src/darknet_ros/weights/tiny-yolo-voc_final.weights";
    float thresh = 0.05;

    const std::string class_labels[] = { "hand", "screw_driver", "screw_short", "screw_long", "base", "shell_1" };
    /*
    const std::string class_labels[] = { "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car",
                                         "cat", "chair", "cow", "dining table", "dog", "horse",
                                         "motorbike", "person", "potted plant", "sheep", "sofa",
                                         "train", "tv monitor" };
    */
    const int num_classes = sizeof(class_labels)/sizeof(class_labels[0]);

    cv::Mat input_frame;

    // define parameters
    int FRAME_W;
    int FRAME_H;
    int FRAME_AREA;
    int FRAME_COUNT = 0;

    //std::vector< std::vector<PredBox> > _class_bboxes;
    std::vector<PredBox> _class_bboxes[num_classes];
    int _class_obj_count[num_classes];
    cv::Scalar _bbox_colors[num_classes];
    darknet_ros::bbox_array _bbox_results_msg;
    PredBox* _boxes;

    const char *OPENCV_WINDOW = "Predictions";
    char order[100];
    const char *dir = "/home/robot/catkin_ws/src/darknet_ros/";
    char img_path[1000];

    // define a function that will replace CvVideoCapture.
    // This function is called in yolo_kernels and allows YOLO to receive the ROS image
    // message as an IplImage
    IplImage* get_Ipl_image()
    {
       IplImage* ROS_img = new IplImage(input_frame);
       return ROS_img;
    }

    void drawBBoxes(cv::Mat &input_frame, std::vector<PredBox> &class_boxes, int &class_obj_count,
                    cv::Scalar &bbox_color, const std::string &class_label)
    {
       //darknet_ros::bbox bbox_result;

       for (int i = 0; i < class_obj_count; i++)
       {
          // boxes are normalized (centre x, y and size w, h in [0, 1]); scale to pixels
          int xmin = (class_boxes[i].x - class_boxes[i].w/2)*FRAME_W;
          int ymin = (class_boxes[i].y - class_boxes[i].h/2)*FRAME_H;
          int xmax = (class_boxes[i].x + class_boxes[i].w/2)*FRAME_W;
          int ymax = (class_boxes[i].y + class_boxes[i].h/2)*FRAME_H;

          // draw bounding box and class label for this object
          cv::Point topLeftCorner = cv::Point(xmin, ymin);
          cv::Point botRightCorner = cv::Point(xmax, ymax);
          cv::rectangle(input_frame, topLeftCorner, botRightCorner, bbox_color, 2);
          cv::putText(input_frame, class_label, cv::Point(xmin, ymax+15), cv::FONT_HERSHEY_PLAIN,
              1.0, bbox_color, 2.0);
       }
    }

    void get_detections(cv::Mat &full_frame)
    {
       input_frame = full_frame.clone();

       // run yolo and get bounding boxes for objects
       _boxes = run_yolo();

       // get the number of bounding boxes found
       int num = get_obj_count();
       int incr = floor(255/num_classes);
       for (int i = 0; i < num_classes; i++)
       {
          _bbox_colors[i] = cv::Scalar(255 - incr*i, 0 + incr*i, 255 - incr*i);
       }

       // if at least one bbox found, draw box
       if (num > 0 && num <= 100)
       {
          std::cout << "# Objects: " << num << std::endl;

          // split bounding boxes by class
          for (int i = 0; i < num; i++)
          {
             for (int j = 0; j < num_classes; j++)
             {
                if (_boxes[i].Class == j)
                {
                   std::cout << _boxes[i].x << " " << _boxes[i].y << " " << _boxes[i].w << " " << _boxes[i].h << std::endl;
                   std::cout << class_labels[_boxes[i].Class] << " " << _boxes[i].prob << std::endl;
                   _class_bboxes[j].push_back(_boxes[i]);
                   _class_obj_count[j]++;
                }
             }
          }
          for (int i = 0; i < num_classes; i++)
          {
             if (_class_obj_count[i] > 0) drawBBoxes(input_frame, _class_bboxes[i],
                                 _class_obj_count[i], _bbox_colors[i], class_labels[i]);
          }
       }

       for (int i = 0; i < num_classes; i++)
       {
          _class_bboxes[i].clear();
          _class_obj_count[i] = 0;
       }

       // save the annotated frame under a timestamped name (MMDD_h_m_s.jpg)
       time_t tt = time(NULL);
       tm* t = localtime(&tt);
       snprintf(order, sizeof(order), "%02d%02d_%d_%d_%d.jpg",
                t->tm_mon+1, t->tm_mday, t->tm_hour, t->tm_min, t->tm_sec);
       // build the path in one bounded call; the earlier strcat onto an
       // uninitialized local buffer was undefined behavior
       snprintf(img_path, sizeof(img_path), "%s%s", dir, order);
       imwrite(img_path, input_frame);

       namedWindow(OPENCV_WINDOW, 0);
       imshow(OPENCV_WINDOW, input_frame);

       waitKey(3);
    }

    //extern "C" int detector_init(char datacfg, char cfgfile, char weightfile, network net, char names, image alphabet);
    //extern "C" ROS_box* test_detector_ros(network net, image im, char names, image alphabet, float thresh, float hier_thresh);

    int main(int argc, char** argv)
    {
       ros::init(argc, argv, "ROS_interface");

       cv::Mat image;
       if (1 == argc)
       {
          const char *path = "/home/robot/Pictures/car_assemble/00094.jpg";
          //const char *path = "/home/robot/catkin_ws/src/darknet_ros/data/dog.jpg";
          image = imread(path, CV_LOAD_IMAGE_COLOR);
          namedWindow("Test Picture", 0);
          imshow("Test Picture", image);
          printf("image:%s\n", path);
       }
       else
       {
          image = imread(argv[1], CV_LOAD_IMAGE_COLOR);
          namedWindow("Load Picture", 0);
          imshow("Load Picture", image);
          printf("image:%s\n", argv[1]);
       }

       FRAME_W = image.size().width;
       FRAME_H = image.size().height;
       printf("image width:%d,image height:%d\n", FRAME_W, FRAME_H);
       load_net(cfg, weights, thresh, 0.5);
       get_detections(image);

       waitKey(0);

       return 0;
    }

I don't know whether it is a parameter error or something else. Thank you very much.
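The box drawing in the code above relies on converting normalized centre/size boxes to pixel corners. A stand-alone sketch of that conversion, with clamping to the image bounds added, might look like the following. `PixelRect` and `to_pixels` are hypothetical names for this example and are not part of darknet_ros.

```cpp
#include <algorithm>

// Pixel corners of a bounding box. (Hypothetical type for this sketch.)
struct PixelRect {
    int xmin, ymin, xmax, ymax;
};

// Convert a normalized box (centre x, y and size w, h, all in [0, 1])
// to pixel corners on a frame_w x frame_h image, clamped to the image.
PixelRect to_pixels(float x, float y, float w, float h, int frame_w, int frame_h) {
    PixelRect r;
    r.xmin = static_cast<int>((x - w / 2) * frame_w);
    r.ymin = static_cast<int>((y - h / 2) * frame_h);
    r.xmax = static_cast<int>((x + w / 2) * frame_w);
    r.ymax = static_cast<int>((y + h / 2) * frame_h);
    // Boxes near the border can extend past the image; clamp them.
    r.xmin = std::max(0, std::min(r.xmin, frame_w - 1));
    r.ymin = std::max(0, std::min(r.ymin, frame_h - 1));
    r.xmax = std::max(0, std::min(r.xmax, frame_w - 1));
    r.ymax = std::max(0, std::min(r.ymax, frame_h - 1));
    return r;
}
```

Printing these pixel corners for both programs on the same image is one way to see whether the boxes themselves differ or only the drawing step does.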

pgigioli commented 6 years ago

Does the yolo_ros node work for you at least? I didn't spend too much time on single_image_test so it's possible that I made some changes in yolo_ros that I forgot to update in single_image_test.

HankerSia commented 6 years ago

I have tested the yolo_ros node using the official darknet tiny-yolo-voc cfg and weights. It worked well, but I did not analyze the results in detail, and I did not test it with my own cfg file and weights.

pgigioli commented 6 years ago

I would try the official tiny-yolo-voc weights and cfg on the single_image_test with the dog.jpg (maybe do a clean re-clone of the repo) and see if you get the results that I got. Also try your custom weights and cfg file on the yolo_ros node and see if that works. If none of that works, try everything using the Docker image as that will isolate any extraneous configuration bugs.

Did you train your custom weights using the latest version of darknet? There might be a discrepancy between the version of darknet that you trained your model on and the version of darknet that is used in darknet_ros. My version should be the latest.

HankerSia commented 6 years ago

Yes, I cloned darknet from the official darknet website not long ago. Besides, I once tested yolo_ros using my own cfg and weights files. The targets seemed to be detected, but the terminal refreshed so quickly that I could not read the prediction confidences. By the way, does the Docker image package the environment that the ROS package needs, such as CUDA 8.0, OpenCV, etc.? My package already works very well, so do I need to use it? I am sorry, I have never used Docker before.

Another simple question: how do I adjust the angle value in tiny-yolo-voc.cfg? I know it can augment the training samples by rotating the original images. I used it as follows: ... angel=0,20,45,60,90 ... Is that right? Thank you!

pgigioli commented 6 years ago

Docker is not required, but the image is guaranteed to work. It has CUDA 8.0, cuDNN 6, ROS Kinetic, and Ubuntu 16.04.

I'm not sure about the rotation angle, I haven't trained darknet in a while.

HankerSia commented 6 years ago

I found the usage of angle in the other cfg files: it takes a single value such as angle=7, so perhaps it cannot be given a series of values. I think the reason for the different results may be that the two programs enter the darknet recognition code through different functions. We once developed a similar recognition program, but the main developer has left. Maybe I can send you the package. It did not support CUDA when I tested it, which is one of the reasons I did not use it, but its results were similar to those of the darknet program. Besides, I will try to port darknet to ROS on the basis of your work and my partner's.

I have cloned the project. The repository address is: https://github.com/HankerSia/sia_darknet_ros.git

HankerSia commented 6 years ago

Hello! Did you receive my latest comment? Can you give me some advice? By the way, could you add some essential comments to the main C++ files, such as run_yolo.cpp? I am sorry that I can't read the code very clearly, especially the detect_objects() function. Does network_predict(net, X) really do the recognition work, and does X represent the image to be recognized?

Thank you very much!

pgigioli commented 6 years ago

I would change the angle parameter back to 0 and retest on the dog.jpg. It's possible that darknet_ros can't handle rotations, since it's only meant for inference, whereas darknet can. See if you get the same result that I got above.

I'll try to add some comments to the code soon but I'm pretty busy right now with other things. network_predict() does the recognition and X is a pointer to the image data.
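To make "X is a pointer to the image data" a bit more concrete: as far as I understand, darknet stores an image as planar floats in [0, 1] (all red values, then all green, then all blue), whereas OpenCV stores interleaved 8-bit BGR. The sketch below shows that kind of conversion on plain vectors. It is an illustration under those assumptions, not the code darknet_ros uses, and `bgr_to_planar_rgb` is a made-up name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Convert an interleaved 8-bit BGR buffer (OpenCV-style, no row padding)
// into planar RGB floats in [0, 1]: first the R plane, then G, then B.
// (Hypothetical helper for illustration.)
std::vector<float> bgr_to_planar_rgb(const std::vector<unsigned char>& bgr, int w, int h) {
    const int c = 3;
    if (w <= 0 || h <= 0 ||
        bgr.size() != static_cast<std::size_t>(w) * h * c) {
        throw std::invalid_argument("buffer size does not match w * h * 3");
    }
    std::vector<float> out(bgr.size());
    for (int k = 0; k < c; ++k) {          // output channel: 0 = R, 1 = G, 2 = B
        for (int i = 0; i < h; ++i) {      // row
            for (int j = 0; j < w; ++j) {  // column
                // Source channel 2 - k swaps B and R.
                out[k * w * h + i * w + j] = bgr[(i * w + j) * c + (2 - k)] / 255.0f;
            }
        }
    }
    return out;
}
```

A pointer to the first element of such a buffer (`out.data()` in C++11, or `&out[0]`) is the kind of value that would play the role of X, provided the image has first been resized to the network input size.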

HankerSia commented 6 years ago

Hi! Good afternoon! I have tried porting detector.c into your project. As I expected, after I used the test_detector() function in single_image_test.cpp, the result is almost the same as the original darknet result, as follows:

darknet:

    ./darknet detector test cfg/voc.data cfg/tiny-yolo-voc.cfg results/tiny-yolo-voc_final.weights data/car_assemble/00202.jpg
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
       13 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
       14 conv     55  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x  55
       15 detection
    mask_scale: Using default '1.000000'
    Loading weights from results/tiny-yolo-voc_final.weights...Done!
    data/car_assemble/00202.jpg: Predicted in 0.003443 seconds.
    hand: 61%
    base: 71%
    screw_driver: 43%
    screw_short: 29%
    shell_1: 59%
    screw_short: 67%

darknet_ros single_image_test:

    rosrun darknet_ros single_image_test
    image:/home/robot/Pictures/car_assemble/00202.jpg
    image width:1920,image height:1080
    layer     filters    size              input                output
        0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16
        1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16
        2 conv     32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32
        3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32
        4 conv     64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64
        5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64
        6 conv    128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128
        7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128
        8 conv    256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256
        9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256
       10 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512
       11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512
       12 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
       13 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
       14 conv     55  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x  55
       15 detection
    mask_scale: Using default '1.000000'
    Loading weights from /home/robot/catkin_ws/src/darknet_ros/weights/tiny-yolo-voc_final.weights...Done!
    detector init OK
    im width=1920, height=1080
    Predicted in 0.118605 seconds.
    hand: 61%
    base: 71%
    screw_driver: 43%
    shell_1: 59%
    screw_short: 67%
    class: 0 prob:0.610887 left: 513 right: 624 top: 608 bot: 717
    class: 4 prob:0.705182 left: 748 right: 919 top: 629 bot: 811
    class: 1 prob:0.430879 left: 515 right: 609 top: 752 bot: 850
    class: 5 prob:0.586202 left: 1009 right: 1095 top: 777 bot: 859
    class: 2 prob:0.668672 left: 627 right: 679 top: 888 bot: 936
    Object Detected:5
    Class: 0 Prob:0.610887 x: 513 y: 608 w: 624 h: 717
    Class: 4 Prob:0.705182 x: 748 y: 629 w: 919 h: 811
    Class: 1 Prob:0.430879 x: 515 y: 752 w: 609 h: 850
    Class: 5 Prob:0.586202 x:1009 y: 777 w:1095 h: 859
    Class: 2 Prob:0.668672 x: 627 y: 888 w: 679 h: 936
    init done

I think maybe we should improve it together.
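One detail in the single_image_test output above: the values printed under w and h (for class 0, 624 and 717) are the same as the right and bot values printed just before, so they look like corner coordinates rather than a width and height. If a real width and height are wanted, the conversion is a subtraction. A small sketch, with `Corners`, `Rect` and `corners_to_rect` as made-up names:

```cpp
// Corner form, as printed in the "left/right/top/bot" lines.
struct Corners {
    int left, right, top, bot;
};

// Top-left corner plus size.
struct Rect {
    int x, y, w, h;
};

// Convert corner coordinates to top-left + width/height.
// (Hypothetical helper for illustration.)
Rect corners_to_rect(const Corners& c) {
    Rect r;
    r.x = c.left;
    r.y = c.top;
    r.w = c.right - c.left;
    r.h = c.bot - c.top;
    return r;
}
```

For the class 0 detection (left 513, right 624, top 608, bot 717) this gives a box 111 pixels wide and 109 pixels high.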

pgigioli commented 6 years ago

Looks good to me. The only reason that single_image_test does not report the second screw_short (29% in darknet) is that its default threshold is 0.3.
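The effect of the threshold can be reproduced with the six confidences from the darknet output above. The sketch below only illustrates the filtering step, assuming a detection is kept when its confidence is strictly greater than the threshold; `Detection` and `keep_above` are made-up names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One detection: class label and confidence in [0, 1].
// (Hypothetical type for this sketch.)
struct Detection {
    std::string label;
    float prob;
};

// Keep only the detections whose confidence exceeds `thresh`.
std::vector<Detection> keep_above(const std::vector<Detection>& dets, float thresh) {
    std::vector<Detection> kept;
    for (std::size_t i = 0; i < dets.size(); ++i) {
        if (dets[i].prob > thresh) {
            kept.push_back(dets[i]);
        }
    }
    return kept;
}
```

With the confidences 0.61, 0.71, 0.43, 0.29, 0.59 and 0.67, a threshold of 0.3 keeps five detections and drops the 0.29 screw_short, while a lower threshold such as 0.05 keeps all six.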

pgigioli / darknet_ros

huge different result of recognition between darknet yolo and the single_image_test node for the same specified image #9
