dusty-nv / ros_deep_learning

Deep learning inference nodes for ROS / ROS2 with support for NVIDIA Jetson and TensorRT
879 stars · 257 forks

Any way to visualize the detection results like in jetson-inference? #28

Open vxgu86 opened 4 years ago

vxgu86 commented 4 years ago

Any way to visualize the detection results like in jetson-inference?

gclarke42 commented 4 years ago

I was looking to do the same and managed to jury-rig the code to do this. I don't want to actually submit the change because I don't feel like going through the full test process to make sure it is sound, so use at your own risk. I also haven't verified that this is the complete set of changes, so if I missed something let me know. And if anyone is interested in cleaning the changes up and putting them into the release as a parameter or something, feel free to.

Make the following code changes. In image_converter.cpp, add this method (and add its declaration to the header):

// ConvertBack
bool imageConverter::ConvertBack( sensor_msgs::Image& output )
{
  // convert from RGBA32f back to BGR8 (the inverse of the input conversion)
  if( CUDA_FAILED(cudaRGBA32ToBGR8((float4*)mOutputGPU, (uchar3*)mInputGPU, mWidth, mHeight)))
  {
    ROS_ERROR("failed to convert %ux%u image with CUDA", mWidth, mHeight);
    return false;
  }

  // copy shared memory to output
  memcpy(output.data.data(), mInputCPU, mWidth * mHeight * sizeof(uchar3)); // note: 3 channels assumes bgr/rgb
  ROS_INFO("post memcpy");
  return true;
}

In node_detectnet.cpp, add #include <image_transport/image_transport.h>

Under the other image publisher pointer, add image_transport::Publisher* image_pub = NULL;

Near the top of img_callback, replace the existing Detect() call with: const int numDetections = net->Detect(cvt->ImageGPU(), cvt->GetWidth(), cvt->GetHeight(), &detections, (detectNet::OVERLAY_BOX | detectNet::OVERLAY_LABEL | detectNet::OVERLAY_CONFIDENCE)); You can configure the overlay via the flags on this line; this version has everything enabled.

At the end of img_callback, replace the existing detection publish with these lines:

    // build the output image from the input, then publish it
    // alongside the detection message
    sensor_msgs::Image output;
    output.data = input->data;
    output.step = input->step;
    output.width = input->width;
    output.header = input->header;
    output.height = input->height;
    output.encoding = input->encoding;
    output.is_bigendian = input->is_bigendian;

    cvt->ConvertBack(output);
    image_pub->publish(output);
    detection_pub->publish(msg);

In main, just under detection_pub = &pub;, insert the following:

  image_transport::ImageTransport it_(private_nh);
  const std::string defaultOutput("/camera/image_det_output");
  image_transport::Publisher pub2 = it_.advertise(defaultOutput, 1);
  image_pub = &pub2;

matthaeusheer commented 4 years ago

@gclarke42 Thanks for the snippets. I think what is missing is the declaration of image_pub in node_detectnet.cpp:
image_transport::Publisher* image_pub = NULL;

When visualizing /camera/image_det_output in RViz with detectNet::OVERLAY_BOX enabled, the conversion crashes. At the beginning I can sometimes see a blue box flicker briefly where the detections should be, then I get

[ERROR] [1584626931.491593779]: back-conversion: failed to convert 640x480 image with CUDA
[ERROR] [1584626931.500285776]: conversion: failed to convert 640x480 image with CUDA
[ERROR] [1584626931.533418376]: conversion: failed to convert 640x480 image with CUDA
...

Any idea how to fix this? Appreciate your work. It does not crash when using OVERLAY_NONE, but then the whole point is kind of gone. As a workaround one could probably draw directly on the back-converted image using OpenCV.

gclarke42 commented 4 years ago

@matthaeusheer I updated the first response with a couple of missing lines: the first is the pointer you mentioned, the second is the line change that enables the bounding box in img_callback. As you correctly guessed, OVERLAY_NONE is rather pointless :P I did notice that with some models, moving the camera around a lot or having too many detections caused a crash; I haven't really looked into why yet. I have had pretty good luck using ssd-mobilenet-v2. In case it helps with debugging: I am using a Raspberry Pi V2 camera at 1280x720 with a max of 30 FPS. The stream comes from GStreamer; I then use cv_bridge to convert the image to BGR and pipe that into the node.

It sounds like you had both of those missing lines covered already; the only difference that jumps out is that I used all three overlay options instead of just the plain box. I'm not sure about the errors; based on the message it seems to be having a problem running the RGBA-to-BGR conversion in CUDA. I'm not well versed in CUDA; when writing that function I basically just reversed everything the initial conversion did, in order to copy the modified image from cvt back into an image pointer. I think RViz should be able to process the BGR image, so you might be able to skip that step and instead just copy the updated image to the stream pointer.

matthaeusheer commented 4 years ago

> I did notice that with some models, moving the camera around a bunch or having too many detections caused a crash

Same here, which is pretty much a no-go for my application (running on a drone) :)

I did a workaround by simply drawing detection boxes using OpenCV in my ros_deep_learning fork, and it works just fine. So far I was running on my laptop (Dell XPS 15, GTX 960M, Ubuntu 18.04, with a 720p USB webcam), but I will test it on the Jetson TX2 as well to figure out if there is some difference. I used the facenet and pednet networks. In my case I published the images from video_stream_opencv, but I think once ros_deep_learning has the images this should not be relevant.

Let's see, @dusty-nv, do you have some hints on this one?

jamesthesken commented 4 years ago

Thank you @gclarke42, those code snippets worked on the Nano. Will report back with some testing or if I come across any other ideas.