luxonis / depthai-experiments

Experimental projects we've done with DepthAI.
MIT License

[gen2-box-measurement] Highly interested in finding box corners in RGB image #397

Open miknai opened 2 years ago

miknai commented 2 years ago

Hello,

Currently, this example only finds the corners of a box and draws red lines along the box edges in the point cloud image. For my project, I am highly interested in finding the box corners in the RGB image domain. I saw that the original developer initially planned to draw the box in the RGB image as well, after figuring out the reverse transformation from the point cloud domain (screen capture attached). Is this project paused for now?

[Screenshot from 2022-09-06 10-53-48: a screen capture from box_measuring_demo.ipynb]

Thank you

moratom commented 2 years ago

Hi! Actually, there has been some further development on this. @klemen1999 worked on it, so he can share more.

klemen1999 commented 2 years ago

Hi, I added reverse transformation with visualization in this pull request. I hope this helps.

miknai commented 1 year ago

Moving my comments to here...

"I figured out my issue. After running `pip3 install -r requirements.txt` with the requirements file in the box measurement directory, the latest version of depthai (depthai==2.17.3.0) gets overwritten by the older one, depthai==2.16.0.0, which is pinned in the file. That particular depthai version combined with the latest FW made it difficult to use the stereo sensors.

After upgrading depthai to the latest version, I was able to avoid the issues I described above. It would be a good idea to update the depthai version in requirements.txt.

Other than that, I don't have any issues. Thank you for your effort implementing the feature so quickly. Much appreciated."

miknai commented 1 year ago

Quick question: the current code works well under the assumption that the camera is held in its normal orientation. In my case, the camera is rotated 90 degrees to cover more vertical area. Currently, if I run the code with the rotated camera, the box estimation does not work (screenshot attached). @klemen1999, could you help me with this modification? It would be great if the code worked with a 90-degree-rotated camera.

[Attached sketch: Drawing-9]

miknai commented 1 year ago

Maybe applying a "rotate" preprocessing step to each frame would fix the issue? Do you think there's a more efficient way to resolve it?
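For reference, a minimal sketch of that per-frame rotation preprocessing. The function name is made up for illustration; the idea is simply to rotate each incoming frame back to the normal orientation before the box-estimation code sees it:

```python
import numpy as np

def normalize_orientation(frame: np.ndarray, rotated_ccw: bool) -> np.ndarray:
    """Rotate a frame 90 degrees clockwise to undo a counter-clockwise
    camera mounting before box estimation runs. np.rot90 with k=-1 is
    the same operation as cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)."""
    if rotated_ccw:
        # copy so the result is contiguous for downstream OpenCV calls
        return np.ascontiguousarray(np.rot90(frame, k=-1))
    return frame

# A 360x640 frame becomes 640x360 after the 90-degree rotation
frame = np.zeros((360, 640, 3), dtype=np.uint8)
print(normalize_orientation(frame, rotated_ccw=True).shape)  # (640, 360, 3)
```

Note that the depth frame would need the same rotation as the color frame, so that the two stay pixel-aligned when the point cloud is built.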

moratom commented 1 year ago

I tried the demo with the rotated camera and it seemed to work fine. Does it produce a visualization like the screenshot above consistently, or does it switch between right and wrong visualizations? Also, which OAK are you using?

miknai commented 1 year ago

@moratom It shows like the screenshot consistently. More specifically, when I held the camera in the normal orientation or rotated 90 degrees clockwise, the code worked fine. But if I rotated it by -90 degrees (counter-clockwise), the detected box boundary was off. A video is attached.

moratom commented 1 year ago

Got it, I probably rotated the camera clockwise... I will test tomorrow and try to find the issue.

moratom commented 1 year ago

Rotating the image shouldn't be much of a performance hit, however, since running the detections already limits the FPS to around 5.

miknai commented 1 year ago

@moratom Currently, the resolution of the color image is very low: it is set to 1/3 of THE_1080_P, which is 640x360. I was able to set it to 1/3 of THE_4_K, but with anything higher than that, the program crashes. Is there a way I can use a higher resolution for this example? I eventually would like to read some characters on the box faces, but the cropped images of the area of interest are very blurred.

klemen1999 commented 1 year ago

The freezing at higher resolutions happens because the point clouds become much bigger and we can't process them anymore. I would suggest setting a higher resolution but then resizing the color and depth frames on the host before the point cloud is created out of them (i.e., before pcl_converter.rgbd_to_projection()). Note: you also have to change the width and height in getCameraIntrinsics() to the shape of the resized images, and use cv2.INTER_NEAREST interpolation when resizing depth.
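To illustrate the suggestion above, here is a sketch of the two host-side pieces: scaling the intrinsic matrix to match the resized frames, and resizing the depth map with nearest-neighbor sampling. The nearest-neighbor resize is written in plain numpy so it runs standalone; in the actual example you would use cv2.resize(depth, (w, h), interpolation=cv2.INTER_NEAREST), and the function names here are placeholders:

```python
import numpy as np

def scale_intrinsics(K: np.ndarray, sx: float, sy: float) -> np.ndarray:
    """Scale a 3x3 camera matrix to match a resized image:
    fx and cx scale with width, fy and cy scale with height."""
    K = K.astype(float).copy()
    K[0, 0] *= sx  # fx
    K[0, 2] *= sx  # cx
    K[1, 1] *= sy  # fy
    K[1, 2] *= sy  # cy
    return K

def resize_depth_nearest(depth: np.ndarray, new_w: int, new_h: int) -> np.ndarray:
    """Nearest-neighbor downscale of a depth map. Nearest sampling keeps
    real measured depth values instead of blending across object edges,
    which would create invalid in-between depths."""
    h, w = depth.shape
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return depth[np.ix_(rows, cols)]

# Halve a 1280x720 depth frame and adjust the matching intrinsics
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])
depth = np.random.randint(0, 5000, (720, 1280)).astype(np.uint16)
small = resize_depth_nearest(depth, 640, 360)
K_small = scale_intrinsics(K, 640 / 1280, 360 / 720)
print(small.shape, K_small[0, 0])  # (360, 640) 400.0
```

The color frame can be resized the same way (or with any interpolation, since blending color values is harmless), as long as both frames end up the same size as the adjusted intrinsics describe.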

miknai commented 1 year ago

@klemen1999 Thanks for the reply. Currently, I use the mono and depth images to generate the point cloud and display the box edges (and corners) in the mono image from the RIGHT mono sensor. My goal is to find the corners in a high-resolution (12MP) color image.

I was able to roughly transfer the found corners from the RIGHT mono image to the 12MP color image using some hardcoded scale factors and offsets, but I would like to use a more elegant way.
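One possibly more elegant alternative to hardcoded scales and offsets: since the pipeline already produces 3D corner points, they can be projected directly into the color camera with a pinhole model, using the color intrinsics and the mono-to-color extrinsics. This is only a sketch under assumptions: the matrices below are made-up placeholder values, and in DepthAI the real ones would come from the device calibration (device.readCalibration()) instead:

```python
import numpy as np

def project_to_color(points_3d: np.ndarray, K_color: np.ndarray,
                     R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Project Nx3 points given in the RIGHT-mono camera frame into
    color-image pixel coordinates with the pinhole model:
    p = K (R X + t), pixel = (p_x / p_z, p_y / p_z)."""
    cam = points_3d @ R.T + t    # transform into the color camera frame
    p = cam @ K_color.T          # apply the color intrinsics
    return p[:, :2] / p[:, 2:3]  # perspective divide

# Placeholder calibration values, illustrative only
K_color = np.array([[3000.0, 0.0, 2016.0],
                    [0.0, 3000.0, 1520.0],
                    [0.0, 0.0, 1.0]])
R = np.eye(3)                    # assumes aligned axes, not true in general
t = np.array([0.0375, 0.0, 0.0]) # ~3.75 cm lateral offset, made up

corner = np.array([[0.1, 0.05, 1.0]])  # a box corner 1 m in front of the camera
print(project_to_color(corner, K_color, R, t))
```

This sidesteps the FOV mismatch between the two sensors, because each sensor's own intrinsics and mounting offset are what the projection uses.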

Questions:

  1. I use a script node to capture the 12MP image... That way, I believe, I can transfer the color image data only when the triggering signal is sent from the host PC to the camera, so I can reduce some of the computation that comes from continually adding image frames to a queue on the camera. Am I understanding correctly?
  2. I found out that the FOVs of the RIGHT mono sensor and the color sensor are slightly different, because they use different sensor products and are mounted in different physical locations. Would it be hard to find the corresponding corners in the 12MP color image from the RIGHT mono image?
  3. Regarding your suggestion of acquiring high-resolution images from the color sensor and resizing them on the host: would this be okay with 12MP images? I believe that even though I don't stream 12MP as a video, I still have the frames in the internal queue inside the VPU (please correct me if I am wrong), and that might make everything very slow. That's why I thought about using the mono image to find the box corners and somehow transferring them into the 12MP color image that the program captures with a trigger signal... Is my understanding correct?

I would much appreciate your feedback.