umutyazgan opened 7 months ago
Hi Umut,
has this been resolved?
I played around with your data and there seem to be a couple of issues.
(1) The pose matrices [R|t] do not appear to be in the OpenCV camera coordinate system (z-axis pointing away from the camera, y-axis pointing down, and x-axis pointing right). From what I've seen, one must convert to the proper coordinate system by rotating 180° around the x-axis, i.e., setting R = S*R and t = S*t with S = diag([1, -1, -1]).
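In numpy, the conversion might look like this (a minimal sketch, assuming `R` and `t` come from a world-to-camera pose [R|t]):

```python
import numpy as np

# 180° rotation around the x-axis: flips y and z, mapping an
# OpenGL-style camera (y up, z towards the camera) to the OpenCV
# convention (y down, z away from the camera).
S = np.diag([1.0, -1.0, -1.0])

def to_opencv(R, t):
    """Convert a world-to-camera pose [R|t] to the OpenCV convention."""
    return S @ R, S @ t
```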
After the transformation, this is one possible bounding box that generates a visual hull mesh:
```python
aabb = AABB(torch.tensor([[0.25, 0.15, 0.05], [0., -0.1, -0.3]]).numpy())
```
(2) I think the calibration might be flawed. If you inspect the images, your camera path is roughly a half circle around the object (in the first image the handle points towards the camera; in the last image it is roughly on the other side, pointing away from the camera). However, if you plot the camera trajectory, the poses do not reflect this half-circle motion (see the image; the blue lines show the viewing direction of each camera).
The yellow dots are the AABB corners and the grey mesh is the visual hull result, which is pretty bad.
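In case it helps, a plot like this can be produced roughly as follows (a minimal sketch, assuming `poses` is a list of 4x4 camera-to-world matrices in OpenCV convention; not necessarily how the attached image was made):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_cameras(poses, scale=0.1):
    """Scatter camera centers and draw their viewing directions."""
    ax = plt.figure().add_subplot(projection="3d")
    for pose in poses:
        center = pose[:3, 3]   # camera center in world space
        forward = pose[:3, 2]  # third rotation column = +z viewing direction (OpenCV)
        tip = center + scale * forward
        ax.scatter(*center, color="red")
        ax.plot(*zip(center, tip), color="blue")
    plt.show()
```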
Hi. Thanks for the answer. I could not figure out what was wrong with our camera pose extraction method, so I've switched to a different approach: I put a textured 3D model in a Blender scene and use a script to take "photos" of the object while recording the camera poses. Now I no longer get the "Surface level must be within volume data range." error, but my results are really bad.
The images and matrices, along with the Blender scene and script I'm using, are here if you would like to have a look: https://drive.google.com/drive/folders/14aKW63dTdO4Gj20FzA5QzMn_A5XIwYXV?usp=sharing
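For reference, the pose-recording part of such a script boils down to something like the sketch below (illustrative only; the actual script is in the linked folder):

```python
import bpy
import numpy as np

scene = bpy.context.scene
cam = scene.camera

# Render the current view and save the camera-to-world matrix.
# Blender cameras look down -z with y up (OpenGL convention), so these
# poses may need the axis flip discussed above before being used in an
# OpenCV-style pipeline.
scene.render.filepath = "//render_0000.png"
bpy.ops.render.render(write_still=True)
np.save("pose_0000.npy", np.array(cam.matrix_world))
```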
I tried changing the resolution of the images to 1024×1024 and reduced the cameras' distance to the object to 2 units. This triggered the "Surface level must be within volume data range." error again.
By the way, how are you visualizing the cameras as in the image you have attached? That could help me figure out what's wrong with my camera setup.
Hi. I'm trying to reconstruct an object captured via Record3D. I extracted the camera matrices from the .r3d file, added alpha channels to the captured images to act as masks (using this background removal tool), then resized the images to 384×512 and rescaled the K matrices accordingly. I've attached the resulting dataset below.
cup_384_512.zip
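(For reference, "rescaled the K matrices accordingly" means scaling the focal lengths and principal point by the per-axis resize ratios; a minimal sketch, assuming a standard 3x3 intrinsics matrix:)

```python
import numpy as np

def rescale_intrinsics(K, old_wh, new_wh):
    """Scale a 3x3 intrinsics matrix K when resizing images."""
    sx = new_wh[0] / old_wh[0]  # width ratio
    sy = new_wh[1] / old_wh[1]  # height ratio
    K = K.copy()
    K[0, 0] *= sx  # fx
    K[0, 2] *= sx  # cx
    K[1, 1] *= sy  # fy
    K[1, 2] *= sy  # cy
    return K
```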
I have tried various bounding box sizes. I tried calculating one from the camera positions; one issue is that all cameras are positioned in front of the object, so this method may not work. Then, just to test it out, I tried entering increasingly large values for the bounding box, from `[[-0.5, -0.5, -0.5], [0.5, 0.5, 0.5]]` to `[[-1000, -1000, -1000], [1000, 1000, 1000]]`, but nothing seems to work; I keep getting the `ValueError: Surface level must be within volume data range.` error. Any idea what I'm doing wrong? I can add more details about how I generated the matrices if necessary.
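(For illustration, the "from the camera positions" attempt was along the lines of the sketch below: take the bounding box of the camera centers and shrink it towards its center. This is not my exact code, but it shows why a one-sided camera arrangement breaks the idea.)

```python
import numpy as np

def aabb_from_cameras(poses, shrink=0.5):
    """Rough object AABB: shrink the bounding box of the camera centers.
    Only sensible if the cameras surround the object, which they don't
    here (all cameras sit in front of it)."""
    centers = np.stack([pose[:3, 3] for pose in poses])
    lo, hi = centers.min(axis=0), centers.max(axis=0)
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    return np.stack([mid - shrink * half, mid + shrink * half])
```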