DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
MIT License

ICP on 16bit depth image #28

Closed flugenheimer closed 5 years ago

flugenheimer commented 5 years ago

Is there a way to adapt it to support a 16-bit depth image? If I use one directly it does not work, as shown by the printout from the icp_refinement method in icp.py:

real min max x -18383.785420814358 16734.17997734934
real min max y -18381.14515417148 12457.235152135814
real min max z 74000.0 149000.0
syn min max x -43.666910471988324 7.3976425377620245
syn min max y -15.081647411564028 32.52808661089217
syn min max z 270.7138977050781 311.607055664062

If I just load it as 8-bit, the numbers look closer, I guess:

real min max x -123.38111020680778 122.1472991047397
real min max y -123.36339029645289 122.12975639348836
real min max z 1000.0 1000.0
syn min max x -43.666910471988324 7.3976425377620245
syn min max y -15.081647411564028 32.52808661089217
syn min max z 270.7138977050781 311.6070556640625

However, when I inspect the 8-bit image I see no depth variation across my object because of the lower depth resolution. I'm therefore pretty sure I need the 16-bit depth.
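To illustrate the resolution loss, here is a minimal sketch. It assumes the 8-bit load keeps only the high byte of each 16-bit value; how a particular loader actually down-converts is an assumption, and the sample depths are made up.

```python
import numpy as np

# Made-up 16-bit depth samples (mm) across an object surface.
depth16 = np.array([1000, 1005, 1010, 1015], dtype=np.uint16)

# Assumption: the 8-bit conversion drops the low byte (integer divide by 256).
depth8 = (depth16 >> 8).astype(np.uint8)

# Count the distinct depth levels that survive in each representation.
levels16 = np.unique(depth16).size
levels8 = np.unique(depth8).size
print(levels16, levels8)
```

Under that assumption, depth differences smaller than 256 raw units can collapse into a single 8-bit level, which would match seeing no depth variation across the object.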

I am currently just testing by loading image files with OpenCV, so they load as uint8 and uint16.

MartinSmeyer commented 5 years ago

uint16 should be fine, but your numbers are way off. I loaded the depth files using the sixd_toolkit functions defined in inout.py, i.e.:

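import numpy as np
import scipy.misc  # imread is only available in older SciPy releases

def load_depth2(path):
    d = scipy.misc.imread(path)  # keeps the file's bit depth (e.g. uint16)
    d = d.astype(np.float32)
    return d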

The depth images should be in float32 and mm scale in the end.
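As a rough sketch of that last step, the conversion could look like the following. The helper name `depth_to_mm` and its `depth_scale` argument (millimetres per raw unit) are hypothetical and do not necessarily follow the same convention as the `depth_scale` entry in the cfg file; the sample values are made up.

```python
import numpy as np

def depth_to_mm(raw, depth_scale=1.0):
    """Convert a raw depth image to float32 millimetres.

    depth_scale is a hypothetical factor: millimetres per raw unit.
    """
    return raw.astype(np.float32) * np.float32(depth_scale)

# Made-up uint16 image whose raw unit is assumed to be 0.1 mm.
raw = np.array([[0, 2700], [3000, 3116]], dtype=np.uint16)
depth_mm = depth_to_mm(raw, depth_scale=0.1)
print(depth_mm.dtype, depth_mm.min(), depth_mm.max())
```

If the camera already stores millimetres, the scale is simply 1.0 and only the cast to float32 remains.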

flugenheimer commented 5 years ago

Ah, that solved one issue, but I am running into another: the depth image in the render_many function of meshrenderer_phong.py is all zeros, which means that ys and xs will be empty ([]), and that crashes when the bounding boxes are calculated below:

glNamedFramebufferReadBuffer(self._fbo_depth.id, GL_COLOR_ATTACHMENT1)
depth_flipped = glReadPixels(0, 0, W, H, GL_RED, GL_FLOAT).reshape(H,W)
depth = np.flipud(depth_flipped).copy()
ys, xs = np.nonzero(depth > 0)
obj_bb = misc.calc_2d_bbox(xs, ys, (W,H))
bbs.append(obj_bb)

This gives the following error: ValueError: zero-size array to reduction operation minimum which has no identity
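The error comes from taking a minimum over an empty index array. A minimal sketch of a guard is shown below; `bbox_from_depth` is a hypothetical stand-in rather than the repository's `misc.calc_2d_bbox`, and the `(x, y, w, h)` return format is an assumption.

```python
import numpy as np

def bbox_from_depth(depth):
    """Return (x, y, w, h) of the pixels with depth > 0, or None if there are none."""
    ys, xs = np.nonzero(depth > 0)
    if xs.size == 0:
        # Nothing was rendered: skip instead of calling min() on an empty array.
        return None
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    return int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1)

empty = np.zeros((4, 6), dtype=np.float32)   # all-zero render
hit = empty.copy()
hit[1:3, 2:5] = 300.0                        # made-up object region
print(bbox_from_depth(empty))
print(bbox_from_depth(hit))
```

A guard like this only hides the symptom. An all-zero render still suggests the object ended up outside the view or clipping range, for example because of a unit mismatch in the pose translation.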

flugenheimer commented 5 years ago

Changing the depth_scale in the aae_retina_webcam.cfg file makes everything work. However, I see no visual difference between the results with ICP and without it.
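One quick way to confirm that a scale change took effect is to compare the "real" and "syn" z ranges from the icp.py printout. The sketch below uses a hypothetical helper, `z_ranges_overlap`; the first two real ranges are the ones posted above, and the third is a made-up example of a correctly scaled image.

```python
def z_ranges_overlap(real_z, syn_z):
    """True if the closed intervals (min, max) intersect."""
    return real_z[0] <= syn_z[1] and syn_z[0] <= real_z[1]

syn_z = (270.7138977050781, 311.6070556640625)   # rendered depth range from the printout

scaled_wrong = z_ranges_overlap((74000.0, 149000.0), syn_z)  # 16-bit load, wrong scale
loaded_8bit = z_ranges_overlap((1000.0, 1000.0), syn_z)      # 8-bit load
scaled_ok = z_ranges_overlap((250.0, 330.0), syn_z)          # made-up plausible range
print(scaled_wrong, loaded_8bit, scaled_ok)
```

Overlapping ranges only show that the units agree. They do not show that ICP is improving the pose.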