ethnhe / FFB6D

[CVPR2021 Oral] FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation.
MIT License

A question about data augmentation #9

Closed Mr2er0 closed 3 years ago

Mr2er0 commented 3 years ago

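Hello, after reading your code I have a question about the data augmentation part that I can't quite work out. Could you take a moment to explain it?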

    def add_real_back(self, rgb, labels, dpt, dpt_msk):
        real_item = self.real_gen()
        with Image.open(os.path.join(self.cls_root, "depth", real_item+'.png')) as di:
            real_dpt = np.array(di)
        with Image.open(os.path.join(self.cls_root, "mask", real_item+'.png')) as li:
            bk_label = np.array(li)
        bk_label = (bk_label < 255).astype(rgb.dtype)  # get the background label of the real image
        if len(bk_label.shape) > 2:
            bk_label = bk_label[:, :, 0]
        with Image.open(os.path.join(self.cls_root, "rgb", real_item+'.png')) as ri:
            back = np.array(ri)[:, :, :3] * bk_label[:, :, None]
        dpt_back = real_dpt.astype(np.float32) * bk_label.astype(np.float32)

        if self.rng.rand() < 0.6:
            msk_back = (labels <= 0).astype(rgb.dtype)
            msk_back = msk_back[:, :, None]
            rgb = rgb * (msk_back == 0).astype(rgb.dtype) + back * msk_back  # the real image's background replaces the rendered background only with probability 0.6

        dpt = dpt * (dpt_msk > 0).astype(dpt.dtype) + \
            dpt_back * (dpt_msk <= 0).astype(dpt.dtype)  # the rendered image's background depth is always replaced
        return rgb, dpt

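In that case, won't the background depth map and the RGB image fail to correspond? And if so, how is the point cloud used for training generated? Reference link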
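To make the concern concrete, here is a minimal toy sketch of the compositing step quoted above, run on a 2x2 frame. The `composite` helper, the `replace_rgb` flag (standing in for the `self.rng.rand() < 0.6` draw), and all pixel values are made up for illustration; it also assumes `dpt_msk` marks pixels with valid rendered depth, which the snippet does not state.

```python
import numpy as np

def composite(rgb, labels, dpt, dpt_msk, back, dpt_back, replace_rgb):
    """Toy restatement of the compositing logic (hypothetical helper).

    replace_rgb stands in for the `self.rng.rand() < 0.6` draw.
    """
    if replace_rgb:
        msk_back = (labels <= 0).astype(rgb.dtype)[:, :, None]
        # Keep rendered colours on the object, paste the real background elsewhere.
        rgb = rgb * (msk_back == 0).astype(rgb.dtype) + back * msk_back
    # Depth is filled from the real background regardless of the draw.
    dpt = dpt * (dpt_msk > 0).astype(dpt.dtype) + \
        dpt_back * (dpt_msk <= 0).astype(dpt.dtype)
    return rgb, dpt

# 2x2 rendered frame with the object at pixel (0, 0) only (made-up values).
labels = np.array([[1, 0], [0, 0]], dtype=np.uint8)
rgb = np.full((2, 2, 3), 10, dtype=np.uint8)
dpt = np.array([[500.0, 0.0], [0.0, 0.0]], dtype=np.float32)
dpt_msk = (dpt > 0).astype(np.uint8)  # assumption: valid rendered depth
back = np.full((2, 2, 3), 200, dtype=np.uint8)
dpt_back = np.full((2, 2), 900.0, dtype=np.float32)

rgb_a, dpt_a = composite(rgb, labels, dpt, dpt_msk, back, dpt_back, True)
rgb_b, dpt_b = composite(rgb, labels, dpt, dpt_msk, back, dpt_back, False)

print(rgb_a[0, 1], dpt_a[0, 1])  # real background colour, real background depth
print(rgb_b[0, 1], dpt_b[0, 1])  # rendered background colour, real background depth
```

In the second case the background pixel keeps its rendered colour but takes the real depth, which is the mismatch being asked about.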

Mr2er0 commented 3 years ago

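Also, does the forward-pass inference time reported in the paper include steps such as converting the depth map to a point cloud, or only the time measured after the data has already been converted into the input format the network requires?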
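For context on what that preprocessing step involves, below is a rough sketch of a standard pinhole back-projection from a depth map to a point cloud, timed separately from any network call. This is not taken from the FFB6D code: `depth_to_cloud`, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the millimetre depth scale are placeholder assumptions, and it says nothing about which convention the paper's timing follows.

```python
import time
import numpy as np

def depth_to_cloud(dpt, fx, fy, cx, cy, scale=1000.0):
    """Back-project an (H, W) depth map to an (H, W, 3) point cloud.

    Hypothetical helper: assumes a pinhole camera and depth stored in
    millimetres (scale=1000.0 converts to metres).
    """
    h, w = dpt.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column / row grids
    z = dpt.astype(np.float32) / scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)

# Synthetic flat depth map at 1 m; intrinsics are placeholder values.
dpt = np.full((480, 640), 1000, dtype=np.uint16)

t0 = time.perf_counter()
cloud = depth_to_cloud(dpt, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
preprocess_s = time.perf_counter() - t0

print(cloud.shape, f"preprocessing took {preprocess_s * 1e3:.2f} ms")
```

Whether a reported runtime covers this step or only the network forward pass changes the number noticeably, which is why the distinction matters.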

ethnhe commented 3 years ago

Mr2er0 commented 3 years ago

OK, thanks~