xheon / panoptic-reconstruction

Official implementation of the NeurIPS 2021 paper "Panoptic 3D Scene Reconstruction from a Single RGB Image"
https://manuel-dahnert.com/research/panoptic-reconstruction

generated training data for other methods #26

Open ZhiyaoZhou opened 11 months ago

ZhiyaoZhou commented 11 months ago

The segmentation data (e.g. segmentation_0007_mapped.npz) is all zero matrices, so I cannot use it to train other models such as Total3DUnderstanding. I would be very grateful if anyone could help solve this problem~👀🙏🙏

xheon commented 11 months ago

Hi, can you post the complete name of the sample?

The segmentation data is very sparse, so it is very likely that it contains a lot of zeros. To verify its content you can check that some elements are non-zero:

import numpy as np

sem_data = np.load(sample_path)["data"]  # sample_path points to e.g. segmentation_0007_mapped.npz
semantic_ids, semantic_counts = np.unique(sem_data, return_counts=True)
print(semantic_ids, semantic_counts)

Additionally, you can visualize the pointcloud:

import torch

# `vis` refers to the repository's visualization helpers; tmp_output_path is a pathlib.Path output directory.
occupied_voxels = torch.from_numpy(sem_data).squeeze().nonzero()
vis.write_pointcloud(occupied_voxels, None, tmp_output_path / "tmp.ply")

Regarding your second point: the data formats of our method and other methods are very different. Our method uses a sparse voxel representation of the scene (256^3) and learns objects and structures (floor, walls, etc.) jointly. Methods like Total3DUnderstanding instead decompose the scene into individual bounding boxes, one per object plus one for the room layout.
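If you still want to derive a box-like representation from the generated grids, a minimal sketch (not part of the repository) is below. It assumes an instance id grid stored in the same .npz format as the segmentation files (a "data" array); the filename instance_0007_mapped.npz is only a placeholder, and the resulting boxes are axis-aligned in voxel coordinates rather than the oriented world-space boxes Total3D expects.

import numpy as np

# Placeholder filename; assumes an instance id grid stored like the segmentation files ("data" key).
instance_path = "instance_0007_mapped.npz"
instance_grid = np.load(instance_path)["data"].squeeze()

# Collect an axis-aligned bounding box (in voxel coordinates) for every non-zero instance id.
boxes = {}
for instance_id in np.unique(instance_grid):
    if instance_id == 0:  # 0 = free space
        continue
    coords = np.argwhere(instance_grid == instance_id)  # (N, 3) occupied voxel indices
    boxes[int(instance_id)] = (coords.min(axis=0), coords.max(axis=0))
    print(instance_id, boxes[int(instance_id)])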

I hope that helps. Let me know if you have further questions.

ZhiyaoZhou commented 11 months ago

Thank you for your reply! I ran the code as you suggested and it printed some non-zero values (screenshot below), so the segmentation data (e.g. segmentation_0007_mapped.npz) is not a zero matrix after all.

(screenshot of the printed semantic ids and counts)

As you mentioned, the training data formats for Total3D and Panoptic are very different: for training, Total3D uses SUNRGBD, which contains bounding box coordinates for every object in the room, the room layout itself, annotations, etc., and that is quite different from the data generated from 3D-Front.

The question I have been struggling with: I want to test the performance of a model whose architecture is similar to Total3D and which needs the SUNRGBD dataset for training, and I am still confused about the performance results of other methods (such as Total3D) listed in the paper. If I want to use the data generated by panoptic-reconstruction to train a Total3D-style model and evaluate its performance in PRQ etc. (as reported in the Panoptic 3D paper), how should I do that?

I visualized the output generated from the default data; the input picture rgb_0007.png and output/sample_0007/points_surface_instances.ply are shown below:

input picture rgb_0007.png: (image)
output/sample_0007/points_surface_instances.ply: (image)

Thanks for the amazing work on Panoptic 3D, the point cloud reconstructs the picture perfectly. Should I use the point cloud coordinates of each object and stuff class to construct a dataset like SUNRGBD and train the model on it? Any advice would be much appreciated~👀🙏

xheon commented 10 months ago

Hi, sorry for the delay.

To get data in a Total3D-like format, i.e. per-object bounding boxes, it is easier to parse the per-object information directly from the original 3D-Front .json file and transform the object bounding boxes into the per-frame camera space of the rendered Panoptic-Reconstruction views.

A rough outline of the steps (see the sketch below):

1. Parse the per-object information (placements and bounding boxes) from the original 3D-Front scene .json file.
2. Transform each object bounding box from world space into the camera space of the rendered frame using that frame's camera pose.
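A minimal sketch of these steps, assuming the usual 3D-Front scene layout ("furniture" entries that are placed by the per-room "children" via "pos", "rot" and "scale") and a 4x4 world-to-camera matrix for the frame; the pose file name and format are placeholders, and the exact json keys may differ between 3D-Front releases:

import json
import numpy as np

scene_path = "30054695-4a17-4698-a482-06047262a526.json"  # original 3D-Front scene file
world_to_camera = np.load("campose_0007.npy")             # placeholder: 4x4 pose of this frame

with open(scene_path) as f:
    scene = json.load(f)

# Furniture entries describe the object models; the per-room "children" place instances in the scene.
furniture_ids = {entry["uid"] for entry in scene["furniture"]}

for room in scene["scene"]["room"]:
    for child in room["children"]:
        if child["ref"] not in furniture_ids:
            continue  # skip structural meshes (walls, floor, ceiling)
        position = np.asarray(child["pos"])  # world-space translation
        rotation = np.asarray(child["rot"])  # quaternion
        scale = np.asarray(child["scale"])   # per-axis scale

        # Bring the object center into the camera space of this frame;
        # the box corners/orientation would be transformed the same way.
        center_camera = world_to_camera @ np.append(position, 1.0)
        print(child["ref"], center_camera[:3])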

You can have a look here at this pickle: sample data

To visualize it you can extract the per-object points:

import pickle
# import pickle5 as pickle  # fallback for older Python versions

import trimesh

sample_path = "30054695-4a17-4698-a482-06047262a526_0007.pkl"

with open(sample_path, "rb") as f:
    data = pickle.load(f)

# Export the point samples of every ground-truth box as a separate .ply file
# (assuming "gt_points" is a list of per-object point arrays).
for idx, points in enumerate(data["boxes"]["gt_points"]):
    trimesh.PointCloud(points).export(f"object_{idx:02d}.ply")