ch3cook-fdu / Vote2Cap-DETR

[CVPR 2023] Vote2Cap-DETR and [T-PAMI 2024] Vote2Cap-DETR++; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning methods
MIT License

How to visualize the result? #11

Closed iris0329 closed 4 months ago

iris0329 commented 4 months ago

Hi, thanks for sharing this awesome work.

I noticed that you mentioned in another issue:

"You can use the tools in this repo to help"

However, demo.py only outputs a JSON file.

Could you give me some ideas on how to use the provided 3d-pc-box-viz repo to visualize that JSON file?

ch3cook-fdu commented 4 months ago

I hope the following code solves your problem:

import argparse
import json
import os

import numpy as np
from ply_helper import write_bbox

COLOR = [237, 125, 49]  # RGB color used for every box


def visualize_one_scene(filename: str, output_dir: str) -> None:
    """Write one .ply file per predicted box found in a JSON result file."""
    with open(filename, 'r') as f:
        predicts = json.load(f)

    for pred_idx, pred_info in enumerate(predicts):
        # Each file is named "<index>-<caption>.ply"
        write_bbox(
            np.asarray(pred_info['box']),
            COLOR,
            os.path.join(output_dir, f"{pred_idx}-{pred_info['caption']}.ply")
        )

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--json_fn', default=None, type=str, help='a single json file to visualize')
    parser.add_argument('--json_folder', default=None, type=str, help='a folder of json files to visualize')
    # --output_dir is required: os.path.join below fails on None
    parser.add_argument('--output_dir', required=True, type=str, help='output dir for ply files')
    args = parser.parse_args()

    # Collect every json file in the folder, plus the single file if given
    file_list = []
    if args.json_folder is not None:
        file_list = [
            os.path.join(args.json_folder, fn)
            for fn in sorted(os.listdir(args.json_folder))
            if fn.endswith('.json')
        ]
    if args.json_fn is not None:
        file_list.append(args.json_fn)

    for filename in file_list:
        scan_name = os.path.basename(filename).replace('.json', '')
        output_dir = os.path.join(args.output_dir, scan_name)
        os.makedirs(output_dir, exist_ok=True)
        visualize_one_scene(filename, output_dir)
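One caveat with the script above: the caption goes straight into the file name, so a caption containing a path separator or other special characters could produce an invalid path. A minimal sketch of one way to guard against that follows. `safe_ply_name` and its `max_len` parameter are hypothetical names introduced here for illustration; they are not part of the repo.

```python
import re


def safe_ply_name(pred_idx: int, caption: str, max_len: int = 80) -> str:
    """Build a filesystem-friendly .ply name from a box index and its caption.

    Hypothetical helper for illustration only.
    """
    # Replace every run of characters outside [A-Za-z0-9._-] with one underscore.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", caption).strip("_")
    # Truncate long captions so the name stays below common file name limits.
    slug = slug[:max_len]
    # Fall back to the bare index if nothing usable is left of the caption.
    return f"{pred_idx}-{slug}.ply" if slug else f"{pred_idx}.ply"


print(safe_ply_name(3, "a chair / table?"))
```

The result could then be passed to `os.path.join(output_dir, ...)` in place of the f-string used in `visualize_one_scene`.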
iris0329 commented 4 months ago

Thanks a lot, it does output some bbox files

but the boxes do not seem to align well with the ScanNet scene. Have you run into a similar problem?

ch3cook-fdu commented 4 months ago

As can be seen from https://github.com/ch3cook-fdu/Vote2Cap-DETR/blob/master/data/scannet/load_scannet_data.py#L65-L71, the input 3D mesh has gone through an affine transformation.

Here is some example code for this process:

import numpy as np
from ply_helper import write_ply, read_mesh_vertices_rgb_normal

scene_id = 'scene0246_00'
# point_cloud, mesh = read_mesh_vertices_rgb_normal(scene_id + '_vh_clean.ply')
point_cloud, mesh = read_mesh_vertices_rgb_normal(scene_id + '_vh_clean_2.ply')

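# Load the scene's axis alignment matrix from the meta file
axis_align_matrix = None
with open(scene_id + '.txt') as f:
    for line in f:
        if 'axisAlignment' in line:
            # Keep only the 16 numbers after the '=' sign
            axis_align_matrix = [float(x) for x in line.split('=', 1)[1].split()]
if axis_align_matrix is None:
    raise ValueError(f'no axisAlignment entry found in {scene_id}.txt')

axis_align_matrix = np.array(axis_align_matrix).reshape((4, 4))
# Apply the 4x4 transform to the vertex positions in homogeneous coordinates
pts = np.ones((point_cloud.shape[0], 4))
pts[:, 0:3] = point_cloud[:, 0:3]
pts = np.dot(pts, axis_align_matrix.transpose())  # Nx4
# Copy the remaining per-vertex attributes (e.g. color) unchanged
aligned_vertices = np.copy(point_cloud)
aligned_vertices[:, 0:3] = pts[:, 0:3]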

write_ply(aligned_vertices[:, :3], aligned_vertices[:, 3:6] / 256., mesh, scene_id + '_vh_aligned.ply')
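The same matrix can in principle be used the other way round: instead of aligning the mesh, the predicted boxes could be moved back into the raw scan frame with the inverse transform. Below is a minimal sketch of that idea on toy data. It assumes the boxes are stored as corner points in the axis-aligned frame, which is not confirmed in this thread. `parse_axis_alignment` and `apply_transform` are hypothetical helper names, and the matrix is made up for illustration.

```python
import numpy as np


def parse_axis_alignment(meta_text: str) -> np.ndarray:
    """Return the 4x4 matrix from an 'axisAlignment = ...' line (hypothetical helper)."""
    for line in meta_text.splitlines():
        if line.startswith("axisAlignment"):
            # The 16 values after '=' are the matrix entries in row-major order.
            values = [float(x) for x in line.split("=", 1)[1].split()]
            return np.array(values).reshape(4, 4)
    raise ValueError("no axisAlignment entry found")


def apply_transform(points_xyz: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Apply a 4x4 affine transform to an Nx3 array of points."""
    # Append a column of ones to get homogeneous coordinates (Nx4).
    homo = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    return (homo @ matrix.T)[:, :3]


# Toy meta file: a 90 degree rotation about z plus a translation of (1, 2, 3).
meta = "axisAlignment = 0 -1 0 1 1 0 0 2 0 0 1 3 0 0 0 1\nnumDepthFrames = 10"
M = parse_axis_alignment(meta)

raw = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 1.0]])    # points in the raw scan frame
aligned = apply_transform(raw, M)                     # raw frame -> aligned frame
restored = apply_transform(aligned, np.linalg.inv(M)) # aligned frame -> raw frame

print(aligned)
print(np.allclose(restored, raw))
```

Either direction should work for visualization, as long as the boxes and the mesh end up in the same coordinate frame.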
iris0329 commented 4 months ago

I have rectified the issue, thanks to your thorough response.

Best wishes!