Bokai-Ji opened this issue 1 month ago
Hello, thank you for your interest in our work!
First, since we applied normalization during both training and inference, our output bounding boxes are also in the normalized space. You might want to check if the visualization code accounts for this normalization.
Second, the current grounding performance is still at an early stage, as you can observe from the low values in Table 6 of the paper.
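For visualization, you can invert the normalization before drawing the box. Here is a minimal sketch (assuming `pc_norm` subtracts the centroid and divides by the largest point norm; the helper name and variables are illustrative, not code from this repo):

```python
import numpy as np

def denormalize_box(box_norm, centroid, scale):
    # Map predicted box corners from the normalized space back to the
    # original frame by undoing "subtract centroid, divide by scale".
    return np.asarray(box_norm) * scale + centroid
```

Alternatively, visualize the point cloud in the same normalized space the model saw, so the predicted box can be drawn as-is.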
Thank you for your quick response!!
I would like to clarify whether I have performed the normalization correctly. I referred to the code in `ShapeLLM/llava/serve/cli.py` for loading and normalizing point clouds (`process_pts`). The `process_pts()` function first applies `random_sample()` to the point clouds and then uses `pc_norm()` for normalization.
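To make sure I read it correctly, here is my rough paraphrase of that pipeline (simplified and from memory, not the repo code verbatim; the fixed point count is an assumption):

```python
import numpy as np

def random_sample(pc, num_points=10000):
    # Subsample (or oversample with replacement) to a fixed point count.
    idx = np.random.choice(len(pc), num_points, replace=len(pc) < num_points)
    return pc[idx]

def pc_norm(pc):
    # Center xyz at the origin and scale it into the unit sphere;
    # any extra channels (e.g. RGB) are passed through unchanged.
    xyz, feat = pc[:, :3], pc[:, 3:]
    xyz = xyz - xyz.mean(axis=0)
    xyz = xyz / np.max(np.linalg.norm(xyz, axis=1))
    return np.concatenate([xyz, feat], axis=1)
```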
Below is my testing code:
```python
import numpy as np
import open3d as o3d

# Load the object and normalize only the xyz channels; the last three
# channels are kept as RGB colors.
pts = np.load("textured_objects/47254/47254.npy")
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(pc_norm(pts[:, :3]))
pcd.colors = o3d.utility.Vector3dVector(pts[:, 3:])

# line_set holds the predicted bounding box as an o3d.geometry.LineSet.
o3d.visualization.draw_geometries([pcd, line_set])
```
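For completeness, `line_set` is built from the predicted box along these lines (illustrative: I'm assuming the prediction is an axis-aligned box given as two opposite corners `[x1, y1, z1, x2, y2, z2]` in the normalized space):

```python
# Enumerate the 8 corners of the axis-aligned box and connect its 12 edges.
x1, y1, z1, x2, y2, z2 = box
corners = [[x, y, z] for x in (x1, x2) for y in (y1, y2) for z in (z1, z2)]
edges = [[0, 1], [0, 2], [1, 3], [2, 3],   # face at x = x1
         [4, 5], [4, 6], [5, 7], [6, 7],   # face at x = x2
         [0, 4], [1, 5], [2, 6], [3, 7]]   # edges along x
line_set = o3d.geometry.LineSet(
    points=o3d.utility.Vector3dVector(corners),
    lines=o3d.utility.Vector2iVector(edges))
```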
In this code, I simply used `pc_norm()` to normalize the point clouds after loading them from the `.npy` file. Am I doing this correctly?
Thank you for your help!
Thank you for the excellent work! I've run into a problem: I can't get reasonable predictions for Embodied Object Understanding.
Here is an example of the prediction:
[figure: rendered point cloud with the predicted bounding box]

The rendered result is shown in the figure above, where the predicted bounding box is far from the ground-truth position. I tried several objects from the PartNet-Mobility dataset, and none of the predictions are even close to the ground truth. Is this caused by a mismatch between the axes of the point clouds and the predicted bounding boxes? Currently I'm using the preprocessing code in `mm_utils.py`. Appreciate any support!
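If it is an axis mismatch, a quick check like the following should expose it (`corners_pred` and `corners_gt` are hypothetical (8, 3) corner arrays with matching corner ordering):

```python
import itertools
import numpy as np

# Brute-force sanity check: try every axis permutation of the predicted
# corners and see which one best matches the ground-truth corners.
for perm in itertools.permutations(range(3)):
    permuted = corners_pred[:, list(perm)]
    print(perm, np.abs(permuted - corners_gt).mean())
```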