baaivision / Uni3D

[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
MIT License

Question about Customized Dataset #11

Open yuanze1024 opened 9 months ago

yuanze1024 commented 9 months ago

Hello, I'm using Uni3D to try to extract embeddings from 3D models from Objaverse. The 3D models are in .glb format, and I don't know how to convert them into the .npy format you use for Objaverse-LVIS. Do you know how to do this, or where I can find it?

BTW, the code snippet below is how I extract the embeddings of 3D models. Am I missing something?

import torch
from tqdm import tqdm

# `create_uni3d` and `device` come from the Uni3D codebase / script setup.
@torch.no_grad()
def encode(dataloader):
    model = create_uni3d()
    model.eval()
    model.to(device)
    embeddings = []
    for pc, _, _, rgb in tqdm(dataloader, desc="Extracting embeddings"):
        pc = pc.to(device=device, non_blocking=True)
        rgb = rgb.to(device=device, non_blocking=True)
        # Uni3D expects xyz and rgb concatenated along the last dimension.
        feature = torch.cat((pc, rgb), dim=-1)
        pc_features = model.encode_pc(feature)
        # L2-normalize so embeddings can be compared by cosine similarity.
        pc_features = pc_features / pc_features.norm(dim=-1, keepdim=True)
        embeddings.append(pc_features)
    return torch.cat(embeddings, dim=0)
yuanze1024 commented 9 months ago

I just noticed that in your paper, Sec. 4.6, you used Objaverse to test retrieval ability. Can you tell me how you did it?

junshengzhou commented 9 months ago

Hi,

Thanks for your interest in Uni3D. You can easily sample point clouds and the corresponding colors from the mesh file (e.g. .glb) with Python tools like Trimesh, or with software such as MeshLab. If you have trouble sampling colors, you can simply set the color to 0.4, which is how we handle point clouds without color.

Your code for extracting embeddings with Uni3D looks correct, but remember to normalize the point cloud as we do in the dataset file. The retrieval process is simple: first collect a set of model embeddings, then find the 3D model whose embedding is most similar to the text/image embedding.
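A minimal sketch of both steps, assuming the common center-and-unit-sphere normalization (check the exact transform in Uni3D's dataset file); `retrieve` is a hypothetical helper that assumes all embeddings are already L2-normalized:

```python
import numpy as np

def pc_normalize(pc):
    # Center the point cloud at the origin and scale it into the
    # unit sphere, a standard normalization for point-cloud models.
    pc = pc - pc.mean(axis=0)
    scale = np.max(np.linalg.norm(pc, axis=1))
    return pc / scale

def retrieve(query_emb, model_embs, top_k=5):
    # With L2-normalized embeddings, the dot product equals cosine
    # similarity; return indices of the top-k most similar models.
    sims = model_embs @ query_emb
    return np.argsort(-sims)[:top_k]
```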

yuanze1024 commented 9 months ago

Thank you for the quick response, but I still have some questions.

  1. Can you share the code you used to build the Objaverse-LVIS dataset? To get the best performance, I think it's important to align the input.
  2. Why don't you use the color channel? I've seen this in 3D object detection, but I still don't understand it. In my opinion, color carries patterns that should help.
  3. If you don't need the color channel, why concatenate xyz with color instead of using xyz alone?
  4. Finally, what does the number 0.4 mean?
yuanze1024 commented 9 months ago

I just found that OpenShape released a code snippet on Hugging Face for converting .glb files. Can I use that?