zju3dv / pvnet

Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral
Apache License 2.0
812 stars, 144 forks

There's some confusion about using 'read_data' to load pose. #88

Closed. GraceJary closed this issue 5 years ago

GraceJary commented 5 years ago

When I run demo.py, I am confused about why points_2d is computed from the pose file that is loaded, rather than from the pose the network predicts.


def read_data():
    import torchvision.transforms as transforms

    demo_dir_path = os.path.join(cfg.DATA_DIR, 'demo_box')
    rgb = Image.open(os.path.join(demo_dir_path, '33.jpg'))
    mask = np.array(Image.open(os.path.join(demo_dir_path, '33_depth.png')))
    mask[mask != 0] = 1
    points_3d = np.loadtxt(os.path.join(demo_dir_path, 'greenbox_3d.txt'))
    bb8_3d = np.loadtxt(os.path.join(demo_dir_path, 'greenbox_bb8_3d.txt'))
    # the pose is loaded incorrectly here
    pose = np.load(os.path.join(demo_dir_path, 'cat_pose.npy'))

    projector = Projector()
    points_2d = projector.project(points_3d, pose, 'linemod')
    vertex = compute_vertex(mask, points_2d)

    transformer = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])

    rgb = transformer(rgb)
    vertex = torch.tensor(vertex, dtype=torch.float32).permute(2, 0, 1)
    mask = torch.tensor(np.ascontiguousarray(mask), dtype=torch.int64)
    vertex_weight = mask.unsqueeze(0).float()
    pose = torch.tensor(pose.astype(np.float32))
    points_2d = torch.tensor(points_2d.astype(np.float32))
    # the last four values are all read in from files
    data = (rgb, mask, vertex, vertex_weight, pose, points_2d)

    return data, points_3d, bb8_3d
pengsida commented 5 years ago

They are the ground-truth points_2d.
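To illustrate what "ground-truth points_2d" means here, the following is a minimal sketch of projecting 3D model points into the image with a known pose. It is not the repository's `Projector` class. It assumes the pose is a 3x4 `[R|t]` matrix and uses the commonly quoted LINEMOD camera intrinsics; `project_points`, `K_LINEMOD`, `pose_gt`, and the sample points are hypothetical names and values chosen for illustration.

```python
import numpy as np

# Assumption: commonly quoted LINEMOD intrinsics; verify against your own camera.
K_LINEMOD = np.array([[572.4114, 0.0, 325.2611],
                      [0.0, 573.57043, 242.04899],
                      [0.0, 0.0, 1.0]])


def project_points(points_3d, pose, K):
    """Project Nx3 object-frame points to Nx2 pixel coordinates.

    pose is assumed to be a 3x4 matrix [R|t] mapping object to camera frame.
    """
    R, t = pose[:, :3], pose[:, 3]
    cam = points_3d @ R.T + t        # object frame -> camera frame
    uvw = cam @ K.T                  # camera frame -> homogeneous pixel coords
    return uvw[:, :2] / uvw[:, 2:3]  # perspective divide


# Hypothetical ground-truth pose: identity rotation, object 1 unit in front of the camera.
pose_gt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [1.0]])])
points_3d = np.array([[0.0, 0.0, 0.0],
                      [0.1, 0.0, 0.0]])

points_2d = project_points(points_3d, pose_gt, K_LINEMOD)
print(points_2d)
```

Because the pose comes from an annotation file rather than from the network, the resulting 2D points serve as labels (for example, to build the ground-truth vertex field) and not as predictions.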

GraceJary commented 5 years ago

Thank you, I understand it now. By the way, how can I get a .npy file like the 'cat_pose.npy' in the demo?


pose = np.load(os.path.join(demo_dir_path, 'cat_pose.npy'))
yunxijun commented 5 years ago

@GraceJary You can save the pose out as a .npy file. The program can also be changed so that you only need to input a picture and add points_3d and bb8_3d to get the output. (image attachment)
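As a rough sketch of producing a pose file in the same spirit as 'cat_pose.npy', a pose array can be written with `np.save` and read back with `np.load`. This assumes the pose is stored as a 3x4 `[R|t]` matrix, which is not confirmed in this thread; the file name `my_pose.npy` and the pose values are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical pose: identity rotation and a translation along the camera z-axis.
# Assumption: the demo expects a 3x4 [R|t] matrix; check the shape of cat_pose.npy.
R = np.eye(3)
t = np.array([[0.0], [0.0], [0.8]])
pose = np.hstack([R, t]).astype(np.float32)

# Write the pose to a .npy file (a temporary directory is used for illustration).
out_path = os.path.join(tempfile.mkdtemp(), 'my_pose.npy')
np.save(out_path, pose)

# Read it back the same way read_data() loads the demo pose.
loaded = np.load(out_path)
print(loaded.shape)
```

Running `np.load` on the existing 'cat_pose.npy' and printing its shape and dtype is a quick way to confirm the layout before writing your own file.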

GraceJary commented 5 years ago

When I use a new 3D model to render images, it seems to need /media/srt/dataset/pvnet-rendering/data/LINEMOD/redbox/training_range.txt, because the pose-sampling function requires it. I don't know how to solve this problem.

yunxijun commented 5 years ago

@GraceJary I have not tried my own model, but I have rendered images with LINEMOD.

GraceJary commented 5 years ago

I have also rendered images with the LINEMOD dataset, and I get results like yours. But I don't know how to render a new model when I lack the training_range.txt file.

pengsida commented 5 years ago

You need to read the code and revise it.

GraceJary commented 5 years ago

If I only have the rendered images, I will not be able to generate "training_range.txt". So this confuses me.