eth-ait / 4d-dress

Official repository for CVPR 2024 highlight paper 4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations.
https://eth-ait.github.io/4d-dress/

Problem with cameras. Offset is None #3

Open Andreus00 opened 3 weeks ago

Andreus00 commented 3 weeks ago

Hi,

First of all, thank you for your work!!!

I am trying to load camera parameters from the cameras.pkl file, and the mesh as a point cloud to use with Gaussian Splatting.

To do so, I wrote this:

```python
import os

import numpy as np
from PIL import Image


def read4dDressInfo(cfg) -> SceneInfo:

    # open info from basic_info.pkl
    basic_info = open_pickle(cfg.source_path, "basic_info.pkl")
    scan_frames = basic_info['scan_frames']
    scan_rot = basic_info['rotation']

    offset = basic_info['offset']

    # load scan mesh
    _, scan_mesh, _, _ = load_scan_mesh(cfg.obj_path, rotation=scan_rot, offset=offset)

    # load cameras from cameras.pkl
    cameras = open_pickle(cfg.source_path, "Capture", "cameras.pkl")

    # For each camera, read image, mask and camera.
    cam_infos = []

    # For each camera, create a CameraInfo object
    for key, camera in cameras.items():

        camera_intrinsics = camera['intrinsics']
        camera_extrinsics = camera['extrinsics']

        # rotation is transposed before being passed to CameraInfo
        R = camera_extrinsics[:3, :3]
        R = np.transpose(R)
        T = camera_extrinsics[:3, 3]

        image_path = os.path.join(cfg.source_path, "Capture", key, "images", "capture-f00011.png")
        image = open_image(image_path)
        mask_path = os.path.join(cfg.source_path, "Capture", key, "masks", f"mask-f{scan_frames[0]}.png")
        mask = open_image(mask_path)

        f_x = camera_intrinsics[0, 0]
        f_y = camera_intrinsics[1, 1]
        width = image.size[0]
        height = image.size[1]

        # field of view from focal length and image size
        fov_x = 2 * np.arctan(width / (2 * f_x))
        fov_y = 2 * np.arctan(height / (2 * f_y))

        # composite the masked subject onto a white background
        alpha_image = Image.new("RGBA", image.size, (255, 255, 255, 255))
        alpha_image.paste(image, (0, 0), mask)
        print(np.asarray(alpha_image).shape)

        cam_info = CameraInfo(
            uid=key,
            R=R,
            T=T,
            FovY=fov_y,
            FovX=fov_x,
            image=alpha_image,
            image_path=image_path,
            image_name=f"side-{key}-capture-f{scan_frames[0]}",
            width=width,
            height=height
        )
        cam_infos.append(cam_info)

    # get nerf normalization
    nerf_normalization = getNerfppNorm(cam_infos)

    # create a point cloud from the mesh. This simply takes vertices and normals
    # from the mesh and packs them into a BasicPointCloud object.
    # The colors of Gaussians are set to normals.
    scales, opacity, pcd = mesh_to_pointcloud(scan_mesh)

    images_dir = os.path.join(cfg.source_path, cfg.subj, cfg.outfit, cfg.seq)
    ply_path = os.path.join(images_dir, "points3d.ply")

    return scales, opacity, SceneInfo(
        point_cloud=pcd,
        train_cameras=cam_infos,
        test_cameras=[],
        nerf_normalization=nerf_normalization,
        ply_path=ply_path
    )
```

However, the point cloud (created from the mesh) and the image are misaligned. I plotted both the image and the render from GS, and these are some examples of the misalignment: [image] [image]

If I remove the rotation from the mesh, both subjects face the same direction, but they remain misaligned: [image] [image]

Am I missing something?

Thank you.

p.s. I noticed that basic_info['offset'] is always None for the sample that I am using (0112 - Inner - Take2). Maybe that's the problem?
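A minimal sketch of guarding against a `None` offset before it reaches the mesh loader. It assumes the offset, when present, is a 3-vector translation, and that zero is an acceptable fallback; neither is confirmed in this thread, and `resolve_offset` is a hypothetical helper, not part of the 4D-DRESS code.

```python
import numpy as np


def resolve_offset(basic_info):
    """Return the scan offset as a (3,) float array.

    Assumption: a missing or None offset means "no translation",
    so it is replaced by a zero vector.
    """
    offset = basic_info.get("offset")
    if offset is None:
        return np.zeros(3, dtype=np.float64)
    return np.asarray(offset, dtype=np.float64).reshape(3)
```

With this, `offset = resolve_offset(basic_info)` always yields an array, which rules out `None` handling inside the loader as a source of the difference.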

Andreus00 commented 3 weeks ago

Update: Setting `mcentral` to False improves the alignment a lot, but all the cameras still seem to be shifted a little bit to the left: [image] [image]
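One thing that may be worth checking for a constant sideways shift (an assumption, not something confirmed in this thread): the snippet above derives the field of view from `f_x`, `f_y` and the image size only, which implicitly places the principal point at the image center. The sketch below measures how far the calibrated principal point is from the center, assuming the intrinsics follow the usual pinhole layout with `c_x` at `K[0, 2]` and `c_y` at `K[1, 2]`. `principal_point_offset` is a hypothetical helper.

```python
import numpy as np


def principal_point_offset(K, width, height):
    """Return (dx, dy): principal point minus image center, in pixels.

    Assumption: K is a 3x3 pinhole intrinsics matrix
    [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]].
    A FoV-only camera model ignores this offset, so a render made with it
    is shifted by (dx, dy) pixels relative to the captured image.
    """
    c_x, c_y = float(K[0, 2]), float(K[1, 2])
    return c_x - width / 2.0, c_y - height / 2.0
```

If `dx` comes out clearly non-zero for every camera, the FoV-only model would explain a consistent horizontal offset, and the principal point would need to be carried into the projection.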

azuxmioy commented 2 weeks ago

Hi all, thanks for the message.

I might need some time to investigate this issue due to the upcoming CVPR conference.

At the same time, if @WenbWa has any ideas, please feel free to comment.

Thanks.

azuxmioy commented 2 weeks ago

btw, what rendering pipeline are you using?

have you tried our demo code here? https://github.com/eth-ait/4d-dress/blob/baf3e8f0857f7b22996512ba82a55c9530f268ce/dataset/extract_garment.py

Andreus00 commented 1 week ago

> btw, what rendering pipeline are you using?

I am using Gaussian Splatting's rendering pipeline

> have you tried our demo code here? https://github.com/eth-ait/4d-dress/blob/baf3e8f0857f7b22996512ba82a55c9530f268ce/dataset/extract_garment.py

I used it to load my cameras during my tests, but it did not work. I removed lines 23 and 25 to make it work with GS's cameras and used width/height instead of p_x/p_y, as those were not working, but the remaining lines are based on those that you linked.

p.s. I bypassed the problem by creating my custom cameras and creating synthesized images from the mesh. However, understanding how to align images to the mesh is something that I will probably need later.
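A renderer-independent way to check image-to-mesh alignment is to project the mesh vertices straight into pixel coordinates and overlay them on the capture or its mask. The sketch below assumes the extrinsics are a world-to-camera `[R | t]` matrix and the intrinsics a standard 3x3 pinhole matrix (OpenCV-style, camera looking down +z); these conventions are assumptions about `cameras.pkl`, and `project_points` is a hypothetical helper.

```python
import numpy as np


def project_points(points_world, K, extrinsics):
    """Project (N, 3) world points to (N, 2) pixel coordinates.

    Assumptions: extrinsics[:3, :4] is world-to-camera [R | t] and
    K is a 3x3 pinhole intrinsics matrix, so x_pix ~ K (R X + t).
    Also returns the camera-space depth of each point.
    """
    R = extrinsics[:3, :3]
    t = extrinsics[:3, 3]
    points_cam = points_world @ R.T + t      # world -> camera
    pixels_h = points_cam @ K.T              # camera -> homogeneous pixels
    uv = pixels_h[:, :2] / pixels_h[:, 2:3]  # perspective divide
    return uv, points_cam[:, 2]
```

Plotting `uv` for points with positive depth on top of the image shows whether the calibration and mesh agree before any Gaussian Splatting camera conversion (transpose, FoV) is involved.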