facebookresearch / ImageBind

ImageBind One Embedding Space to Bind Them All

How to load depth data? #122

Open luigilella98 opened 3 months ago

luigilella98 commented 3 months ago

Thank you for your work and the code. I wanted to ask how to load depth data into the model, since there is no dedicated `load_and_transform_depth_data` method. Should I use `load_and_transform_vision_data` instead?

Kerio99 commented 2 months ago

Hi, I have the same question. Have you found the answer?

luigilella98 commented 1 month ago

@Kerio99 I wrote a function from scratch, but it would be better if the authors provided one.

```python
import numpy as np
import torch
from PIL import Image
from torchvision import transforms


def load_and_transform_depth_data(depth_paths, device):
    if depth_paths is None:
        return None
    device = torch.device(device)

    # Same spatial preprocessing as the vision loader: resize the short
    # side to 224, center-crop to 224x224, convert to a tensor.
    data_transform = transforms.Compose(
        [
            transforms.Resize(
                224, interpolation=transforms.InterpolationMode.BICUBIC
            ),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
        ]
    )

    depth_outputs = []
    for depth_path in depth_paths:
        with open(depth_path, "rb") as fopen:
            image = Image.open(fopen).convert("L")

        # Scale the 8-bit depth map to [0, 1]. Note that ImageBind's depth
        # modality was trained on disparity; if your files store metric
        # depth, you may want to convert (e.g. disparity = 1 / depth) and
        # renormalize before this step.
        image = np.array(image, dtype=np.float32) / 255.0
        disparity = Image.fromarray(image)

        disparity = data_transform(disparity).to(device)
        depth_outputs.append(disparity)

    return torch.stack(depth_outputs, dim=0)
```