TRI-ML / vidar


inference is not working #32

Open garg-sparsh opened 1 year ago

garg-sparsh commented 1 year ago

```python
import torch
import numpy as np
import cv2
import torchvision.transforms as transforms
from PIL import Image

from vidar.core.wrapper import Wrapper
from vidar.arch.networks.depth.PackNet import PackNet
from vidar.utils.config import read_config


def resize_image(image, shape, interpolation=Image.ANTIALIAS):
    transform = transforms.Resize(shape, interpolation=interpolation)
    return transform(image)


def to_tensor(image, tensor_type='torch.FloatTensor'):
    transform = transforms.ToTensor()
    return transform(image).type(tensor_type)


cfg = read_config('configs/papers/packnet/inference_packnet.yaml')
net = PackNet(cfg)

rgb = Image.open('/data/vidar/DDAD_images/000081/CAMERA_01/1568648362895305.png')
rgb = resize_image(rgb, (608, 968))
rgb = to_tensor(rgb).unsqueeze(0)
print(rgb)
depth = net(rgb=rgb)['depths']
print(depth)
```

I was able to run demos/run_network/run_network.py without issues, but after applying it to a real image it throws:

```
Traceback (most recent call last):
  File "demos/run_network/run_network.py", line 48, in <module>
    depth = net(rgb=rgb)['depths']
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/vidar/vidar/arch/networks/depth/PackNet.py", line 118, in forward
    x4p = self.pack4(x4)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/vidar/vidar/arch/networks/layers/packnet/packnet.py", line 242, in forward
    x = self.pack(x)
  File "/workspace/vidar/vidar/arch/networks/layers/packnet/packnet.py", line 150, in packing
    x = x.contiguous().view(b, c, out_h, r, out_w, r)
RuntimeError: shape '[1, 256, 38, 2, 60, 2]' is invalid for input of size 2354176
```
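The shapes in that error are consistent with the input width being the problem: the packing layers halve the resolution at each stage, and while 608 survives the halvings (608 / 8 = 76, even), 968 does not (968 / 8 = 121, odd), so the fourth packing step cannot view the tensor as (..., 60, 2). A minimal sanity check along those lines, with `check_packnet_input` and `n_halvings=5` as assumptions (the exact number of halvings may differ per config):

```python
def check_packnet_input(h, w, n_halvings=5):
    """Hypothetical helper: verify both dimensions survive repeated 2x packing.

    Each packing layer views the tensor as (b, c, h/2, 2, w/2, 2), so every
    intermediate resolution must be even. n_halvings=5 is an assumption.
    """
    factor = 2 ** n_halvings
    for name, dim in (('height', h), ('width', w)):
        if dim % factor:
            lo, hi = dim // factor * factor, (dim // factor + 1) * factor
            raise ValueError(f'{name}={dim} is not divisible by {factor}; '
                             f'try {lo} or {hi}')

check_packnet_input(608, 968)  # raises: width=968 is not divisible by 32; try 960 or 992
```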

Diksha-agg commented 1 year ago

I did the following and got a list of 4 tensors.

```python
import torch
import numpy as np
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
from PIL import Image

from vidar.arch.networks.depth.MonoDepthResNet import MonoDepthResNet
from vidar.utils.config import read_config

# Create network
cfg = read_config('demos/run_network/config.yaml')
net = MonoDepthResNet(cfg)

# Load and preprocess the image
image_path = "frame0.jpg"  # Replace with the path to your image file
image = Image.open(image_path).convert("RGB")
transform = transforms.Compose([
    transforms.Resize((256, 256)),  # Resize the image to 256x256 pixels
    transforms.ToTensor(),          # Convert the image to a PyTorch tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # Normalize the image
])
input_tensor = transform(image).unsqueeze(0)  # Add a batch dimension
depth = net(rgb=input_tensor)['depths']
print(depth)
```

Hope this might help!

Diksha-agg commented 1 year ago

You might need to resize your image to the closest power of 2. For example, I had an image of size 375x1024; I resized it to 256x1024 and it worked.
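If you would rather keep the original pixels than resize, zero-padding up to the next valid size is an alternative; a minimal sketch with a hypothetical `pad_to_multiple` helper, assuming multiples of 32 are sufficient (crop the prediction back to the original size afterwards):

```python
import torch.nn.functional as F

def pad_to_multiple(tensor, base=32):
    """Hypothetical helper: zero-pad a (B, C, H, W) tensor so H and W
    become multiples of `base`. base=32 is an assumption."""
    _, _, h, w = tensor.shape
    pad_h = (-h) % base
    pad_w = (-w) % base
    # F.pad takes (left, right, top, bottom) for the last two dimensions
    return F.pad(tensor, (0, pad_w, 0, pad_h))

rgb = pad_to_multiple(rgb)  # e.g. 1x3x375x1024 -> 1x3x384x1024
```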