Hi. To test with your own images, simply read them in as a 4-dimensional PyTorch floating-point tensor of size 1x224x224x3. The raw RGB values in [0, 255] should be divided by a constant factor of 255.0, so that all pixel values fall in the range [0, 1].
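A minimal sketch of that preprocessing, assuming the input file (here called img.jpg, a placeholder name) has already been resized to 224x224:

import numpy as np
import matplotlib.pyplot as plt

img = plt.imread("img.jpg").astype(np.float32) / 255.0   # HxWx3 array, values now in [0, 1]
img = np.expand_dims(img, axis=0)                         # add a batch dimension -> 1x224x224x3
print(img.shape)

(As discussed further down in the thread, PyTorch's conv layers expect the channel dimension first, so this array still needs its dimensions permuted before being fed to the model.)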
I tried doing this, but I got this error: "RuntimeError: Given groups=1, weight of size 32 3 3 3, expected input[1, 244, 244, 3] to have 3 channels, but got 244 channels instead". My image is 244x244 and I'm giving the right format, as you can see here:
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> img = plt.imread("img.jpg")/255.
>>> img.shape
(244, 244, 3)
>>> img = np.expand_dims(img, axis=0)
>>> img.shape
(1, 244, 244, 3)
>>> i = torch.from_numpy(img)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'torch' is not defined
>>> import torch
>>> i = torch.from_numpy(img)
>>> i.shape
torch.Size([1, 244, 244, 3])
So I don't know why this error occurs if I'm giving the exact same format.
The PyTorch conv2d function assumes inputs in 'NCHW' format, meaning that the tensor you feed into the network should have shape [1, 3, 224, 224]. From your code snippet, it looks like you are using 'NHWC' format -- try permuting the tensor dimensions to change to 'NCHW'.
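A minimal sketch of that dimension change, assuming i is the [1, H, W, 3] tensor from the snippet above:

import torch

# i has shape [1, H, W, 3] (NHWC); PyTorch conv layers expect [1, 3, H, W] (NCHW).
i = i.permute(0, 3, 1, 2).contiguous().float()
print(i.shape)  # e.g. torch.Size([1, 3, 224, 224])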
My image is 244x244 and I'm giving the right format
Also, the correct image size is 224 x 224, not 244 x 244
Oh, thanks! It worked. Just one more problem: the results were these:
I placed my input code in the "args.evaluate" if condition and then saved my results in a PLY file. My question is whether there is any post-processing missing for the correct prediction of the depth map that I forgot to do, or whether it just didn't work for this image.
Have you divided the input RGB values by 255.0, as in this line? https://github.com/dwofk/fast-depth/blob/b1266da66ed2beb192e6bffe875158beb7334b76/dataloaders/nyu.py#L56
Not exactly like this. This is my input code:
img = plt.imread("img.jpg")/255. #normalization
img = np.reshape(img, (3, 224, 224))
img = np.expand_dims(img, axis=0)
print(img.shape)
with torch.no_grad():
    pred = model(torch.from_numpy(img).float().cuda())
np.save('pred.npy', pred.cpu())
print(pred)
import sys
sys.exit(0)
img = np.reshape(img, (3, 224, 224))
I believe it should be a permutation of dimensions here, rather than a reshape (which breaks the data ordering). Please try img = np.transpose(img, (2, 0, 1)) and see if it makes a difference.
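A sketch of the corrected preprocessing, assuming img.jpg is a 224x224 RGB image and model is the already-loaded network (both names are placeholders from the snippets above):

import numpy as np
import matplotlib.pyplot as plt
import torch

img = plt.imread("img.jpg") / 255.       # HxWx3 RGB, values in [0, 1]
img = np.transpose(img, (2, 0, 1))       # 3xHxW: moves channels first without scrambling pixels
img = np.expand_dims(img, axis=0)        # 1x3xHxW (NCHW)
with torch.no_grad():
    pred = model(torch.from_numpy(img).float().cuda())

The difference matters because np.reshape keeps the flat memory order and only relabels the axes, so pixels from different rows and channels get mixed together, while np.transpose actually reorders the axes.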
It worked much better! I will try for better results with other images. Thanks for the help!
Thanks for the work @dwofk @fangchangma. I am trying the same thing as @GustavoCamargoRL did.
while True:
    image_cuda = torch.from_numpy(img).float().cuda()
    pred = 0
    print(pred)
    with torch.no_grad():
        pred = model(image_cuda)
    #np.save('pred.npy', pred.cpu())
    print(pred)
The output from the first iteration looks good, but at each iteration the output differs from the previous iterations, even with the same input image (see the picture below). If I kill the thread and rerun the code, the first iteration always gives the same output.
I printed the pred values and found that they do differ from the previous iteration, even with the same input image and the same model.
Is there anything I missed when using the model?
@GustavoCamargoRL Do you have the same issue?
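For reference, a minimal repeatability check in plain PyTorch (model and a fixed NCHW tensor x are assumed to be set up already; this is just a sanity check, not part of the repo's code):

import torch

model.eval()                       # put batch-norm/dropout layers into inference mode
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)
print(torch.allclose(out1, out2))  # a deterministic forward pass should print True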
@mathmax12 Have you done this by using Apache TVM?
@LulaSan It turns out that this was caused by TVM. The latest TVM solved it.
@mathmax12 OK, thank you. Can I ask how you visualize the results? By using their visualize.py code?
You can save the results as in https://github.com/dwofk/fast-depth/blob/master/main.py#L98, or use cv2.imshow() to display them.
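A rough sketch of the cv2 route, assuming pred is the model's 1x1xHxW depth output; the normalization and colormap choices here are just one way to make the depths visible, not the repo's own visualization code:

import cv2
import numpy as np

depth = pred.squeeze().cpu().numpy()                                      # HxW depth map
depth_vis = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)    # scale to [0, 1]
depth_vis = (depth_vis * 255).astype(np.uint8)                            # 8-bit image for display
depth_vis = cv2.applyColorMap(depth_vis, cv2.COLORMAP_JET)                # optional false-colour map
cv2.imshow("depth", depth_vis)
cv2.waitKey(0)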
I'm trying to test with my own inputs, but I'm not quite sure how to do it. I thought it was in the dataloader.py code, but when I tried debugging it, apparently that class is for the NYU dataset, right? If you could explain how to do it properly, it would be very helpful.
Thanks!