TUI-NICR / ESANet

ESANet: Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis

tensor is not a torch image #16

Open 18306125266 opened 3 years ago

18306125266 commented 3 years ago

Hello, I have a new problem. I want to test this model on my own samples. I have the RGB images and the depth images, but I cannot run inference_samples.py normally. It reports 'tensor is not a torch image'. Can you help me? Thank you!


danielS91 commented 3 years ago

The error description is pretty short. Can you please provide some further information, i.e., environment (conda list / pip list), folder structure, executed command, and full error trace?

18306125266 commented 3 years ago


I created the rgbd_segmentation environment and prepared the SUNRGBD dataset.

Then I ran inference_samples.py:

python inference_samples.py --dataset sunrgbd --ckpt_path ./trained_models/sunrgbd/r34_NBt1D.pth --depth_scale 1 --raw_depth

Loaded SUNRGBD dataset without files
Loaded SUNRGBD dataset without files
/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/build_model.py:29: UserWarning: Argument --channels_decoder is ignored when --decoder_chanels_mode decreasing is set.
  warnings.warn('Argument --channels_decoder is ignored when '
/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/models/resnet.py:101: UserWarning: parameters groups, base_width and norm_layer are ignored in NonBottleneck1D
  warnings.warn('parameters groups, base_width and norm_layer are '
/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/models/model.py:163: UserWarning: for the context module the learned upsampling is not possible as the feature maps are not upscaled by the factor 2. We will use nearest neighbor instead.
  warnings.warn('for the context module the learned upsampling is '
Device: cpu
.......
Loaded checkpoint from ./trained_models/sunrgbd/r34_NBt1D.pth
Traceback (most recent call last):
  File "inference_samples.py", line 73, in <module>
    sample = preprocessor({'image': img_rgb, 'depth': img_depth})
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 70, in __call__
    img = t(img)
  File "/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/preprocessing.py", line 195, in __call__
    mean=self._depth_mean, std=self._depth_std)(depth)
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 175, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 209, in normalize
    raise TypeError('tensor is not a torch image.')
TypeError: tensor is not a torch image.

mona0809 commented 3 years ago

Are you able to run inference_samples.py with the provided samples? Are your images read successfully? What are the dtype and the shape of the images just before line 73, where the error is thrown?

18306125266 commented 3 years ago

I can run inference_samples.py with the provided samples. Is it related to bit depth?

I only changed lines 59 and 60 of inference_samples.py.

Then I ran:

python inference_samples.py --dataset sunrgbd --ckpt_path ./trained_models/sunrgbd/r34_NBt1D.pth --depth_scale 1 --raw_depth

The script then reports this error:

Loaded checkpoint from ./trained_models/sunrgbd/r34_NBt1D.pth
Traceback (most recent call last):
  File "inference_samples.py", line 73, in <module>
    sample = preprocessor({'image': img_rgb, 'depth': img_depth})
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 70, in __call__
    img = t(img)
  File "/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/preprocessing.py", line 198, in __call__
    mean=self._depth_mean, std=self._depth_std)(depth)
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 175, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 209, in normalize
    raise TypeError('tensor is not a torch image.')
TypeError: tensor is not a torch image.

A post I found online said that this error is caused by the order of the functions. I tried the fix it suggested, but it did not work.

How can I test my own data? Thank you!

danielS91 commented 3 years ago

If you are able to run inference_samples.py with the samples provided by us, the problem seems to be related to your images. Please check that both images are loaded correctly using a breakpoint at line 70. If loading fails, OpenCV returns None without raising any error.

danielS91 commented 3 years ago

Beyond that, as already mentioned by Mona, we need the dtypes and shapes for both images at this line for further debugging.
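
The check requested above can be sketched roughly as follows. This is a minimal illustration, not code from the repository: `describe_image` is a hypothetical helper, and the zero-filled arrays are stand-ins for whatever `cv2.imread` returns for the two files.

```python
import numpy as np


def describe_image(name, img):
    """Return a one-line summary of a loaded image, or flag a failed load."""
    if img is None:
        # cv2.imread signals a failed load by returning None, not by raising
        return f"{name}: loading failed (got None)"
    return f"{name}: shape={img.shape}, dtype={img.dtype}"


# Stand-ins for the arrays that cv2.imread would return (assumption)
img_rgb = np.zeros((424, 512, 3), dtype=np.uint8)
img_depth = np.zeros((424, 512), dtype=np.float32)

print(describe_image("img_rgb", img_rgb))
print(describe_image("img_depth", img_depth))
print(describe_image("missing", None))
```

Placing the same two calls right before the preprocessor is invoked would show whether a file failed to load and whether the depth image has the expected two dimensions.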

18306125266 commented 3 years ago

Provided samples: image_rgb has shape (424, 512, 3) and dtype uint8; image_depth has shape (424, 512) and dtype float32.
My images: image_rgb has shape (424, 512, 3) and dtype uint8; image_depth has shape (424, 512, 3) and dtype float32.

Is that the problem? Thank you!

These are my images.

danielS91 commented 3 years ago

The problem is related to your depth image. It is not a common depth image with the depth values encoded in a single channel, because yours has three channels. It looks more like another RGB image whose gray values encode the depth. You should check the depth image.
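
One possible way to handle such a file is sketched below. It assumes the three channels are identical copies of one gray image, which the thread does not confirm, and `to_single_channel_depth` is a hypothetical helper name rather than part of ESANet.

```python
import numpy as np


def to_single_channel_depth(depth):
    """Collapse an (H, W, 3) gray-encoded depth image to shape (H, W)."""
    if depth.ndim == 2:
        return depth  # already a single-channel depth image
    if depth.ndim == 3 and depth.shape[2] == 3:
        # Dropping channels is only safe if all three carry the same values
        same = (np.array_equal(depth[..., 0], depth[..., 1])
                and np.array_equal(depth[..., 0], depth[..., 2]))
        if not same:
            raise ValueError("channels differ; not a plain gray encoding")
        return depth[..., 0]
    raise ValueError(f"unexpected depth shape {depth.shape}")


# Stand-in for a depth image that was saved with three identical channels
plane = np.arange(6, dtype=np.float32).reshape(2, 3)
depth_3ch = np.stack([plane, plane, plane], axis=-1)
print(to_single_channel_depth(depth_3ch).shape)
```

This only fixes the shape. Whether the gray values are on the depth scale the model expects is a separate question that depends on how the image was produced.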

18306125266 commented 3 years ago

OK, thank you very much!

18306125266 commented 3 years ago

I got the result. Thanks for your help!

Now I have a new question: how can I output the semantic information corresponding to the different color regions?

mona0809 commented 3 years ago

What do you mean by "different color regions"?

18306125266 commented 3 years ago

For example, the orange area refers to "table". How can I output the label "table"?

mona0809 commented 3 years ago

Before coloring (https://github.com/TUI-NICR/ESANet/blob/main/inference_samples.py#L87), the segmentation contains integers. Each integer refers to one category. For each category there exists a color and a class name as defined here. If you only need the regions for category "table" you can filter the segmentation by the respective integer value.
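
A rough sketch of that filtering step is shown here. The class list is a shortened, hypothetical stand-in for the dataset's real class names, and the small array stands in for the integer segmentation before coloring. Whether the predicted indices need an offset (for example for a void class) is also an assumption to check against the repository.

```python
import numpy as np

# Hypothetical, shortened class list; the real names come from the dataset definition
CLASS_NAMES = ["void", "wall", "floor", "table"]


def classes_present(segmentation, class_names):
    """List the names of all categories that occur in the segmentation."""
    return [class_names[i] for i in np.unique(segmentation)]


def mask_for_class(segmentation, class_names, name):
    """Boolean mask that is True where the segmentation equals the named class."""
    return segmentation == class_names.index(name)


# Stand-in for the integer segmentation before coloring
seg = np.array([[1, 1, 3],
                [2, 3, 3]])

print(classes_present(seg, CLASS_NAMES))                # ['wall', 'floor', 'table']
print(mask_for_class(seg, CLASS_NAMES, "table").sum())  # 3
```

The mask can then be used to count the pixels of a category or to cut the matching region out of the RGB image.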

Shiv1143 commented 1 year ago


I faced the same issue too: the third dimension of my depth image did not seem to be encoded properly, so I did some manipulation and it worked.