seagochen closed this issue 1 month ago
Hi, is this because the image was resized to 768*1024 on input and then resized back to the original resolution after prediction? In the current implementation, I've noticed that the input image is forcibly resized to 768*1024 regardless of its original size.
Thank you for your response. I don't think this is related to the resize: if you check my test code, you will find that the input shape is indeed (1, 3, 1024, 768). I also changed the size to (1, 3, 768, 1024), and the noise still exists (just like the first picture I pasted here).
Since I used a dummy tensor with random values as input, the test code pasted there is easy to read. I want to know whether my procedure is correct.
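For reference, the two shapes mentioned above differ only in which axis is treated as height. Here is a minimal NumPy sketch, assuming the usual batch x channels x height x width (NCHW) layout, so that "768*1024" read as width x height corresponds to (1, 3, 1024, 768). The helper name `to_nchw` is hypothetical and is not part of the Sapiens code.

```python
import numpy as np


def to_nchw(image_hwc):
    """Convert one H x W x C image into a 1 x C x H x W batch."""
    chw = np.transpose(image_hwc, (2, 0, 1))  # move channels to the front
    return chw[np.newaxis, ...]               # add the batch dimension


# Dummy image with random values: 1024 pixels tall, 768 pixels wide.
rng = np.random.default_rng(0)
dummy_image = rng.random((1024, 768, 3), dtype=np.float32)

batch = to_nchw(dummy_image)
print(batch.shape)  # (1, 3, 1024, 768)
```

Swapping the two spatial axes gives (1, 3, 768, 1024), which is a landscape input rather than a portrait one.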
@seagochen Sapiens depth and normal estimators are only supervised on human pixels. For non-human pixels, the network predictions can be arbitrary. We have seen the models generalize to backgrounds in a few cases, but this is not consistent. In your case, running inference on a noise input can therefore produce grid artifacts due to the deconv operations.
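To illustrate why deconv (transposed convolution) layers can leave a regular grid pattern, here is a small generic sketch. It only counts how many kernel windows cover each output position in 1D; the kernel sizes and strides are illustrative assumptions and are not taken from the Sapiens architecture.

```python
import numpy as np


def overlap_counts(n_inputs, kernel_size, stride):
    """Count how many input positions write to each output position
    of a 1D transposed convolution without padding."""
    out_len = (n_inputs - 1) * stride + kernel_size
    counts = np.zeros(out_len, dtype=int)
    for i in range(n_inputs):
        start = i * stride
        counts[start:start + kernel_size] += 1
    return counts


# Kernel size not divisible by the stride: coverage alternates 2, 1, 2, 1.
print(overlap_counts(4, 3, 2))  # [1 1 2 1 2 1 2 1 1]

# Kernel size divisible by the stride: interior coverage is uniform.
print(overlap_counts(4, 4, 2))  # [1 1 2 2 2 2 2 2 1 1]
```

When coverage is uneven, the output scale varies periodically. In 2D this shows up as a checkerboard or grid, and it tends to be most visible where the prediction is unconstrained, such as unsupervised background pixels or a pure-noise input.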
Hi, guys.
Thank you for your outstanding contribution to this project. I've noticed a problem with the depth map: regular white noise appears. I need your help to resolve this.
First, here is my simple source code.
And here is the output:
When I use this model to estimate depth on a real image, I get output like this:
Whatever picture I use, there is always noise generated around the person. Your demo on Hugging Face seems normal, so I think I have done something wrong.
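Since the depth estimator is only supervised on human pixels, one possible workaround is to discard the background before visualizing the result. Below is a minimal sketch under the assumption that a boolean foreground mask is available from some separate source, such as a segmentation model. The function name `normalize_foreground` and the mask itself are hypothetical and do not come from the Sapiens code.

```python
import numpy as np


def normalize_foreground(depth, human_mask):
    """Rescale depth to [0, 1] using human pixels only.

    Background pixels are set to NaN so they can be skipped when plotting.
    `human_mask` is a boolean array with the same shape as `depth`.
    """
    if depth.shape != human_mask.shape:
        raise ValueError("depth and human_mask must have the same shape")

    out = np.full(depth.shape, np.nan, dtype=np.float32)
    foreground = depth[human_mask]
    if foreground.size == 0:
        return out  # no human pixels, nothing to normalize

    lo, hi = foreground.min(), foreground.max()
    scale = hi - lo if hi > lo else 1.0  # avoid division by zero
    out[human_mask] = (foreground - lo) / scale
    return out


# Tiny example: a 2 x 2 depth map with one background pixel.
depth = np.array([[1.0, 2.0], [3.0, 5.0]])
human_mask = np.array([[True, False], [True, True]])
normalized = normalize_foreground(depth, human_mask)
```

Normalizing over the foreground only also keeps arbitrary background values from compressing the visible depth range on the person.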