GAP-LAB-CUHK-SZ / InstPIFu

Repository of "Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes", ECCV 2022

Demo reconstruction from RGB image - problem with object pose #28

Open cmanszew opened 7 months ago

cmanszew commented 7 months ago

Hi! I have been working for some time on a demo mode for reconstruction from an arbitrary image, similar to Total3D and Implicit3D. I followed your data pre-processing code, but unfortunately it is not always clear, since it seems to rely on pre-processed data. I've got something working, but the end result is not correct, or at least the object pose/scene layout is wrong.

The code I wrote is in: https://github.com/cmanszew/InstPIFu/tree/rgb-demo

To run it with the first demo input from Implicit3D, just run python experiments.py, using the same environment as for the rest of your code.

Below is a comparison of InstPIFu vs Implicit3D from raw RGB input: [screenshot: InstPIFu vs Implicit3D reconstructions]

For this input: [input image]

The issue probably lies somewhere in the way I:

  1. Pre-process the input for 3D detection: https://github.com/cmanszew/InstPIFu/blob/33409ffd6fdbe0e8647b04cc324cf248688fab3b/experiments.py#L72
  2. Process the output from 3D detection: https://github.com/cmanszew/InstPIFu/blob/33409ffd6fdbe0e8647b04cc324cf248688fab3b/experiments.py#L149

Like I said, this code tries to follow yours, but in some places scaling and other operations were performed on already pre-processed data rather than on raw input. I would appreciate any help on this. Perhaps we could also merge this into your repo, with my issue fixed, after I add background reconstruction to it =)
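For reference, this is the kind of rescaling I have in mind - a minimal standalone sketch (the helper name and target size are made up for illustration, not taken from either repo) of resizing a raw RGB image for the detector while keeping the pinhole intrinsics consistent:

```python
# Minimal sketch (not code from either repo): resize a raw RGB image to the
# detector's expected input size and rescale the pinhole intrinsics to match,
# so that 3D->2D projections stay consistent with the resized image.
import numpy as np
from PIL import Image

def resize_image_and_intrinsics(img, K, target_wh=(640, 480)):
    """Resize img to target_wh and return the matching 3x3 intrinsic matrix.

    Scaling the image by (sx, sy) scales (fx, cx) by sx and (fy, cy) by sy.
    """
    w, h = img.size
    tw, th = target_wh
    sx, sy = tw / w, th / h
    K_scaled = K.astype(np.float64)   # astype returns a fresh copy
    K_scaled[0, :] *= sx              # fx, skew, cx follow the width scale
    K_scaled[1, :] *= sy              # fy, cy follow the height scale
    return img.resize((tw, th), Image.BILINEAR), K_scaled

# Example usage with a made-up intrinsic matrix:
# img = Image.open("demo.jpg")
# K = np.array([[529.5, 0.0, 365.0],
#               [0.0, 529.5, 265.0],
#               [0.0, 0.0, 1.0]])
# img_small, K_small = resize_image_and_intrinsics(img, K)
```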

HaolinLiu97 commented 7 months ago

Hi, I will check these codes in the next few days. The problem may be caused by the data preprocessing, or by inaccurate poses, since inaccurate poses deteriorate the results significantly.

HaolinLiu97 commented 7 months ago

One problem that I found is the weights for the 3D object detection: they are trained on synthetic data. It would be better to use the detection model from the IM3D repository, which is also what I use myself to generate detection results on real images, such as images from SUNRGBD.

cmanszew commented 6 months ago

So I managed to correct a few issues - those suspicious multiplications turned out to be the culprit. I still wonder how your original code, which multiplies the intrinsic camera matrix and box features by 2, works. With those multiplications removed from my code, the output looks closer to what I'd expect.

[screenshot of the corrected reconstruction]

After that I tried to use the detection model from the IM3D repository, but unfortunately the weights are not compatible with your implementation. So one would need to use the code from IM3D, and that's where I got stuck, since Im3D processes the output data quite differently.

HaolinLiu97 commented 6 months ago

Hi, I will release my code for Im3D inference later.

HaolinLiu97 commented 6 months ago

I have already checked my code using Im3D for inference; it should be essentially the same as the original Im3D code, which is here. You can check whether it yields correct results. The config file is in ./configs/test_detection.yaml, and remember to change the weight path to the weights downloaded from the IM3D repository. You should be able to obtain good results if the 2D detections are correct.

The intrinsics need to be divided by 2 only for the 3D-FRONT dataset, because the downsampled images in 3D-FUTURE are half the resolution of the original images. This is not necessary for the SUNRGBD dataset. However, if you are using a web image, you need to make sure you input the correct intrinsics of the camera (they depend on the camera device). Also, in the dataloader, the intrinsics need to be scaled to align with the scaled image (line 175 in sunrgbd_dataset.py).
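To make the scaling concrete, here is a minimal standalone sketch (the intrinsic values are illustrative, not taken from the repo or the datasets) showing why halving the image resolution halves fx, fy, cx, cy, and hence why the factor of 2 applies only to the downsampled 3D-FRONT images:

```python
# Minimal sketch (illustrative numbers, not values from the repo or datasets):
# halving the image resolution halves fx, fy, cx, cy, so projections with the
# half-resolution intrinsics land at exactly half the pixel coordinates.
import numpy as np

def project(K, p):
    """Project a 3D point p (camera coordinates) to pixel coordinates."""
    uv = K @ p
    return uv[:2] / uv[2]

K_full = np.array([[600.0, 0.0, 320.0],    # intrinsics of a 640x480 image
                   [0.0, 600.0, 240.0],
                   [0.0, 0.0, 1.0]])
K_half = K_full.copy()
K_half[:2] /= 2.0                          # intrinsics of the 320x240 render

p = np.array([0.5, -0.2, 3.0])             # a point in camera coordinates

print(project(K_full, p))   # [420. 200.] in the 640x480 image
print(project(K_half, p))   # [210. 100.] in the 320x240 image
# Multiplying K_half[:2] by 2 recovers K_full. That is why the factor of 2 is
# correct for the downsampled 3D-FRONT images, but corrupts the pose for an
# image whose intrinsics already match its resolution.
```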

If more information is needed, please feel free to contact me.