AugmentariumLab / omnisyn

OmniSyn: Synthesizing 360 Videos with Wide-baseline Panoramas (VRW 2022)
https://augmentariumlab.github.io/omnisyn/

Multi-view consistency #6

Closed EchoTHChen closed 2 years ago

EchoTHChen commented 2 years ago

I ran into a new problem with the predicted depth maps from DepthNet in OmniSyn. I ran the depth estimation network to predict depth maps for two panoramas of a single Matterport3D scene (the baseline is 1.0 meter).

I generated 3D point clouds from the depth maps of the two panoramas, but two separate surfaces appear in the resulting point cloud. The predicted depth maps do not seem to be view-consistent, even though the cost volume is used in the code. I do not understand why this occurs. Could you give me some suggestions for resolving the issue?

(Screenshots of the doubled surfaces: snapshot01, snapshot02)
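
For reference, this is roughly how I unproject the ERP depth maps into point clouds (a minimal sketch using my own longitude/latitude convention, not code from the repository, so the axes may differ from OmniSyn's):

import numpy as np

def erp_depth_to_points(depth, cam_position=np.zeros(3)):
    # depth: H x W array of distances along each viewing ray.
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    lon = (u / w - 0.5) * 2.0 * np.pi      # longitude in [-pi, pi]
    lat = (0.5 - v / h) * np.pi            # latitude in [pi/2, -pi/2]
    rays = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)
    points = rays * depth[..., None] + cam_position
    return points.reshape(-1, 3)

# With view-consistent depths, point clouds from the two panoramas
# (1.0 m apart) should overlap on shared surfaces, e.g.:
# pts1 = erp_depth_to_points(depth1, np.array([0.0, 0.0, 0.0]))
# pts2 = erp_depth_to_points(depth2, np.array([0.0, 0.0, 1.0]))
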
dli7319 commented 2 years ago

Hello EchoTHChen,

A bug in the rectify function in models/pipeline3_model.py (d747478e2a) caused the cost volume not to work for one of the panos. I think this is what caused the issue. That said, even with this fixed, I don't think the depth is good enough for 3D modeling, so keep that in mind.

I also added some point cloud visualization code, which you can try with:

python train_inpainting.py -c example/config_example_m3d.txt

Point cloud with GT depth: Screenshot from 2022-08-10 13-10-15
Point cloud w/ predicted depth before the bug fix: Screenshot from 2022-08-10 13-08-02
Point cloud w/ predicted depth after the bug fix: Screenshot from 2022-08-10 13-08-47

EchoTHChen commented 2 years ago

It seems that you only revised the code at line 1968, in the rectify_images function. Do I need to retrain the depth estimation network from scratch, or can I just use the pretrained model (trained before the bug fix) and re-run inference with the revised code?

dli7319 commented 2 years ago

You don't need to retrain.

EchoTHChen commented 2 years ago

I'm new to panorama depth estimation. Why is rectify_images needed?

dli7319 commented 2 years ago

For the cost volume, I need to project pixels from one panorama to another. For this you need the pose of each panorama.

In my code, I decided to first rotate the panoramas so that one panorama is known to sit directly in front of the other, which makes the cost volume portion easier to code. In retrospect, it would probably have been better to simply transform each pixel through each panorama's transformation matrix. While that function was broken, the projection was correct for one image but backward for the second, so the (deep-feature) cost volume did not line up properly when the two images were swapped:

correct cost volume visualization using RGB values: cv

incorrect rectification: cv_bad
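
To illustrate what I mean by projecting pixels through the poses (a rough, untested sketch, not the actual rectify_images or cost-volume code; the ERP axis convention and function names are my own assumptions):

import numpy as np

def erp_pixel_to_ray(u, v, width, height):
    # Map an ERP pixel to a unit viewing ray (assumed lon/lat convention).
    lon = (u / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - v / height) * np.pi
    return np.array([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)])

def project_to_other_pano(u, v, depth, width, height, T_src_to_dst):
    # Lift the source pixel to 3D using its depth, move it into the other
    # panorama's frame with a 4x4 transform, and re-project to ERP coordinates.
    point_src = erp_pixel_to_ray(u, v, width, height) * depth
    point_dst = (T_src_to_dst @ np.append(point_src, 1.0))[:3]
    r = np.linalg.norm(point_dst)
    lon = np.arctan2(point_dst[0], point_dst[2])
    lat = np.arcsin(point_dst[1] / r)
    u_dst = (lon / (2.0 * np.pi) + 0.5) * width
    v_dst = (0.5 - lat / np.pi) * height
    return u_dst, v_dst, r

If the panoramas have already been rectified so that one sits directly in front of the other, T_src_to_dst reduces to a pure translation along the forward axis, which is what makes sweeping depth hypotheses for the cost volume simpler.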

EchoTHChen commented 2 years ago

Thanks

EchoTHChen commented 2 years ago

I came across another issue after using the updated code to generate point clouds: the given code generates reasonable point clouds for scenes from the validation set (the visualize_point_cloud function is tested on validation data), but it fails to generate accurate point clouds on the test dataset (the test split of HabitatImageGenerator in the run_inpainting_single function):

(test-set point cloud screenshot)

This is worse than the result of the previous code.

The ground truth is as follows: (ground-truth screenshot)

Do you know how to fix this bug?

dli7319 commented 2 years ago

Hey there,

I'm having trouble reproducing the issue. Can you please provide the code you used to get that? Also, are you using my weights or your own trained weights?

This is what I get using the following code in the visualize_point_cloud function, which I think loads the same scene as yours:

# After line 1673
    seq_len = 3
    reference_idx = 1
    # Build the test split of the Habitat/Matterport3D dataset at a 1.0 m baseline.
    test_data = HabitatImageGenerator(
        "test",
        full_width=self.full_width,
        full_height=self.full_height,
        seq_len=seq_len,
        reference_idx=reference_idx,
        m3d_dist=1.0
    )

    # Load one sample at a time; single-process loading for m3d.
    val_dataloader = DataLoader(test_data,
                                batch_size=1,
                                shuffle=False,
                                num_workers=0 if args.dataset == 'm3d' else 4)
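
If it helps for debugging, here is a quick, untested way to inspect what that loader returns for the first test sample (I'm not assuming specific field names, just printing whatever HabitatImageGenerator yields):

batch = next(iter(val_dataloader))
# Print the fields if the sample is a dict, otherwise just its type.
if isinstance(batch, dict):
    for key, value in batch.items():
        print(key, getattr(value, "shape", type(value)))
else:
    print(type(batch))
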

Predicted: Screenshot from 2022-08-15 04-15-12 Screenshot from 2022-08-15 04-15-14

GT: Screenshot from 2022-08-15 04-16-08

EchoTHChen commented 2 years ago

Now I generate almost the same point clouds as yours using the given visualize_point_cloud function. But I find that I fail to produce normal inpainted results when running:

python train_inpainting.py --config configs/m3d_inpainting_config.txt \
    --use-pred-depth True \
    --checkpoints-dir runs/run_m3d_depth \
    --cost-volume v3_erp \
    --depth-input-uv True \
    --model-use-v-input True \
    --script-mode run_inpainting_single

(inpainted result screenshots: 1, 2, 3)

EchoTHChen commented 2 years ago

I think there are some problems in the run_inpainting_single function, even when it is given a normal depth estimate.

EchoTHChen commented 2 years ago

I'm using my own trained weights.

dli7319 commented 2 years ago

Does it work for you using the example config and weights?

python train_inpainting.py --config example/config_example_m3d.txt \
    --use-pred-depth True \
    --script-mode run_inpainting_single

If so, maybe you can check your trained depth model w/ our inpainting model or your inpainting w/ our depth to figure out which one isn't working?

It should look like this: (reference screenshot)

EchoTHChen commented 2 years ago

The example config works. Thanks!
