yanconglin / VanishingPoint_HoughTransform_GaussianSphere

Official implementation: Deep vanishing point detection: Geometric priors make dataset variations vanish, CVPR'22.
MIT License

Evaluation on YUD #7

Closed. fkluger closed this issue 1 year ago.

fkluger commented 1 year ago

Hi,

I am trying to reproduce your results on the YUD dataset, but I am unable to get it running.

In your paper, you say that you used a network trained on SU3 for the evaluation on YUD. However, the config/yud.yaml file appears to assume a network trained on NYU instead. I changed it to match su3.yaml instead, but now I get this error:

Traceback (most recent call last):
  File "eval_manhattan.py", line 315, in <module>
    main()
  File "eval_manhattan.py", line 255, in main
    result = model(input_dict)
  File "/home/kluger/miniconda3/envs/deepvp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kluger/miniconda3/envs/deepvp/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/kluger/miniconda3/envs/deepvp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kluger/projects/remote/deepvp/vpd/models/vanishing_net.py", line 40, in forward
    x = self.ht(x)
  File "/home/kluger/miniconda3/envs/deepvp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kluger/projects/remote/deepvp/vpd/models/ht/ht_cuda.py", line 53, in forward
    out = self.ht(x)
  File "/home/kluger/miniconda3/envs/deepvp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kluger/projects/remote/deepvp/vpd/models/ht/im2ht.py", line 116, in forward
    self.ht_size
  File "/home/kluger/projects/remote/deepvp/vpd/models/ht/im2ht.py", line 53, in forward
    ctx.ht_size[1]
RuntimeError: height_ == height && width_== width INTERNAL ASSERT FAILED at "/home/kluger/projects/remote/deepvp/vpd/models/ht/cpp_im2ht/ht_cuda.cu":38, please report a bug to PyTorch.  image shape and given shape params do not match: (%d x %d vs %d x %d).240320256256

Sounds like there is a problem due to image size mismatch between YUD and SU3, but I don't know the root cause.

Could you kindly help me get this running?

Oh and I may have found a bug in the YUD dataloader. It expects the name of the file with the VP labels to contain the value of C.io.num_nodes: https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere/blob/4b743cf8042dfe7aead91c988f92abe056fd289c/vpd/datasets.py#L292

Meanwhile, the pre-processing script does not include this value in the file name: https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere/blob/4b743cf8042dfe7aead91c988f92abe056fd289c/dataset/yud_process.py#L215

The dataloader then naturally throws an error because it can't find the npz file.

yanconglin commented 1 year ago

Hi fkluger, thank you for the report. I will look into it and figure it out. It might take some time, since I have moved to another institute and my previous logs no longer exist. By the way, are you in a rush?

Yancong

fkluger commented 1 year ago

Thanks for taking the time! I am in a bit of a rush. If you could look into it within the next two weeks I would really appreciate it.

I have now tried using the mappings from NYU, with the following config file:

io:
  # tensorboard_port: 0
  resume_from:
  logdir: logs/
  dataset: YUD
  datadir: dataset/yud_data
  ht_mapping: parameterization/nyu/ht_240_320_403_180.npz
  sphere_mapping: parameterization/nyu/sphere_neighbors_403_180_32768.npz
  num_workers: 2
  focal_length: 1
  num_neighbors: 16
  num_vpts: 3
  num_nodes: 32768
  percentage: 1.0

model:
  batch_size: 4
  backbone: stacked_hourglass
  depth: 4
  num_stacks: 1
  num_blocks: 1
  lpos: 1.0
  lneg: 1.0
  num_channels: 64

optim:
  name: Adam
  lr: 4.0e-4
  amsgrad: True
  weight_decay: 1.0e-5
  max_epoch: 36
  lr_decay_epoch: 24

and run the evaluation script with the SU3 checkpoint:

python eval_manhattan.py -d 0 -o tmp/yud/result.npz  config/yud.yaml  pretrained_models/SU3/checkpoint_latest.pth.tar

This works, but the results are worse than in your paper:

27.51 | 56.63 | 65.17 | 73.11

I have also tried using the correct focal length for YUD (focal_length: 2.10912), but the difference is negligible:

27.71 | 56.69 | 65.21 | 73.13

Is there anything I am missing? Thanks again!

yanconglin commented 1 year ago

I will have to look into the details, because there is some rescaling (in terms of image dimensions and focal length) when transferring from SU3 to YUD (if I recall correctly). Say you train on SU3 and then test on YUD:

  1. rescale the YUD images from 640x480 to the SU3 size (512x512)
  2. load the SU3 mappings and weights
  3. pass them through the network to get predictions (at 512x512)
  4. rescale those predictions back to 640x480
  5. do the evaluation in camera space.

I will try to find the scripts; it seems they are not included in the initial repo. Unfortunately, I cannot make any promises.

fkluger commented 1 year ago

Thanks, that's a good pointer. I have now tried the following:

First, resize the images before feeding them into the network:

# resize the YUD images (480x640) to the SU3 input size of 512x512
images = torch.nn.functional.interpolate(images, size=(512, 512))

After prediction, de-normalize the VPs using SU3's focal length, then correct the aspect ratio, and finally normalize the VPs again:

# project the VPs to the image plane using SU3's focal length (2.1875) and half the SU3 image size (256)
x = vpt_pd[:, 0] / vpt_pd[:, 2] * 2.1875 * 512 / 2.0
y = vpt_pd[:, 1] / vpt_pd[:, 2] * 2.1875 * 512 / 2.0
# undo the aspect-ratio distortion: 512x512 (SU3) -> 640x480 (YUD)
x = x * 640.0 / 512.0
y = y * 480.0 / 512.0
# re-normalize by half the YUD image width and convert back to unit vectors
x /= 320.0
y /= 320.0
vpt_pd = np.stack([x, y, np.ones_like(x)], axis=-1)
vpt_pd /= np.linalg.norm(vpt_pd, axis=-1, keepdims=True)

This gives me these results:

28.99 | 65.61 | 76.52 | 85.52

This is even a bit better than in the paper for AA@3 and AA@5, but slightly worse for AA@10.

yanconglin commented 1 year ago

Glad you made it work. As you have probably figured out, the scaling/mapping is resolution-dependent, which is unfortunately a drawback of this solution.

Oh, and regarding https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere/issues/7#issue-1587778077: I forgot to mention that I ran multiple cross-dataset tests (train/test on SU3/ScanNet/NYU and YUD). Only the NYU (train) / YUD (test) configuration is released; that is what you see in the yud.yaml file.

yanconglin commented 1 year ago

Feel free to reopen if you have further questions.

ericzzj1989 commented 2 weeks ago

Hi,

I am trying to reproduce your results on the YUD dataset, but I ran into the following problem: the YUD dataloader expects the name of the file with the VP labels to contain the value of C.io.num_nodes:

https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere/blob/4b743cf8042dfe7aead91c988f92abe056fd289c/vpd/datasets.py#L292

How can I address this?

yanconglin commented 2 weeks ago

Oops, indeed, that is a bug introduced when releasing the code, as pointed out by fkluger. If my memory serves me right, the solution is to remove C.io.num_nodes from the expected file name and load iname.replace(".png", ".npz") instead, so that it matches what the pre-processing script writes:

https://github.com/yanconglin/VanishingPoint_HoughTransform_GaussianSphere/blob/4b743cf8042dfe7aead91c988f92abe056fd289c/dataset/yud_process.py#L215

I think this is fixed now (not verified yet).
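
For reference, a minimal sketch of that change (the variable names follow the description above and are only illustrative; the actual lines in vpd/datasets.py may differ slightly):

import numpy as np

# Load the VP labels using the file name that dataset/yud_process.py actually writes,
# i.e. the image name with its extension swapped, without C.io.num_nodes embedded in it.
npz_name = iname.replace(".png", ".npz")
labels = np.load(npz_name)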


ericzzj1989 commented 1 week ago

Thanks for your explanation!

I have followed NeurVPS to process the ScanNet dataset. I found that there are only vanishing points in the dataset, without labeled line segments as in the YUD dataset. I wonder whether I processed ScanNet correctly, and if not, how I can get the labeled line segments.

Could you kindly advise on this issue?

Many thanks!

yanconglin commented 1 week ago

Please check the NeurVPS paper. The VPs in this dataset are calculated from (coarse) surface normals and have no corresponding lines. That is also why line-based VP detectors perform so much worse on it.

ericzzj1989 commented 1 week ago

Thanks for your explanation!

Just to clarify, when you tested line-based VP detectors, such as Quasi-VP, which line segments did you use? Were they obtained through an algorithm like LSD, or was another method used to extract the lines?

Many thanks for your help!

yanconglin commented 1 week ago

Lines were extracted with LSD. As far as I remember, Quasi-VP sometimes fails to produce any output due to a lack of sufficient line segments; I had to manually remove those samples to get a result.

ericzzj1989 commented 1 week ago

Thank you for clarifying! That makes sense.

Do you still happen to have the line segments extracted from ScanNet? If they are available, it would be incredibly helpful for my experiments.

Thanks for your assistance!

yanconglin commented 1 week ago

Unfortunately, no. I no longer have any source data or checkpoints for the other models.

ericzzj1989 commented 1 week ago

Thanks!

Would you have any recommendations on how I might best replicate your process for extracting line segments and testing with line-based VP detectors? Any guidance would be greatly appreciated!

Thanks for all your help!

yanconglin commented 1 week ago

When I was working on this project, the source code for Quasi-VP had not yet been released, so I emailed the author for the source code (C++). Regarding line extraction, I used the commonly used LSD detector; you can find a link to it in the readme file. It may take you some time to merge the two components and make them work.
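
If it helps, here is a minimal sketch of line extraction with OpenCV's LSD implementation (just an illustration, not necessarily the exact LSD version I used; see the readme for the link):

import cv2
import numpy as np

# Detect line segments with OpenCV's LSD implementation.
# Note: createLineSegmentDetector is unavailable in some OpenCV builds for licensing
# reasons, so you may need a recent OpenCV release or a standalone LSD binary.
img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
lsd = cv2.createLineSegmentDetector()
lines, widths, precisions, nfas = lsd.detect(img)

# lines has shape (N, 1, 4): endpoints (x1, y1, x2, y2) per segment.
lines = lines.reshape(-1, 4) if lines is not None else np.empty((0, 4))
print(f"{len(lines)} line segments detected")

These segments can then be fed to a line-based detector such as Quasi-VP; samples with too few segments may need to be skipped, as mentioned above.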

ericzzj1989 commented 1 week ago

Got it, thank you very much for all the helpful information and guidance!

ericzzj1989 commented 1 week ago

I have one more question regarding the SU3 dataset. Could you provide some insights into the data structure within this dataset? For example, are the .jlk files used for storing line information? Also, is there any documentation specifically describing the data structure, aside from the main paper?

Thank you again for your time and help!

yanconglin commented 6 days ago

Please refer to the NeurVPS repo. I did not do an in-depth study.