duanyiqun / DiffusionDepth

PyTorch implementation of a diffusion approach to 3D depth perception (ECCV 2024)
https://arxiv.org/abs/2303.05021
Apache License 2.0
310 stars 17 forks

Questions about dataset and results #9

Open Sec996 opened 1 year ago

Sec996 commented 1 year ago

Hi there! Thanks for the great work!

I have some questions about the datasets and the training results.

The first is about the KITTI dataset. The README says you use the raw portion of KITTI, but the official website splits the raw data into many parts organized by date, and clicking through them one by one to download is very tedious. Could you provide a script that supports one-click download? After downloading all the dates I ended up with about 180 GB. I would also like to know which parts of the KITTI raw data you actually used, so that I can download only those.

The second problem is that I trained on the NYU-V2 dataset successfully, but the final result is not very good: at epoch 20 the test RMSE is about 0.50. I want to know whether I used the command incorrectly. Could you provide the exact training command for the NYU dataset?

(--patch_height 340 --patch_width 512 --loss 1.0*L1+1.0*L2+1.0*DDIM --epochs 30 --batch_size 16 --max_depth 10.0 --save NAME_TO_SAVE --model_name DiffusionDCbase --backbone_module swin --backbone_name swin_large_naive_l4w722422k --head_specify DDIMDepthEstimate_Swin_ADDHAHI)

Thanks again for open-sourcing this work!

duanyiqun commented 1 year ago

Hi there,

  1. There may be some open-source projects that provide download scripts for the KITTI raw data; it is worth searching for those (see the sketch below for one possible approach). I didn't count how much of the raw data the depth portion actually uses, so please download all of it for training.
  2. Do you have the validation-set RMSE corresponding to this result? If the val-set number looks fine, I would suggest adjusting the patch size directly in nyu_data.py.
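
For reference, here is a minimal one-click download sketch (not part of this repo). It assumes the archives follow the naming used by KITTI's official raw_data_downloader.sh, i.e. <date>_drive_<id>_sync.zip served from the avg-kitti S3 bucket; the drive list is illustrative, not the full set.

```python
# Hypothetical one-click downloader for KITTI raw drives (sketch only).
# The base URL and archive naming are assumptions based on the layout used
# by KITTI's official raw_data_downloader.sh; adjust them if they differ.
import os
import urllib.request

BASE_URL = "https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data"  # assumed
OUT_DIR = "kitti_raw_zips"

# Illustrative subset; extend this list with every drive you need.
# Each date also ships a <date>_calib.zip with calibration files (not handled here).
DRIVES = [
    "2011_09_26_drive_0001",
    "2011_09_26_drive_0002",
]

os.makedirs(OUT_DIR, exist_ok=True)
for drive in DRIVES:
    fname = f"{drive}_sync.zip"
    url = f"{BASE_URL}/{drive}/{fname}"
    dst = os.path.join(OUT_DIR, fname)
    if os.path.exists(dst):
        continue  # skip archives that were already downloaded
    print(f"downloading {url}")
    urllib.request.urlretrieve(url, dst)
```
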
Sec996 commented 1 year ago

Thank you for your time and reply! 1. I still have some trouble downloading the raw data: 180 GB is a lot, and there are about 70 different recording scenarios, so downloading them one by one is a lot of work. I would appreciate it if you could add a few more steps on KITTI data configuration, just like for the NYU dataset.

2. The accuracy on the val set is fine, but the accuracy on the test set is not so good. I see that the code in src/data/nyu.py says (for NYUDepthV2 the crop size is fixed): height, width = (240, 320), crop_size = (228, 304). But the model details in the paper give a crop size of 512x340. Which one should I use as the patch width and height? And if the crop size in the code is fixed, does that mean I don't need to pass the 512x340 crop size on the command line when training on the NYU dataset?

duanyiqun commented 1 year ago

For the second question: is it OK to change the crop size from 228 to 240 first and observe the results?
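
For reference, a small sketch of the resize-then-crop step being discussed, using the values quoted from nyu.py above; this is illustrative only and not the repo's actual dataloader code (which may crop randomly during training rather than with a center crop).

```python
# Illustrative resize + fixed-crop pipeline (not the repo's actual dataloader).
import torchvision.transforms as T

height, width = 240, 320      # fixed resize for NYUDepthV2, as quoted above
crop_size = (228, 304)        # current fixed crop in the code
# crop_size = (240, 320)      # suggested experiment: keep the full resized frame

transform = T.Compose([
    T.Resize((height, width)),
    T.CenterCrop(crop_size),
])
```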

PhoenixZ810 commented 1 year ago

I'm stuck building the mmcv environment. Could you please tell me which versions of torch, torchvision, mmcv, mmdet, and mmdet3d you used? I'd really appreciate it.

zyp-byte commented 1 year ago

I ran into the same situation as you. Could you please show me how you fixed it? Thanks!

seeker02 commented 1 year ago

Why does the loss suddenly become NaN partway through the first epoch when training on NYU Depth (loss_sum turns into NaN, which makes the total loss NaN)? What command did you use to train the model on NYU Depth?
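
Not an answer from the authors, but a generic way to narrow this down is to check each loss term before backpropagating, so you can see which term goes NaN first. The term names below are hypothetical placeholders, not this repo's actual variables; torch.autograd.set_detect_anomaly(True) can also help locate the operation that first produces a NaN.

```python
import torch

# Generic NaN guard for a multi-term loss (sketch only; the term names used
# in the usage example are hypothetical placeholders, not this repo's code).
def check_and_sum(losses: dict) -> torch.Tensor:
    for name, value in losses.items():
        if torch.isnan(value).any() or torch.isinf(value).any():
            raise RuntimeError(f"loss term '{name}' became NaN/Inf: {value}")
    return sum(losses.values())

# Usage inside the training step (names are illustrative):
# loss_sum = check_and_sum({"l1": loss_l1, "l2": loss_l2, "ddim": loss_ddim})
# loss_sum.backward()
```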

seeker02 commented 1 year ago

How can I train on NYU Depth with a single GPU?

Huskie377 commented 1 year ago

Hello! You said you have successfully trained on NYUv2. Could you tell me which versions of MMDetection3D, MMDetection, MMSegmentation, and mmcv-full you used? This version problem has puzzled me for a long time, and I hope to receive a reply. Thanks.

unlugi commented 1 year ago

@Huskie377 It was really tricky for me to install as well. The setup that worked for me in the end was:

python 3.8
numpy 1.19.5
numba 0.48.0
mmcv-full 1.3.13
mmdet 2.14.0
mmdet3d 0.15.0
mmsegmentation 0.14.1
pytorch 1.10
cuda 11.3

For the MM* libraries, since these are all older versions, I had to clone the repos, git checkout the older branches, and install them myself; I couldn't use pip install or conda install directly.

Be careful about Apex and building it with CUDA: if there is a mismatch between the local CUDA and the CUDA that PyTorch was built with (e.g. the minor versions differ, local CUDA 11.4 vs. PyTorch CUDA 11.2), Apex won't build with CUDA support. Check out this issue: https://github.com/NVIDIA/apex/pull/323#discussion_r287021798
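
As a quick sanity check before building Apex, a short script like the one below (not from this repo) prints the versions that actually ended up installed, including the CUDA version PyTorch was built against, so you can compare them with the list above:

```python
# Environment sanity check (sketch; run after installing the stack above).
import torch, torchvision, mmcv, mmdet, mmdet3d, mmseg

print("torch         :", torch.__version__)
print("torchvision   :", torchvision.__version__)
print("torch CUDA    :", torch.version.cuda)  # should match the local CUDA toolkit
print("mmcv-full     :", mmcv.__version__)
print("mmdet         :", mmdet.__version__)
print("mmdet3d       :", mmdet3d.__version__)
print("mmsegmentation:", mmseg.__version__)
```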

Installation is really tricky, but it works if you install everything carefully and make sure the versions of the libraries are compatible with each other. Good luck.

YangChenUcas commented 7 months ago

Hi, I ran into the same situation as the second problem you mentioned. Have you solved it? If possible, could you please share your training command for the NYU dataset?