Questions about the generating gray depth

zhangtingyu11 commented 10 months ago

Hello, thank you for sharing this incredible work. I am new to depth completion, and I would like to utilize the depth completion results for KITTI 3D object detection images. I'm currently facing some challenges, and I was wondering if you could help me with a few questions regarding the generation process. The yml file has been modified as follows.

# Hardware
seed: 1128
gpus: (0, )
port: 29000
num_threads: 1
no_multiprocessing: True
cudnn_deterministic: False
cudnn_benchmark: True

# Dataset
data_folder: '/home/public/zty/Project/DeepLearningProject/LRRU/kitti/training'
dataset: ['dep', 'gt', 'rgb']
val: 'test_completion'
grid_spot: True
num_sample: 1000
cut_mask: False
max_depth: 80.0
rgb_noise: 0.0
noise: 0.0

hflip: True
colorjitter: True
rotation: True
resize: False
normalize: True
scale_depth: False

val_h: 352
val_w: 1216
random_crop_height: 256
random_crop_width: 1216
train_bottom_crop: True
train_random_crop: True
val_bottom_crop: True
val_random_crop: True
test_bottom_crop: True
test_random_crop: True

# Network
depth_norm: False
dkn_residual: True
summary_name: 'summary'

# Test
test: True
#test_option: 'test_completion'
test_option: 'test_completion'
test_name: 'ben_depth'
tta: False
test_not_random_crop: True
wandb_id_test: ''

prob: 0.5
bc: 16
model: 'model_dcnv2'
test_dir: './pretrained/LRRU_Base'
test_model: './pretrained/LRRU_Base/LRRU_Base.pt'

# Summary
num_summary: 6
save_test_image: True

# Logs
vis_step: 1000
record_by_wandb_online: False
test_record_by_wandb_online: False
save_result_only: False

When running val.py based on the provided yml file, I can generate an RGB depth map. My questions are:

1. If I aim to obtain a grayscale depth map similar to those in the depth completion dataset, does sample['dep'] represent the absolute depth, or does it require a transformation to obtain the absolute depth? https://github.com/YufeiWang777/LRRU/blob/02be555f3a5046f4441eafa782194fbd31dbdca0/summary/summary.py#L164-L166
2. The output feature size is 1216 * 256, matching the random_crop_width and random_crop_height parameters, whereas depth maps from other methods are 1216 * 352. How can I obtain a depth map of size 1216 * 352, and does the resizing operation affect performance?
3. In the KITTI raw dataset, the intrinsic matrix is 3 * 3, whereas in 3D object detection the calib file contains P2, a 3 * 4 matrix (assumed to be the projection matrix). How can I extract the intrinsic matrix from the projection matrix?
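
On the last question, a minimal sketch may help. For KITTI's rectified cameras the projection matrix has the form P = K [I | t], so the intrinsic matrix is simply the left 3 * 3 block of P2. The function name and the sample P2 values below are illustrative assumptions, not taken from the LRRU code or from a specific calib file.

```python
import numpy as np


def intrinsics_from_projection(P):
    """Split a 3x4 rectified projection matrix P = K [I | t] into K and t.

    Hypothetical helper for illustration; not part of the LRRU repository.
    """
    P = np.asarray(P, dtype=np.float64).reshape(3, 4)
    # No rotation is folded into a rectified KITTI projection matrix,
    # so the left 3x3 block is the intrinsic matrix itself.
    K = P[:, :3].copy()
    # The last column equals K @ t, so t follows from a linear solve.
    t = np.linalg.solve(K, P[:, 3])
    return K, t


# Illustrative values in the style of a P2 line from a KITTI calib file.
P2 = np.array([
    [721.5377, 0.0, 609.5593, 44.85728],
    [0.0, 721.5377, 172.8540, 0.2163791],
    [0.0, 0.0, 1.0, 0.002745884],
])

K, t = intrinsics_from_projection(P2)
fx, fy = K[0, 0], K[1, 1]  # focal lengths in pixels
cx, cy = K[0, 2], K[1, 2]  # principal point in pixels
```

If the image is cropped before it reaches the network, cx and cy shift by the crop offset, so K should be adjusted to match.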

Thanks in advance!

zhangtingyu11 commented 10 months ago

Hello, I'm back. I read the source code. For question 2, if I set test_random_crop to False and test_bottom_crop to True, the generated depth map will be 1216 * 352, right? But I also wonder whether the model was trained at a resolution of 1216 * 256 or 1216 * 352. Will a different resolution affect performance? Secondly, for question 3, I think the intrinsics are not used: in the KITTI dataset's __getitem__ function, the code involving K only updates K according to the data augmentation, and K does not affect any other variable. Is my understanding correct? I am still not sure about question 1, though. Could you help answer it? Thanks in advance!
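
On question 1, which is still open: the KITTI depth completion devkit stores depth maps as 16-bit grayscale PNGs, where depth in metres equals the pixel value divided by 256 and 0 marks an invalid pixel. The sketch below writes a depth map in that convention. It assumes the prediction has already been converted to a 2D array in metres, which this thread does not confirm for sample['dep']. The function names are hypothetical and not part of the LRRU code.

```python
import numpy as np


def encode_kitti_depth(depth_m):
    """Convert a 2D depth map in metres to KITTI-style uint16 values."""
    depth_m = np.asarray(depth_m, dtype=np.float64)
    # Treat NaN and infinite values as invalid (encoded as 0).
    depth_m = np.nan_to_num(depth_m, nan=0.0, posinf=0.0, neginf=0.0)
    # KITTI convention: stored value = depth in metres * 256.
    scaled = np.round(depth_m * 256.0)
    # Keep the values inside the uint16 range before casting.
    return np.clip(scaled, 0, 65535).astype(np.uint16)


def decode_kitti_depth(png_u16):
    """Inverse mapping: uint16 pixel values back to metres."""
    return np.asarray(png_u16, dtype=np.float32) / 256.0


def save_depth_png(path, depth_m):
    """Write the encoded depth map as a 16-bit grayscale PNG."""
    from PIL import Image  # Pillow, imported lazily so the rest runs without it

    Image.fromarray(encode_kitti_depth(depth_m)).save(path)
```

If the network output turns out to be normalized or scaled, it would need to be mapped back to metres before encoding.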

yiandjer commented 5 months ago

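Hello, I'm back. I read the source code. For question 2, just change test_random_crop to False and change test_bottom_crop to True, and the generated depth map will be 1216 * 352, right? But I also wonder whether the model was trained at a resolution of 1216 * 256 or 1216 * 352. Will a different resolution affect performance? Secondly, for question 3, I think the intrinsics are not used, because I found that in the KITTI dataset's __getitem__ function, the code involving K only changes K according to the data augmentation, and K does not change any other variable. Do I understand correctly? But I am still not sure about question 1. Can you help answer it? Thanks in advance!

Hi, I'm currently learning the task of depth completion, but I'm just getting started, and I have some questions for you about LRRU. Could you add me as a friend?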

YufeiWang777 / LRRU

Questions about the generating gray depth #10