yifanlu0227 / HEAL

[ICLR2024] HEAL: An Extensible Framework for Open Heterogeneous Collaborative Perception ➡️ All You Need for Multi-Modality Collaborative Perception!

No such file or directory: '/root/sda2/HEAL/data/OPV2V/train/2021_09_10_12_07_11/216/000309_depth0.png' #6

Open · Lesliewsq opened this issue 4 months ago

Lesliewsq commented 4 months ago

Hi, I have a problem when I train on the OPV2V dataset with:

    python opencood/tools/train.py -y opencood/hypes_yaml/opv2v/MoreModality/2_modality_end2end_training/lidar_camera_coalign.yaml

    FileNotFoundError: [Errno 2] No such file or directory: '/root/sda2/HEAL/data/OPV2V/train/2021_09_10_12_07_11/216/000309_depth0.png'

yifanlu0227 commented 4 months ago

Hi, you need to download the OPV2V-H dataset, which includes the depth files.

Lesliewsq commented 4 months ago

I have downloaded the original OPV2V dataset and created a dataset folder under HEAL as follows. Should I put OPV2V-H-depth.zip and OPV2V-H-LiDAR.zip under /HEAL/OPV2V/ and then unzip them, or put them under a new folder and unzip them there?

    ├── OPV2V
    │   ├── additional
    │   ├── test
    │   ├── train
    │   └── validate

yifanlu0227 commented 4 months ago

Please unzip OPV2V-H-depth.zip and OPV2V-H-LiDAR.zip outside of the OPV2V folder. They will produce a parallel folder, OPV2V-H, just as the README illustrates.

Lesliewsq commented 4 months ago

Still the same error.

xhjy2020 commented 4 months ago

> Still the same error.

+1. It seems that the default configuration file does not point to the depth data.

Lesliewsq commented 4 months ago

> Still the same error.
>
> +1. It seems that the default configuration file does not point to the depth data.

Yeah, I found that the camera .png and depth .png files are read from the same path, but aren't they placed in two different datasets?

yifanlu0227 commented 4 months ago

I'm very sorry about that. @xhjy2020 @Lesliewsq Please modify line 150 in opencood/data_utils/datasets/basedataset/opv2v_basedataset.py to

                    depth_files = self.find_camera_files(cav_path, 
                                                timestamp, sensor="depth")
                    depth_files = [depth_file.replace("OPV2V", "OPV2V_Hetero") for depth_file in depth_files]

There are other fixes for OPV2V-H dataset file reading. Please use the latest commit via

git fetch origin
git merge origin/main

yifanlu0227 commented 4 months ago

This is because I convert the RGB images and depth images into an HDF5 file for fast loading, so I do not load the depth images from the OPV2V_Hetero folder myself and missed this error.

Please note that the folder is OPV2V_Hetero, not OPV2V-H; I just pushed another commit to fix my typo.

@xhjy2020 @Lesliewsq
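
For reference, the data folder should end up looking roughly like the sketch below. The OPV2V_Hetero subfolders are assumed to mirror OPV2V's train/validate/test splits; please treat the README as the authoritative layout.

    data
    ├── OPV2V
    │   ├── additional
    │   ├── test
    │   ├── train
    │   └── validate
    └── OPV2V_Hetero
        ├── test
        ├── train
        └── validate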

yifanlu0227 commented 4 months ago

By the way, you can also speed up image loading by storing all the images in HDF5 format. The files are larger, but they improve your training speed.

You can find the conversion script in opencood/utils/img2hdf5.py
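
For illustration, packing one frame's images into HDF5 can look like the following minimal sketch. This is not the actual opencood/utils/img2hdf5.py; the output filename and dataset keys are assumptions.

    # Minimal sketch: bundle one timestamp's camera/depth PNGs into a single HDF5 file.
    # Not the repository's img2hdf5.py; output naming and keys are illustrative.
    import glob
    import os

    import cv2
    import h5py


    def pack_timestamp(cav_dir, timestamp):
        """Pack e.g. 000309_camera0.png ... 000309_depth3.png into 000309_imgs.hdf5."""
        png_paths = sorted(glob.glob(os.path.join(cav_dir, f"{timestamp}_*.png")))
        out_path = os.path.join(cav_dir, f"{timestamp}_imgs.hdf5")
        with h5py.File(out_path, "w") as f:
            for png_path in png_paths:
                # dataset key like "camera0" or "depth0"
                key = os.path.basename(png_path)[len(timestamp) + 1:-len(".png")]
                img = cv2.imread(png_path, cv2.IMREAD_UNCHANGED)  # keep depth bit depth
                if img is None:
                    continue  # skip unreadable files
                f.create_dataset(key, data=img, compression="gzip")


    # Usage (path is illustrative):
    # pack_timestamp("data/OPV2V/train/2021_09_10_12_07_11/216", "000309")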

xhjy2020 commented 4 months ago

> By the way, you can also speed up image loading by storing all the images in HDF5 format. The files are larger, but they improve your training speed.
>
> You can find the conversion script in opencood/utils/img2hdf5.py

Nice! Thank you very much for your patient explanation!!

yifanlu0227 commented 4 months ago

I hope I didn't cause more trouble.

I just pushed the third commit and apologize for my typo again @xhjy2020

                    depth_files = self.find_camera_files(cav_path, 
                                                timestamp, sensor="depth")
                    depth_files = [depth_file.replace("OPV2V", "OPV2V_Hetero") for depth_file in depth_files]

Lesliewsq commented 4 months ago

I have tried this command: python opencood/tools/train.py -y opencood/hypes_yaml/opv2v/MoreModality/2_modality_end2end_training/lidar_camera_coalign.yaml, and an error occurred. However, opencood/hypes_yaml/opv2v/MoreModality/2_modality_end2end_training/lidar_camera_attfuse.yaml works.

It seems the lidar_camera_coalign.yaml config file has a problem.

    No model_train_init function
    Traceback (most recent call last):
      File "opencood/tools/train.py", line 189, in <module>
        main()
      File "opencood/tools/train.py", line 118, in main
        ouput_dict = model(batch_data['ego'])
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/sda2/WSQ/HEAL/opencood/models/heter_model_baseline_ms.py", line 212, in forward
        cls_preds = self.cls_head(fused_feature)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 446, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
        self.padding, self.dilation, self.groups)
    RuntimeError: Given groups=1, weight of size [2, 256, 1, 1], expected input[1, 384, 256, 256] to have 256 channels, but got 384 channels instead.

yifanlu0227 commented 4 months ago

Yes, the shrink_header is missing. Please check out the latest commit, and it should work now. @Lesliewsq
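
Conceptually, the shrink header is a small convolution block that compresses the fused 384-channel BEV feature back down to the 256 channels the 1x1 detection heads expect. The sketch below is only illustrative; the exact module and yaml keys live in the repo.

    # Conceptual sketch of a shrink header: compress the fused BEV feature
    # (384 channels here) to the 256 channels the cls/reg heads expect.
    # Layer choices are illustrative, not HEAL's exact module.
    import torch
    import torch.nn as nn


    class ShrinkHeader(nn.Module):
        def __init__(self, in_channels=384, out_channels=256):
            super().__init__()
            self.shrink = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )

        def forward(self, x):
            return self.shrink(x)


    # (B, 384, H, W) -> (B, 256, H, W), matching the cls_head weight of size [2, 256, 1, 1]
    fused_feature = torch.randn(1, 384, 256, 256)
    assert ShrinkHeader()(fused_feature).shape == (1, 256, 256, 256)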

Lesliewsq commented 4 months ago

Thanks for the reply. Nice work! @yifanlu0227

Lesliewsq commented 4 months ago

Both the 3_modality and 4_modality yaml files report a "channel size mismatch" error.

    CUDA_VISIBLE_DEVICES=1 python opencood/tools/train.py -y opencood/hypes_yaml/opv2v/MoreModality/3_modality_end2end_training/m1m2m3_pyramid.yaml

    Traceback (most recent call last):
      File "opencood/tools/train.py", line 189, in <module>
        main()
      File "opencood/tools/train.py", line 118, in main
        ouput_dict = model(batch_data['ego'])
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/sda2/WSQ/HEAL/opencood/models/heter_pyramidcollab.py", line 145, in forward
        feature = eval(f"self.encoder{modality_name}")(data_dict, modality_name)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/sda2/WSQ/HEAL/opencood/models/heter_encoders.py", line 79, in forward
        batch_dict = self.spconv_block(batch_dict)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/sda2/WSQ/HEAL/opencood/models/sub_modules/sparse_backbone_3d.py", line 114, in forward
        x = self.conv_input(input_sp_tensor)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/spconv/pytorch/modules.py", line 137, in forward
        input = module(input)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/miniconda3/envs/CoAlign/lib/python3.7/site-packages/spconv/pytorch/conv.py", line 187, in forward
        1] == self.in_channels, "channel size mismatch"
    AssertionError: channel size mismatch

xhjy2020 commented 3 months ago

Same error when I train m3_no_fusion using opencood/hypes_yaml/opv2v/Single/m3_SECOND32_pretrain.yaml. Is this due to the spconv version?

yifanlu0227 commented 3 months ago

Are you using spconv 2.x? Could you provide more error information?

xhjy2020 commented 3 months ago

> Are you using spconv 2.x? Could you provide more error information?

Yes, the version is spconv-cu117 2.3.6.

xhjy2020 commented 3 months ago

> Are you using spconv 2.x? Could you provide more error information?
>
> Yes, the version is spconv-cu117 2.3.6.

It is indeed an issue with the spconv version. I replaced 2.3.6 with 1.2.1, and the issue no longer occurs. Thanks!!
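
In case it helps anyone else hitting this, here is a quick way to confirm which spconv build is installed before training. This is just an illustrative sanity check, not part of the HEAL codebase.

    # Print the installed spconv build before training. Purely illustrative,
    # not part of HEAL. spconv 2.x wheels (e.g. spconv-cu117) usually expose
    # __version__; older 1.x source builds may not.
    import spconv

    version = getattr(spconv, "__version__", None)
    if version is None:
        print("spconv 1.x (no __version__ attribute found)")
    else:
        print(f"spconv {version}")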