open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

[Bug] create_data.py waymo: the script gets killed while dumping the _info.pkl file #2785

Open ammaryasirnaich opened 10 months ago

ammaryasirnaich commented 10 months ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

Python: 3.8.10 (default, Mar 13 2023, 10:26:41) [GCC 9.4.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 3080
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.1, V12.1.66
GCC: x86_64-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 2.1.0a0+fe05266
PyTorch compiling details: PyTorch built with:

TorchVision: 0.15.0a0
OpenCV: 4.6.0
MMEngine: 0.9.0
MMDetection: 3.0.0
MMDetection3D: 1.2.0+2f0b0bb
spconv2.0: True

Reproduces the problem - code sample

create_data.py waymo --root-path /workspace/data/waymo/ --out-dir /workspace/data/waymo/ --workers 11 --extra-tag waymo

Reproduces the problem - command or script

create_data.py waymo --root-path /workspace/data/waymo/ --out-dir /workspace/data/waymo/ --workers 11 --extra-tag waymo

Reproduces the problem - error message

I am preparing the Waymo dataset using the command below:

create_data.py waymo --root-path /workspace/data/waymo/ --out-dir /workspace/data/waymo/ --workers 11 --extra-tag waymo

Info generation finishes and then the script gets killed. I also changed --workers to 1, but the process still terminates.

Generate info. this may take several minutes.
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 158081/158081, 70.9 task/s, elapsed: 2230s, ETA: 0s
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 158081/158081, 24.3 task/s, elapsed: 6506s, ETA: 0s
Waymo info train file is saved to /workspace/data/waymo/kitti_format/waymo_infos_train.pkl
Killed
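A bare "Killed" from the shell means the process received SIGKILL. On Linux this is often, though not necessarily, the kernel's out-of-memory killer; nothing in this report confirms that cause. As a first diagnostic, the following is a minimal sketch of a wrapper that runs a command and reports whether it exited normally or died from a signal. The helper names `describe_exit` and `run_and_report` are hypothetical and not part of mmdetection3d.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # On POSIX, a return code of -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-returncode} ({name})"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments) and describe how it ended."""
    completed = subprocess.run(cmd)
    return describe_exit(completed.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to wrap.
    print(run_and_report(sys.argv[1:]))
```

If the wrapper reports signal 9, checking the kernel log (for example with `dmesg`) for an out-of-memory entry at the same time would help confirm or rule out memory exhaustion.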

Additional information

No response

s95huang commented 10 months ago

I am stuck in this waymo-kitti conversion as well.

I used the same command and got stuck at the second 158081/158081 line.

May I ask what CPU you used?

ammaryasirnaich commented 10 months ago

I am using an AMD Ryzen 9 3900. However, the documentation does provide a link for downloading the pre-generated info.pkl annotation files for the Waymo dataset (https://mmdetection3d.readthedocs.io/en/latest/user_guides/dataset_prepare.html#summary-of-annotation-files). I will try to run the model with these files.

s95huang commented 10 months ago

I am stuck at this step

Finished ... created txt files indicating what to collect in ['training', 'validation', 'testing', 'testing_3d_camera_only_detection']
Generate info. this may take several minutes.
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 158081/158081, 60.0 task/s, elapsed: 2636s, ETA: 0s
[ ] 0/158081, elapsed: 0s, ETA:

The resource monitor shows very low CPU usage, and for some unknown reason I don't see any disk writes. I am using a 13700K. I will try a Ryzen Threadripper later.

ammaryasirnaich commented 10 months ago

@s95huang How many workers are you assigning to create the dataset? While debugging the issue, I traced it down to line 200 of kitti_converter.py, where mmengine.dump is called. I'm not sure why it kills the process!

However, I managed to work around it with the provided offline info.pkl files. You can use them instead to save time.
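Since the process dies at the dump step, one way to test the memory hypothesis is to measure how the peak resident memory of the process grows while pickling a subset of the infos. The sketch below uses the standard-library `pickle` as a stand-in for `mmengine.dump`; that the two behave alike for .pkl files is an assumption. The helper names `peak_rss` and `dump_and_measure` are hypothetical.

```python
import pickle
import resource


def peak_rss():
    """Peak resident set size of this process so far.

    The unit is platform dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def dump_and_measure(obj, path):
    """Pickle obj to path and return (peak_before, peak_after)."""
    before = peak_rss()
    with open(path, "wb") as f:
        # Stand-in for mmengine.dump(obj, path); assumed, not confirmed, to be equivalent.
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
    after = peak_rss()
    return before, after
```

Running this on, say, the first 10% of the info list and comparing the growth in peak memory against the available RAM gives a rough idea of whether the full dump could plausibly exhaust memory.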

s95huang commented 10 months ago

Thank you for the information. You used 11 workers and got it to work properly? I am using 10/16 for i7 13700K and 32 for Threadripper 2950

Do you just save https://download.openmmlab.com/mmdetection3d/data/waymo/waymo_mini_kitti_format.tar.gz into the kitti folder?

ammaryasirnaich commented 10 months ago

Actually, unzip it and pass the path to the kitti_format folder that is inside it. Do let me know how it goes on your Threadripper 2950 machine!
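The unpacking step can be sketched in Python as follows. This is an illustration only: the helper name `extract_and_find` is hypothetical, and the exact directory layout inside the archive is an assumption, so the function searches for the kitti_format folder rather than hard-coding its location.

```python
import tarfile
from pathlib import Path


def extract_and_find(archive_path, dest_dir, folder_name="kitti_format"):
    """Extract a .tar.gz archive and return the first folder named folder_name.

    Returns None if no such folder exists after extraction.
    Only use this on archives from a source you trust.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
    # The layout inside the archive is not assumed; search the extracted tree.
    matches = sorted(p for p in dest.rglob(folder_name) if p.is_dir())
    return matches[0] if matches else None
```

The returned path is what would then be given to the dataset config in place of the locally generated kitti_format directory.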

s95huang commented 10 months ago

Hi, thanks for the reply

The Threadripper got stuck at the same place. I also tried V1.1.0, and the same problem occurs.

ammaryasirnaich commented 10 months ago

Oh, that's very unfortunate. I am desperately looking for a solution to this, and I hope someone can help us.