Open sunnyHelen opened 2 years ago
You should set front_camera_id as 0 for KITTI.
https://github.com/zhyever/SimIPU/blob/5b346e392c161a5e9fdde09b1692656bc7cd3faf/tools/data_converter/kitti_converter.py#L292
:D Since the released code only supports pre-training on KITTI, data preparation is similar to standard mmdet3d. So you can use standard mmdet3d (the correct version is given in README.md) to run create_data.py and then link the prepared data to the SimIPU repo.
Thank you for your quick reply. When I create the GT database of KittiDataset, I get:
[ ] 0/3712, elapsed: 0s, ETA:Traceback (most recent call last):
File "tools/create_data.py", line 247, in
out_dir=args.out_dir)
File "tools/create_data.py", line 44, in kitti_data_prep
with_bbox=True) # for moca
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/create_gt_database.py", line 275, in create_groundtruth_database
P0 = np.array(example['P0']).reshape(4, 4)
KeyError: 'P0'
https://github.com/zhyever/SimIPU/blob/5b346e392c161a5e9fdde09b1692656bc7cd3faf/tools/data_converter/create_gt_database.py#L275
It seems there is no P0 key, and there are some differences compared with the mmdet3d version. How should I properly create the data?
Sorry that I missed your problem; I have been busy recently. There was a mistake in my last answer: you should set front_camera_id=2.
Actually, I recommend that you clone the mmdet3d and utilize the official codes to generate the KITTI dataset. You can directly link the mmdet3d-generated KITTI to the SimIPU repo.
Got it. Thanks for your reply.
But I encountered a problem when attempting camera-LiDAR fusion-based 3D object detection on the KITTI dataset. I followed your instructions and ran: bash tools/dist_train.sh project_cl/configs/kitti_det3d/moca_r50_kitti.py 8 --work-dir work_dir/
But there is a problem when loading data. Does it seem related to the data labels? Could you please help me?
Original Traceback (most recent call last):
File "/mnt/cache/chen/anaconda3/envs/simipu/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/mnt/cache/chen/anaconda3/envs/simipu/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/cache/chen/anaconda3/envs/simipu/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/lustre/chen/hzha/mmdetection/mmdet/datasets/dataset_wrappers.py", line 151, in __getitem__
return self.dataset[idx % self._ori_len]
File "/mnt/lustre/chen/hzha/SimIPU/mmdet3d/datasets/custom_3d.py", line 387, in __getitem__
data = self.prepare_train_data(idx)
File "/mnt/lustre/chen/hzha/SimIPU/mmdet3d/datasets/kitti_dataset.py", line 122, in prepare_train_data
example = self.pipeline(input_dict)
File "/mnt/lustre/chen/hzha/mmdetection/mmdet/datasets/pipelines/compose.py", line 40, in __call__
data = t(data)
File "/mnt/lustre/chen/hzha/SimIPU/mmdet3d/datasets/pipelines/transforms_3d.py", line 185, in __call__
img=img)
File "/mnt/lustre/chen/hzha/SimIPU/mmdet3d/datasets/pipelines/dbsampler.py", line 388, in sample_all
avoid_coll_boxes_2d)
File "/mnt/lustre/chen/hzha/SimIPU/mmdet3d/datasets/pipelines/dbsampler.py", line 546, in sample_class_v2
sp_boxes_2d = np.stack([i['box2d_camera'] for i in sampled],
File "/mnt/lustre/chen/hzha/SimIPU/mmdet3d/datasets/pipelines/dbsampler.py", line 546, in
sp_boxes_2d = np.stack([i['box2d_camera'] for i in sampled],
KeyError: 'box2d_camera'
Oh, this issue is caused by the box2d_camera key in db_sampler. In 'tools/create_data.py', you can find the call to create_groundtruth_database, which generates the sampled objects used for data augmentation. Since we chose Moca as our baseline method, there are tons of modifications to this ground-truth database generation function.
Hence, if you created the KITTI dataset via the official mmdet3d codebase, I think you should run the create_groundtruth_database function (comment out the other lines of code in the kitti_data_prep function) in SimIPU (or Moca) to create the sampled object dataset. If you have created the sampled object dataset via our code and these bugs persist, please report them to me and I will check. I ran the code before pushing this repo to GitHub, so it should have been OK.
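One way to tell whether an existing dbinfos file was produced by the modified function is to count the entries that lack the box2d_camera key. The sketch below assumes the dbinfos pickle follows the standard mmdet3d layout (a dict mapping each class name to a list of per-object dicts). The helper names and the example path are hypothetical, not part of the repo.

```python
import pickle


def load_db_infos(path):
    """Load a dbinfos pickle, e.g. a (hypothetical) data/kitti/kitti_dbinfos_train.pkl."""
    with open(path, "rb") as f:
        return pickle.load(f)


def count_missing_key(db_infos, key="box2d_camera"):
    """Return {class_name: number of sampled objects that lack `key`}.

    Assumes db_infos maps class name -> list of dicts (standard mmdet3d layout).
    """
    return {
        cls: sum(1 for info in infos if key not in info)
        for cls, infos in db_infos.items()
    }


# Tiny in-memory stand-in for a real dbinfos file.
example = {
    "Car": [{"box2d_camera": [0, 0, 10, 10]}, {"name": "Car"}],
    "Pedestrian": [{"box2d_camera": [5, 5, 8, 9]}],
}
print(count_missing_key(example))  # {'Car': 1, 'Pedestrian': 0}
```

Any non-zero count would suggest the database was generated by a version of the function that does not write box2d_camera.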
Thanks a lot. I used the official mmdet3d to create the data label before. I'll follow your instruction to run the create_groundtruth_database function.
Hi. I tried to run the create_groundtruth_database function. But it seems we go back to the previous problem:
[ ] 0/3712, elapsed: 0s, ETA:Traceback (most recent call last):
File "tools/create_data.py", line 247, in
out_dir=args.out_dir)
File "tools/create_data.py", line 44, in kitti_data_prep
with_bbox=True) # for moca
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/create_gt_database.py", line 275, in create_groundtruth_database
P0 = np.array(example['P0']).reshape(4, 4)
KeyError: 'P0'
Let me explain why these problems occur. We first conducted experiments on the KITTI dataset, where the images come from the second camera. So, when creating the KITTI data, every PX should be P2 (i.e., use the camera parameters of the second camera). Later, we ran experiments on Waymo, where the images used are from the front view, which has camera ID 0. Hence, we hacked the code to generate the related data with P0.
However, when I pushed the code that only supports KITTI, I forgot to change the data-related code back to the KITTI version. That is why you hit KeyError: 'P0'. For KITTI, just use P2. :D
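The P0/P2 mismatch can be made explicit by looking up the projection matrix by camera ID instead of hard-coding the key. This is a minimal sketch, not the repo's code: get_projection and its arguments are hypothetical names. It assumes the matrix is stored under a key such as 'P2' as 16 values (flat or as 4 rows), mirroring the example['P0'] access in the traceback.

```python
def get_projection(example, camera_id=2):
    """Return the 4x4 projection matrix for `camera_id` as nested lists.

    KITTI images come from camera 2, so the default key is 'P2';
    the Waymo front camera would use camera_id=0 ('P0').
    """
    key = f"P{camera_id}"
    if key not in example:
        available = sorted(k for k in example if k.startswith("P"))
        raise KeyError(f"{key} not found; available projection keys: {available}")

    values = example[key]
    # Accept either 4 rows of 4 values or a flat sequence of 16 values.
    if values and isinstance(values[0], (list, tuple)):
        flat = [v for row in values for v in row]
    else:
        flat = list(values)
    if len(flat) != 16:
        raise ValueError(f"{key} has {len(flat)} values, expected 16")
    return [flat[i:i + 4] for i in range(0, 16, 4)]


kitti_example = {"P2": list(range(16))}
print(get_projection(kitti_example)[1])  # [4, 5, 6, 7]
```

Calling get_projection(kitti_example, camera_id=0) raises a KeyError that lists 'P2' as the available key, which points more directly at the cause than the bare KeyError: 'P0'.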
Hi, thanks for your help. I successfully created the labels after changing P0 --> P2. But the error still occurs when running: bash tools/dist_train.sh project_cl/configs/kitti_det3d/moca_r50_kitti.py 8 --work-dir work_dir/
Original Traceback (most recent call last):
File "/mnt/cache/chenzhuo1/anaconda3/envs/simipu/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/mnt/cache/chenzhuo1/anaconda3/envs/simipu/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/cache/chenzhuo1/anaconda3/envs/simipu/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
I will check this from scratch ASAP and update the repo. Btw, this problem only affects Moca training (our downstream task on 3D detection). Although the gt_sampler does not work, you can still run SimIPU, since our pre-training method does not need any GT information.
Yeah, I've tried the pretraining code, which is totally ok. Thanks for your help.
Hi @zhyever, I am running into the same error (KeyError: 'box2d_camera') for the downstream evaluation on Kitti dataset. Pretraining step does not have any issue. Let me know if there is an update. Thanks for the help!
Hi, is there any update on solving this problem?
Sorry for the late reply.
Download the pkl and the zipped gt_database.
Rename the pkl file to kitti_dbinfos_train.pkl and put it under your data folder. Unzip the .zip file, rename the folder to kitti_gt_database, and put it under your data folder.
The resulting data folder should then contain both kitti_dbinfos_train.pkl and the kitti_gt_database folder.
Then, run the training script again.
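Before re-running, it may help to check that both items landed where expected. The following is a small sketch. check_gt_database is a hypothetical helper, and the data folder path is whatever your config points to (the thread does not give it).

```python
from pathlib import Path


def check_gt_database(data_root):
    """Return a list of problems found under `data_root` (empty if none).

    Checks only for the two names given in the instructions above:
    kitti_dbinfos_train.pkl (a file) and kitti_gt_database (a folder).
    """
    root = Path(data_root)
    problems = []
    if not (root / "kitti_dbinfos_train.pkl").is_file():
        problems.append("missing file: kitti_dbinfos_train.pkl")
    if not (root / "kitti_gt_database").is_dir():
        problems.append("missing folder: kitti_gt_database")
    return problems
```

For example, check_gt_database("path/to/your/data/folder") returns an empty list when both the renamed pkl file and the unzipped folder are in place.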
Thanks a lot for your reply. It seems the data problem is solved, but there are still some problems during training.
Traceback (most recent call last):
File "tools/train.py", line 222, in
find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss.
If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable).
Parameter indices which did not receive grad for rank 0: 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 ...
In addition, you can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print out information about which particular parameters did not receive gradient on this rank as part of this error
I tried to pass the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, but it doesn't work.
Set this flag in your config file instead of passing it via the shell.
You can add the line find_unused_parameters=True to your config file.
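In config terms, this is a single top-level assignment in the Python config file passed to the training script. The sketch below assumes the installed mmdet training code picks the option up from the config (in the versions I am aware of, via something like cfg.get('find_unused_parameters', False)) and forwards it to the distributed wrapper. Treat that detail as an assumption about your mmdet version.

```python
# Added at the top level of the training config,
# e.g. project_cl/configs/kitti_det3d/moca_r50_kitti.py
find_unused_parameters = True
```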
Yes. It works! Many thanks for your help.
Thanks @zhyever. The fine-tuning on KITTI 3D detection is resolved now. But there seems to be an error during evaluation (after 30 epochs). Here is the error log.
File "tools/train.py", line 222, in <module>
main()
File "tools/train.py", line 218, in main
meta=meta)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/apis/train.py", line 34, in train_model
meta=meta)
File "/home/bhavya.goyal/Documents/SimIPU/mmdetection/mmdet/apis/train.py", line 170, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
self.call_hook('after_train_epoch')
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
getattr(hook, fn_name)(self)
File "/home/bhavya.goyal/Documents/SimIPU/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 279, in after_train_epoch
key_score = self.evaluate(runner, results)
File "/home/bhavya.goyal/Documents/SimIPU/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 177, in evaluate
results, logger=runner.logger, **self.eval_kwargs)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/datasets/kitti_dataset.py", line 412, in evaluate
eval_types=eval_types)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/core/evaluation/kitti_utils/eval.py", line 709, in kitti_eval
eval_types)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/core/evaluation/kitti_utils/eval.py", line 613, in do_eval
min_overlaps)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/core/evaluation/kitti_utils/eval.py", line 479, in eval_class
rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/core/evaluation/kitti_utils/eval.py", line 382, in calculate_iou_partly
dt_boxes).astype(np.float64)
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/core/evaluation/kitti_utils/eval.py", line 116, in bev_box_overlap
from .rotate_iou import rotate_iou_gpu_eval
File "/home/bhavya.goyal/Documents/SimIPU/mmdet3d/core/evaluation/kitti_utils/rotate_iou.py", line 292, in <module>
criterion=-1):
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/decorators.py", line 101, in kernel_jit
kernel.bind()
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/compiler.py", line 548, in bind
self._func.get()
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/compiler.py", line 426, in get
ptx = self.ptx.get()
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/compiler.py", line 397, in get
**self._extra_options)
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/cudadrv/nvvm.py", line 496, in llvm_to_ptx
ptx = cu.compile(**opts)
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/cudadrv/nvvm.py", line 233, in compile
self._try_error(err, 'Failed to compile\n')
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/cudadrv/nvvm.py", line 251, in _try_error
self.driver.check_error(err, "%s\n%s" % (msg, self.get_log()))
File "/home/bhavya.goyal/miniconda3/envs/simipuenv2/lib/python3.7/site-packages/numba/cuda/cudadrv/nvvm.py", line 141, in check_error
raise exc
numba.cuda.cudadrv.error.NvvmError: Failed to compile
<unnamed> (66, 23): parse expected comma after load's type
NVVM_ERROR_COMPILATION
Hi, thanks for sharing your great work. I encountered some issues when creating data by running create_data.py.
First create reduced point cloud for training set
[ ] 0/3712, elapsed: 0s, ETA:Traceback (most recent call last):
File "tools/create_data.py", line 247, in
out_dir=args.out_dir)
File "tools/create_data.py", line 24, in kitti_data_prep
kitti.create_reduced_point_cloud(root_path, info_prefix)
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/kitti_converter.py", line 374, in create_reduced_point_cloud
_create_reduced_point_cloud(data_path, train_info_path, save_path)
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/kitti_converter.py", line 314, in _create_reduced_point_cloud
count=-1).reshape([-1, num_features])
ValueError: cannot reshape array of size 461536 into shape (6)
It seems I should set num_features=4 and front_camera_id=2 in this line: https://github.com/zhyever/SimIPU/blob/5b346e392c161a5e9fdde09b1692656bc7cd3faf/tools/data_converter/kitti_converter.py#L291
I assumed that doing this would solve the problem, but I then hit another problem at "Create GT Database of KittiDataset":
[ ] 0/3712, elapsed: 0s, ETA:Traceback (most recent call last):
File "tools/create_data.py", line 247, in
out_dir=args.out_dir)
File "tools/create_data.py", line 44, in kitti_data_prep
with_bbox=True) # for moca
File "/mnt/lustre/chenzhuo1/hzha/SimIPU/tools/data_converter/create_gt_database.py", line 275, in create_groundtruth_database
P0 = np.array(example['P0']).reshape(4, 4)
KeyError: 'P0'
Can you help me figure out how to solve these issues?
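On the reshape error in this post: 461536 values cannot be split into rows of 6, but they divide evenly into rows of 4, which matches the four values per point (x, y, z, intensity) stored in KITTI velodyne .bin files. A small sketch of that arithmetic, using a hypothetical helper name:

```python
def feasible_num_features(num_values, candidates=(4, 5, 6)):
    """Return the candidate row widths that divide the flat array evenly."""
    return [n for n in candidates if num_values % n == 0]


print(feasible_num_features(461536))  # [4]
print(461536 // 4)                    # 115384 points
```

Divisibility alone does not prove the layout, but here it rules out num_features=6 and is consistent with num_features=4.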