Cliu2 / MTrans

The PyTorch implementation of 'Multimodal Transformer for Automatic 3D Annotation and Object Detection'.
Mozilla Public License 2.0

Assertion error subcloud_2d must fall inside specified image size #3

Closed paathelb closed 1 year ago

paathelb commented 1 year ago

Hello, I encountered this error while training.

===== START TRAINING ===== test

T-0 L:10.91, Seg:56.97, XYZ:3.73, IoU:17.34, R:0.10, Dr:51.20, Cf: 12.75: 100%|██████████| 499/499 [02:40<00:00, 3.11it/s]
T-1 L:7.31, Seg:69.94, XYZ:1.99, IoU:32.78, R:0.70, Dr:55.52, Cf: 13.49: 100%|██████████| 499/499 [02:41<00:00, 3.10it/s]
T-2 L:6.29, Seg:73.54, XYZ:1.50, IoU:37.30, R:1.45, Dr:60.53, Cf: 13.76: 100%|██████████| 499/499 [02:41<00:00, 3.10it/s]
T-3 L:5.79, Seg:75.41, XYZ:1.35, IoU:41.22, R:4.01, Dr:64.24, Cf: 13.34: 100%|██████████| 499/499 [02:41<00:00, 3.08it/s]
T-4 L:5.29, Seg:77.45, XYZ:1.20, IoU:44.49, R:5.97, Dr:68.00, Cf: 13.13: 100%|██████████| 499/499 [02:41<00:00, 3.10it/s]
T-5 L:4.90, Seg:78.83, XYZ:1.04, IoU:46.08, R:6.87, Dr:73.57, Cf: 12.19: 100%|██████████| 499/499 [02:41<00:00, 3.09it/s]
T-6 L:4.80, Seg:79.50, XYZ:1.08, IoU:47.87, R:9.73, Dr:74.72, Cf: 12.14: 100%|██████████| 499/499 [02:41<00:00, 3.09it/s]
T-7 L:4.61, Seg:79.85, XYZ:0.99, IoU:48.91, R:11.08, Dr:77.78, Cf: 11.99: 100%|██████████| 499/499 [02:40<00:00, 3.11it/s]
T-8 L:4.38, Seg:80.94, XYZ:0.92, IoU:50.18, R:12.24, Dr:78.23, Cf: 11.90: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-9 L:4.31, Seg:81.44, XYZ:0.92, IoU:51.13, R:14.04, Dr:77.98, Cf: 11.77: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-10 L:4.27, Seg:81.32, XYZ:0.89, IoU:51.78, R:15.85, Dr:77.63, Cf: 11.89: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-11 L:4.24, Seg:81.34, XYZ:0.94, IoU:52.46, R:17.85, Dr:80.39, Cf: 11.90: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-12 L:4.15, Seg:81.45, XYZ:0.85, IoU:52.64, R:18.15, Dr:79.99, Cf: 12.05: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-13 L:3.98, Seg:82.50, XYZ:0.84, IoU:53.88, R:20.56, Dr:81.49, Cf: 11.92: 100%|██████████| 499/499 [02:39<00:00, 3.13it/s]
T-14 L:3.91, Seg:82.76, XYZ:0.82, IoU:54.59, R:21.41, Dr:82.45, Cf: 11.88: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-15 L:3.86, Seg:82.69, XYZ:0.82, IoU:55.13, R:23.22, Dr:82.80, Cf: 11.20: 100%|██████████| 499/499 [02:39<00:00, 3.13it/s]
T-16 L:3.75, Seg:83.41, XYZ:0.79, IoU:55.82, R:22.72, Dr:83.80, Cf: 10.78: 100%|██████████| 499/499 [02:39<00:00, 3.12it/s]
T-17 L:2.91, Seg:89.74, XYZ:0.81, IoU:64.15, R:0.00, Dr:75.00, Cf: 9.74: 0%|▏ | 1/499 [00:01<13:03, 1.57s/it]

Traceback (most recent call last):
  File "train.py", line 335, in <module>
    main(cfg, args.cfg_file)
  File "train.py", line 306, in main
    train_one_epoch(cfg, model, training_loader, unlabeled_training_loader, optim, scheduler, counter, histo_counter, epoch, writer)
  File "train.py", line 79, in train_one_epoch
    unlabeled_data = next(unlabeled_iter)
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1179, in _next_data
    return self._process_data(data)
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hpaat/.conda/envs/mtrans/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/import/home/hpaat/my_exp/MTrans/datasets/kitti_detection.py", line 233, in __getitem__
    obj = self.load_object_full_data(obj)
  File "/import/home/hpaat/my_exp/MTrans/datasets/kitti_detection.py", line 283, in load_object_full_data
    assert np.logical_and.reduce([crop_sub_cloud2d[:, 0]>0, crop_sub_cloud2d[:,0]<112, crop_sub_cloud2d[:, 1]>0, crop_sub_cloud2d[:,1]<112]).all()
AssertionError

Any way to fix this?

smueleg commented 1 year ago

I have the same error!

Cliu2 commented 1 year ago

Hi, I reran the code with the default configuration, as in this file, but could not reproduce this error. Have you made any modifications to the code or the configuration?

Can you try printing the min and max values of 'crop_sub_cloud2d' when the error happens? It may be a rounding error introduced when the image and the 2D coordinates are resized.
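One way to collect those values is sketched below. This is only an illustration: the helper name `describe_points_2d` and the sample points are made up, and the default `out_shape=112` is taken from the bound in the assertion in the traceback.

```python
import numpy as np

def describe_points_2d(points_2d, out_shape=112):
    """Summarize per-axis min/max and list rows that violate 0 < value < out_shape.

    Hypothetical debugging helper, not part of the MTrans code base.
    """
    points_2d = np.asarray(points_2d, dtype=float)
    # Same strict check as the failing assertion, kept per point instead of reduced with .all()
    inside = np.logical_and.reduce([
        points_2d[:, 0] > 0, points_2d[:, 0] < out_shape,
        points_2d[:, 1] > 0, points_2d[:, 1] < out_shape,
    ])
    return {
        "x_min": float(points_2d[:, 0].min()),
        "x_max": float(points_2d[:, 0].max()),
        "y_min": float(points_2d[:, 1].min()),
        "y_max": float(points_2d[:, 1].max()),
        "offending": points_2d[~inside],
    }

# Made-up sample: the first point sits exactly on the left border (x == 0).
report = describe_points_2d([[0.0, 40.0], [55.5, 60.0], [111.2, 79.4]])
print(report)
```

Calling such a helper just before the assertion and printing the result shows which points, if any, sit on or outside the border.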

smueleg commented 1 year ago

I get

out_shape: 112
crop_sub_cloud2d[:, 0].min(): 0.0
crop_sub_cloud2d[:, 0].max(): 111.2745253164557
crop_sub_cloud2d[:, 1].min(): 31.632940751087816
crop_sub_cloud2d[:, 1].max(): 79.43735475178006

Should >= and <= be used instead of > and < in this assertion?

assert np.logical_and.reduce([crop_sub_cloud2d[:, 0]>0, crop_sub_cloud2d[:,0]<out_shape, crop_sub_cloud2d[:, 1]>0, crop_sub_cloud2d[:,1] <out_shape]).all()
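To make the difference concrete, here is a small sketch that evaluates both variants on the extremes reported above. The pairing of the x and y values into two points is an assumption (only the per-axis min and max are known), and `all_inside` is a hypothetical helper.

```python
import numpy as np

out_shape = 112

# Assumption: the reported per-axis extremes are combined into two illustrative points.
crop_sub_cloud2d = np.array([
    [0.0, 31.632940751087816],
    [111.2745253164557, 79.43735475178006],
])

def all_inside(points, size, inclusive):
    """Check that every coordinate lies in (0, size), or in [0, size] when inclusive is True."""
    lower = points >= 0 if inclusive else points > 0
    upper = points <= size if inclusive else points < size
    return bool((lower & upper).all())

strict_ok = all_inside(crop_sub_cloud2d, out_shape, inclusive=False)
inclusive_ok = all_inside(crop_sub_cloud2d, out_shape, inclusive=True)
print(strict_ok, inclusive_ok)  # prints: False True
```

The strict check fails only because of the x value of exactly 0.0. One caveat with the inclusive variant: if the coordinates are later used as integer pixel indices, a value of exactly `out_shape` would be out of range for an image of that size.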

Cliu2 commented 1 year ago

I think the problem is caused by some 3D points being projected exactly onto the 2D box border. This should happen only rarely, but when it does, it triggers the assertion error.

Thanks @smueleg for the proposed solution.

We have also pushed a different fix that is closer to our original intention. Please see the new commit, at line 215 of kitti_detection.py. Specifically, we changed >= and <= to > and <, so that points located exactly on the border are omitted.

We have tested this fix, and it does not harm accuracy.
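The idea of that fix can be sketched as follows. This is not the code from kitti_detection.py: the function name, the `(left, top, right, bottom)` box layout, and the linear rescaling to `out_shape` are assumptions made for illustration.

```python
import numpy as np

def crop_points_to_box(points_2d, box, out_shape=112):
    """Keep points strictly inside the box, then rescale them to an out_shape x out_shape crop.

    Hypothetical helper; box is assumed to be (left, top, right, bottom) in pixels.
    """
    left, top, right, bottom = box
    pts = np.asarray(points_2d, dtype=float)
    # Strict inequalities: points lying exactly on the border are dropped.
    keep = (
        (pts[:, 0] > left) & (pts[:, 0] < right)
        & (pts[:, 1] > top) & (pts[:, 1] < bottom)
    )
    pts = pts[keep]
    scale = np.array([out_shape / (right - left), out_shape / (bottom - top)])
    return (pts - np.array([left, top])) * scale

# Made-up example: two of the four points lie exactly on the box border.
box = (100, 50, 200, 150)
points = [[100, 80], [150, 100], [200, 150], [199, 51]]
cropped = crop_points_to_box(points, box)

# The strict assertion from the traceback now holds for the remaining points.
assert np.logical_and.reduce([
    cropped[:, 0] > 0, cropped[:, 0] < 112,
    cropped[:, 1] > 0, cropped[:, 1] < 112,
]).all()
print(cropped.shape)  # prints: (2, 2)
```

Filtering with strict inequalities before rescaling means no remaining point maps to exactly 0 or `out_shape`, so the original strict assertion can stay unchanged.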

smueleg commented 1 year ago

Thanks for your answer. Despite the fix, I'm still getting the same error.

Cliu2 commented 1 year ago

> Thanks for your answer. Despite the fix, I'm still getting the same error.

Hi, have you deleted the 'gt_base' and 'processed' directories under the dataset root? The dataset needs to be rebuilt, so you have to delete them manually; otherwise the preprocessing code is skipped because the files already exist.
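A minimal sketch of that cleanup is shown below. The helper `remove_cached_dirs` is hypothetical; only the directory names 'gt_base' and 'processed' come from this thread. The demo runs against a temporary directory so that no real dataset is touched.

```python
import shutil
import tempfile
from pathlib import Path

def remove_cached_dirs(dataset_root, names=("gt_base", "processed")):
    """Delete the named cache directories under dataset_root and return those removed.

    Hypothetical helper, not part of the MTrans code base.
    """
    removed = []
    for name in names:
        target = Path(dataset_root) / name
        if target.is_dir():
            shutil.rmtree(target)  # removes the directory and everything inside it
            removed.append(name)
    return removed

# Demo on a throwaway directory tree; 'training' stands in for the raw data.
with tempfile.TemporaryDirectory() as root:
    for name in ("gt_base", "processed", "training"):
        (Path(root) / name).mkdir()
    removed = remove_cached_dirs(root)
    remaining = sorted(p.name for p in Path(root).iterdir())

print(removed, remaining)  # prints: ['gt_base', 'processed'] ['training']
```

For a real run, pass the actual dataset root instead of the temporary directory, and double-check the path first, since `shutil.rmtree` deletes without confirmation.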

smueleg commented 1 year ago

You're right, I didn't. Now it works, thank you!

paathelb commented 1 year ago

Thanks! It works for me now, too!