Training pfld_mbv2n_112 gives `ValueError: operands could not be broadcast together`

sbocconi commented 8 months ago

Describe the bug

Training with config configs/pfld/pfld_mbv2n_112.py --cfg-options data_root=datasets/meter/ gives `ValueError: operands could not be broadcast together with shapes (1638,3) (6,)`

Environment

Environment you use when bug appears:

Python version: 3.10
PyTorch Version: torch==2.0.1
MMCV Version: 2.0.1
EdgeLab Version: na
Code you run

python tools/train.py configs/pfld/pfld_mbv2n_112.py --cfg-options data_root=datasets/meter/ epochs=10

The detailed error

Traceback (most recent call last):
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/tools/train.py", line 226, in <module>
    main()
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/tools/train.py", line 221, in main
    runner.train()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/sscma/models/detectors/fomo.py", line 99, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(**data, mode=mode)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/sscma/models/detectors/fomo.py", line 70, in forward
    return self.loss(inputs, data_samples)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmdet/models/detectors/single_stage.py", line 78, in loss
    losses = self.bbox_head.loss(x, batch_data_samples)
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/sscma/models/heads/fomo_head.py", line 112, in loss
    loss = self.loss_by_feat(pred, batch_gt_instances, batch_img_metas, batch_gt_instances_ignore)
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/sscma/models/heads/fomo_head.py", line 147, in loss_by_feat
    loss, cls_loss, bg_loss, P, R, F1 = multi_apply(self.lossFunction, preds, target)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/sscma/lib/python3.10/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/sscma/models/heads/fomo_head.py", line 185, in lossFunction
    P, R, F1 = self.get_pricsion_recall_f1(preds, data)
  File "/Users/SB/Projects/Software/Zephyros/Courses/Microcontrollers/ModelAssistant/sscma/models/heads/fomo_head.py", line 231, in get_pricsion_recall_f1
    if site in preds_index:
ValueError: operands could not be broadcast together with shapes (1638,3) (6,)

Additional context

Running on Mac M2, torch CPU-only, mmcv compiled from source. This line in ModelAssistant/sscma/models/heads/fomo_head.py is suspicious:

site = np.concatenate([ti, po], axis=0)

Up until that line the sizes were compatible. Changing that line to an np.sum of the arrays creates another problem later.
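For anyone hitting the same message, here is a minimal sketch of the shape mismatch, not the fix that went upstream. It assumes `ti` and `po` are 1-D arrays of length 3 and uses made-up values. `preds_index` is a stand-in with the shape from the error message, and `can_broadcast` and `row_in` are hypothetical helpers that are not part of sscma.

```python
import numpy as np

# Hypothetical stand-ins; the real arrays in fomo_head.py hold other values.
preds_index = np.zeros((1638, 3), dtype=np.int64)
ti = np.array([0, 4, 7])  # assumed to be 1-D, length 3
po = np.array([0, 5, 7])  # assumed to be 1-D, length 3
preds_index[10] = ti      # plant one matching row for the demo

# Concatenating two length-3 vectors along axis 0 yields a length-6 vector.
site = np.concatenate([ti, po], axis=0)
print(site.shape)  # (6,)


def can_broadcast(shape_a, shape_b):
    """Return True if NumPy can broadcast the two shapes together."""
    try:
        np.broadcast_shapes(shape_a, shape_b)
        return True
    except ValueError:
        return False


# Trailing dimensions 3 and 6 differ and neither is 1, so comparing
# the (1638, 3) array against the (6,) vector cannot broadcast.
print(can_broadcast(preds_index.shape, site.shape))  # False
print(can_broadcast(preds_index.shape, ti.shape))    # True


def row_in(rows, row):
    """Return True if `row` equals at least one complete row of `rows`."""
    return bool((rows == row).all(axis=1).any())


print(row_in(preds_index, ti))  # True
print(row_in(preds_index, po))  # False
```

A membership test such as `site in preds_index` compares the two arrays element-wise, so it needs shapes that broadcast. A length-3 vector can be compared against rows of a `(N, 3)` array, but a length-6 one cannot.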

MILK-BIOS commented 8 months ago

@sbocconi Thank you for your support! We have fixed the bug and you can try the latest version now!

sbocconi commented 8 months ago

Thanks, it works now!

Seeed-Studio / ModelAssistant #179