norlab-ulaval / mask_bev

Source code for "MaskBEV: Joint Object Detection and Footprint Completion for Bird's-eye View 3D Point Clouds"
MIT License

AttributeError: 'MaskBevModule' object has no attribute '_train_metric_per_layer'. Did you mean: '_val_metric_per_layer'? #2

Open HaiCLi opened 2 months ago

HaiCLi commented 2 months ago

Hi,

I ran into this problem when I tried to use "configs/training/kitti/00_quick_test.yml" as the training configuration. Are there any solutions?

[screenshot of the AttributeError traceback]

Thanks!

willGuimont commented 2 months ago

Hi!

Thanks for pointing this out. It seems the issue is related to the version of pytorch-lightning. I’ve updated the requirements.txt to include the correct version.

Make sure you're using pytorch-lightning==1.9.5 with:

pip install pytorch-lightning==1.9.5
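
You can double-check which version is actually active in your environment with:

python -c "import pytorch_lightning; print(pytorch_lightning.__version__)"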

Let me know if the issue persists!

HaiCLi commented 2 months ago

The error still occurs:

[screenshot of the same AttributeError]

pytorch-lightning version:

[screenshot of the installed pytorch-lightning version]

The command I'm running:

python3.10 train_mask_bev.py --config configs/training/kitti/00_quick_test.yml

willGuimont commented 2 months ago

I’ve just pushed a fix for the bug in the code. Could you check if it works on your end? Let me know how it goes!

HaiCLi commented 2 months ago

> I’ve just pushed a fix for the bug in the code. Could you check if it works on your end? Let me know how it goes!

Thanks for your reply! I fixed this myself, but ran into another problem. I will upload a screenshot tomorrow.

HaiCLi commented 2 months ago
"python train_mask_bev.py --config configs/training/kitti/00_quick_test.yml 
Global seed set to 420
Using GPUs [0, 1]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1)` was configured so 1 batch per epoch will be used.
`Trainer(limit_val_batches=1)` was configured so 1 batch will be used.
Traceback (most recent call last):
  File "/app/mask_bev/train_mask_bev.py", line 107, in <module>
    trainer.fit(model, datamodule)
  File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
    call._call_and_handle_interrupt(
  File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/multiprocessing.py", line 113, in launch
    mp.start_processes(
  File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 189, in start_processes
    process.start()
  File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
  File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'MaskArea.__init__.<locals>.<lambda>'

I ran into this problem and searched for solutions online. Some suggestions were to set num_workers to 0, but that did not work for me. Did you also run into this problem when running the code? Thanks!
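
For reference, here is a minimal sketch of the failure mode, outside of MaskBEV (the class names WithLambda and WithPartial are made up for illustration): the spawn start method used for multi-GPU launching pickles the objects it sends to worker processes, and a lambda defined inside a method is a "local object" that pickle cannot serialize. The usual workaround is a module-level function or functools.partial.

```python
# Minimal sketch (illustrative names, not MaskBEV code): spawn-based
# multiprocessing pickles the objects it hands to workers, and a lambda
# defined inside a method cannot be pickled.
import functools
import pickle


def scale(x, factor):
    # A module-level function is picklable by reference.
    return x * factor


class WithLambda:
    def __init__(self):
        self.fn = lambda x: x * 2  # local lambda -> cannot be pickled


class WithPartial:
    def __init__(self):
        # functools.partial over a module-level function is picklable.
        self.fn = functools.partial(scale, factor=2)


if __name__ == '__main__':
    try:
        pickle.dumps(WithLambda())
    except (AttributeError, pickle.PicklingError) as e:
        print('lambda version fails:', e)
    print('partial version pickles to', len(pickle.dumps(WithPartial())), 'bytes')
```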

HaiCLi commented 2 months ago

"python train_mask_bev.py --config configs/training/kitti/00_quick_test.yml Global seed set to 420 Using GPUs [0, 1] GPU available: True (cuda), used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs Trainer(limit_train_batches=1) was configured so 1 batch per epoch will be used. Trainer(limit_val_batches=1) was configured so 1 batch will be used. Traceback (most recent call last): File "/app/mask_bev/train_mask_bev.py", line 107, in trainer.fit(model, datamodule) File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit call._call_and_handle_interrupt( File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, kwargs) File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/multiprocessing.py", line 113, in launch mp.start_processes( File "/root/miniconda3/envs/bev/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 189, in start_processes process.start() File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/process.py", line 121, in start self._popen = self._Popen(self) File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/context.py", line 288, in _Popen return Popen(process_obj) File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in init super().init(process_obj) File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/popen_fork.py", line 19, in init self._launch(process_obj) File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch reduction.dump(process_obj, fp) File "/root/miniconda3/envs/bev/lib/python3.10/multiprocessing/reduction.py", line 60, in dump ForkingPickler(file, protocol).dump(obj) AttributeError: Can't pickle local object 'MaskArea.init**..'"

I met this problem and search lots of solutions online. Some of suggestions is setting the num_worker to 0. I did it but still failed. Did you guys met this problem either during you run it? Thanks!

I found that this error happens when using multiple GPUs. Does this algorithm support multi-GPU training?

willGuimont commented 1 month ago

Hi,

I've just pushed a fix for multi-GPU training. There were a few new metrics that did not support DDP.
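
For context, the usual way to make a custom torchmetrics metric DDP-safe is to register its accumulators with add_state and a dist_reduce_fx, so that each process's partial sums are reduced before compute(). A minimal sketch (MeanMaskArea is a hypothetical name, not the actual MaskBEV metric):

```python
# Minimal sketch of a DDP-safe torchmetrics metric (hypothetical example,
# not the actual MaskBEV implementation).
import torch
from torchmetrics import Metric


class MeanMaskArea(Metric):
    def __init__(self):
        super().__init__()
        # States declared with add_state are reduced across processes
        # (here with a sum) before compute() runs under DDP.
        self.add_state('total', default=torch.tensor(0.0), dist_reduce_fx='sum')
        self.add_state('count', default=torch.tensor(0), dist_reduce_fx='sum')

    def update(self, areas: torch.Tensor) -> None:
        self.total += areas.sum()
        self.count += areas.numel()

    def compute(self) -> torch.Tensor:
        return self.total / self.count
```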

Does it work on your side?