loiccordone / object-detection-with-spiking-neural-networks

Repository code for the IJCNN 2022 paper "Object Detection with Spiking Neural Networks on Automotive Event Data"
MIT License

Error in test step #5

Closed: xxyll closed this issue 2 years ago

xxyll commented 2 years ago

I'm sorry to ask you so many questions recently; I'm embarrassed to be stuck on a new error again. I couldn't find a solution online, so I wonder if you have encountered it. I ran python object_detection.py -backbone vgg-11 -T 5 -tbin 2 -b 8 -epochs 50 -save_ckpt -num_workers 2 for training, which produced three checkpoint files, and I put the best one in pretrained/vgg-11/ as the pretrained model. Then I tried to run python object_detection.py -backbone vgg-11 -T 5 -tbin 2 -b 8 -pretrained pretrained/vgg-11/the_best.ckpt -no_train -test -num_workers 4 for testing, but I get the following error.

    Using native 16bit precision.
    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    File loaded.
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
    Testing: 0it [00:00, ?it/s]
    Traceback (most recent call last):
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/contextlib.py", line 131, in __exit__
        self.gen.throw(type, value, traceback)
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/profiler/base.py", line 95, in profile
        yield action_name
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1079, in _run_evaluate
        eval_loop_results = self._evaluation_loop.run()
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 111, in run
        self.advance(*args, **kwargs)
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 110, in advance
        dl_outputs = self.epoch_loop.run(
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 111, in run
        self.advance(*args, **kwargs)
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 110, in advance
        output = self.evaluation_step(batch, batch_idx, dataloader_idx)
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 150, in evaluation_step
        output = self.trainer.accelerator.test_step(step_kwargs)
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 226, in test_step
        return self.training_type_plugin.test_step(*step_kwargs.values())
      File "/home/lxy/anaconda3/envs/SNN-SJ/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 181, in test_step
        return self.model.test_step(*args, **kwargs)
      File "/home/lxy/Experiment/object-detection-with-spiking-neural-networks/object_detection_module.py", line 106, in test_step
        return self.step(batch, batch_idx, mode="test")
      File "/home/lxy/Experiment/object-detection-with-spiking-neural-networks/object_detection_module.py", line 97, in step
        return loss
    UnboundLocalError: local variable 'loss' referenced before assignment

Some people online solved "local variable 'loss' referenced before assignment" by adjusting the batch size, so I tried reducing batch_size to 2, but it didn't help. I also tried changing num_workers to 0 or 2, but then I get the warning "test dataloader 0, does not have many workers which may be a bottleneck...".

loiccordone commented 2 years ago

Hello, try replacing line 97, return loss, with:

if mode != "test":
    return loss

The variable loss was indeed undefined in the test step, thanks for pointing it out!
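For anyone hitting the same traceback, here is a minimal, self-contained sketch of why the guarded return fixes the UnboundLocalError. This is not the repository's actual step method: the structure is only assumed from the traceback, and sum(batch) is a stand-in for the real loss computation.

    # Sketch of the failure mode and the guarded-return fix (hypothetical code,
    # not the repository's object_detection_module.py).
    def step(batch, mode):
        if mode != "test":
            loss = sum(batch)  # loss is only assigned outside the test step
        # An unconditional `return loss` here would raise UnboundLocalError when
        # mode == "test", because `loss` was never assigned on that code path.
        # Guarding the return, as suggested above, avoids it:
        if mode != "test":
            return loss

    print(step([1.0, 2.0], mode="train"))  # 3.0
    print(step([1.0, 2.0], mode="test"))   # None, and no error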

xxyll commented 2 years ago

Thanks, the problem is solved! Hope it can also help others~

xxyll commented 2 years ago

Hello, I'd like to discuss the COCO mAP results with you. The default batch size in your code is 64, but given the actual capacity of my GPU I set the batch size to 8 for training and 16 for testing. My mAP after training is 0.056, and after testing it is 0.016. Do such results depend heavily on the batch size?

ghost commented 2 years ago

Hello @xxyll, I am having an issue during training; could you please look into Issue #2? Could you also share the environment specifications you used for training? Would you be interested in connecting so you can help me? You can e-mail wozkoz@protonmail.com.