BICLab / EMS-YOLO

Official implementation of "Deep Directly-Trained Spiking Neural Networks for Object Detection" (ICCV2023)
https://arxiv.org/abs/2307.11411
GNU General Public License v3.0

The provided code doesn't align with the paper #15

Open JDYG opened 5 months ago

JDYG commented 5 months ago

Hi, I found some discrepancies between your code and the paper.

  1. What is the input to the Detect head? According to the paper, the last membrane potential of the neurons is fed into each detector. However, in the provided code, the input to the Detect layer comes from BasicBlock_ms, so what the Detect layer receives is a convolution of spikes rather than a membrane potential.

Could you please explain where you used the membrane potential as the training data for the detection?
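
To illustrate the distinction this question turns on, here is a minimal sketch of a single leaky integrate-and-fire (LIF) neuron in plain Python. It is not taken from the repository: the function name `lif_run`, the threshold of 0.5, the decay of 0.25, and the hard reset are all assumptions chosen only for illustration. It shows that the spike output is binary, while the membrane potential is real-valued, so the two are different signals to hand to a detector.

```python
def lif_run(inputs, threshold=0.5, decay=0.25):
    """Simulate one LIF neuron; return (spikes, membrane potentials).

    Hypothetical helper: threshold, decay and the hard reset are
    illustrative assumptions, not values confirmed by the paper or repo.
    """
    v = 0.0
    spikes, potentials = [], []
    for x in inputs:
        v = decay * v + x          # leaky integration of the input current
        potentials.append(v)       # membrane potential before any reset
        fired = 1 if v >= threshold else 0
        spikes.append(fired)
        if fired:
            v = 0.0                # hard reset after a spike
    return spikes, potentials


inputs = [0.2, 0.3, 0.6, 0.1]
spikes, potentials = lif_run(inputs)
print("spikes:", spikes)                        # binary train
print("last membrane potential:", potentials[-1])  # real-valued
```

With these inputs the neuron fires only at the third step, so the last spike is 0 while the last membrane potential is a small positive value. A head fed with the potential therefore sees information that the spike train alone does not carry.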

  2. Which model corresponds to Figure 2 in your paper?

In README.md, you use the ResNet34 model, which I guess corresponds to resnet34.yaml. However, when I check resnet34.yaml, it seems that only BasicBlock_2 is used in the backbone and MS_block isn't used anywhere in the network, which is not consistent with the EMS-Module2 presented in Figure 2.

  3. Where should the MaxPool be? In the paper, the MaxPool operation is applied first, followed by the LCB block. However, in the Concat_res2 and BasicBlock_ms code, the LCB block is applied first, and the MaxPool operation is performed after concatenation.

  4. The parameter count is not consistent with the data in the paper. In Table 2, the authors state that EMS-Res10 has 6.20M parameters. However, the provided trained weights (ResNet34) contain 33.94M parameters. If ResNet34 is necessary to achieve good results, why is Res10 presented in the paper? Could you please provide the trained weights for EMS-Res10?

```python
# Code used to load the model and print its summary
from models.experimental import attempt_load
from utils.torch_utils import model_info

w_path = './best.pt'
model = attempt_load(w_path)
print(model)
model_info(model)

# Output: Model Summary: 325 layers, 33940542 parameters, 0 gradients, 0.0 GFLOPs
```

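
On the MaxPool ordering question above, the order matters because max pooling and a weighted layer do not commute in general. The following is a small sketch in plain Python under simplifying assumptions: a 1-D signal, a pooling window of 2, and a single pointwise weight standing in for the convolution. The helper names `max_pool_1d` and `pointwise` are hypothetical and do not come from the repository.

```python
def max_pool_1d(xs, k=2):
    """Non-overlapping 1-D max pooling with window and stride k."""
    return [max(xs[i:i + k]) for i in range(0, len(xs) - k + 1, k)]


def pointwise(xs, w):
    """Stand-in for a 1x1 convolution with a single weight w."""
    return [w * x for x in xs]


spikes_in = [1, 0, 0, 1]   # toy binary spike input
w = -1                     # one negative weight is enough to show the effect

pool_first = pointwise(max_pool_1d(spikes_in), w)   # MaxPool, then weights
conv_first = max_pool_1d(pointwise(spikes_in, w))   # weights, then MaxPool

print(pool_first)  # [-1, -1]
print(conv_first)  # [0, 0]
```

The two orderings give different outputs even in this toy case, so a network that pools after the LCB block is not the same function as one that pools before it.
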
weinijuan commented 2 days ago

I think so.