BICLab / EMS-YOLO

Official implementation of "Deep Directly-Trained Spiking Neural Networks for Object Detection" (ICCV 2023)
https://arxiv.org/abs/2307.11411
GNU General Public License v3.0

The pretrained model on the Gen1 dataset #13

Open Orekishiro opened 4 months ago

Orekishiro commented 4 months ago

In your paper, you use the EMS-Res10 model and achieve 0.267 mAP on the Gen1 dataset, but when I trained on Gen1 with the framework you provided, I couldn't get good results. I don't know whether something went wrong in my training stage, so could you provide the trained model on the Gen1 dataset?

Orekishiro commented 4 months ago

results

108360215 commented 4 months ago

Excuse me, did you encounter the following issue when training on Gen1 data?

File "D:\ems\EMS_Origin\EMS-YOLO\g1\models\yolo.py", line 128, in _forward_once
    x = m(x)  # run
File "C:\Users\user\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "D:\ems\EMS_Origin\EMS-YOLO\g1\models\common.py", line 162, in forward
    return self.bn(self.conv(x))
File "C:\Users\user\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "D:\ems\EMS_Origin\EMS-YOLO\g1\models\common.py", line 190, in forward
    c1[i] = F.conv2d(input[i], weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [3, 32, 2, 2], expected input[1, 3, 256, 256] to have 32 channels, but got 3 channels instead

How did you solve this problem? Thanks! If you can help me, I'd really appreciate it!

Orekishiro commented 4 months ago

Excuse me, did you encounter the following issue when training on Gen1 data?

File "D:\ems\EMS_Origin\EMS-YOLO\g1\models\yolo.py", line 128, in _forward_once
    x = m(x)  # run
File "C:\Users\user\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "D:\ems\EMS_Origin\EMS-YOLO\g1\models\common.py", line 162, in forward
    return self.bn(self.conv(x))
File "C:\Users\user\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
File "D:\ems\EMS_Origin\EMS-YOLO\g1\models\common.py", line 190, in forward
    c1[i] = F.conv2d(input[i], weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [3, 32, 2, 2], expected input[1, 3, 256, 256] to have 32 channels, but got 3 channels instead

How did you solve this problem? Thanks! If you can help me, I'd really appreciate it!

I didn't encounter this issue. It looks like a channel-configuration problem; I'd suggest checking the config in the *.yaml file.
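For reference, the error itself only says that the convolution it hits was built for 32 input channels while the tensor flowing in has 3, which usually points at the in/out channel numbers (or layer order) in the model .yaml. A minimal sketch that reproduces the mismatch in plain PyTorch, outside of EMS-YOLO:

```python
import torch
import torch.nn.functional as F

# Weight layout for F.conv2d is [out_channels, in_channels, kH, kW].
# A [3, 32, 2, 2] weight therefore expects a 32-channel input.
weight = torch.randn(3, 32, 2, 2)

x_bad = torch.randn(1, 3, 256, 256)    # 3-channel input  -> channel mismatch
x_ok = torch.randn(1, 32, 256, 256)    # 32-channel input -> works

try:
    F.conv2d(x_bad, weight, stride=2)
except RuntimeError as e:
    print(e)  # "... expected input[1, 3, 256, 256] to have 32 channels, but got 3 channels instead"

print(F.conv2d(x_ok, weight, stride=2).shape)  # torch.Size([1, 3, 128, 128])
```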

108360215 commented 4 months ago

I modified yolo.py and common.py from the YOLOv3 source code; did you change these too? I have a favor to ask: could you please share the code you use for training on Gen1 data? I've been stuck on this architecture for a while and would greatly appreciate your help. "j88239806@gmail.com" is my Google Drive and Gmail.

zhuang5252 commented 4 months ago

In your paper, you use the EMS-Res10 model and achieve 0.267 mAP on the Gen1 dataset, but when I trained on Gen1 with the framework you provided, I couldn't get good results. I don't know whether something went wrong in my training stage, so could you provide the trained model on the Gen1 dataset?

Hello, may I ask how you downloaded the dataset? Could you please let me know?

Orekishiro commented 4 months ago

In your paper, you use the EMS-Res10 model and achieve 0.267 mAP on the Gen1 dataset, but when I trained on Gen1 with the framework you provided, I couldn't get good results. I don't know whether something went wrong in my training stage, so could you provide the trained model on the Gen1 dataset?

Hello, may I ask how you downloaded the dataset? Could you please let me know?

You can download the dataset at https://www.prophesee.ai/2020/01/24/prophesee-gen1-automotive-detection-dataset/

108360215 commented 4 months ago

@Orekishiro I can train now! But when I ran val, my P, R, and mAP were all zero. Did you change val.py or something else? Or have you already tested on your test data?

Orekishiro commented 4 months ago

@Orekishiro I can train now! But when I ran val, my P, R, and mAP were all zero. Did you change val.py or something else? Or have you already tested on your test data?

I did not run val.py directly; the plot shown earlier in this issue was generated automatically by the framework after training finished, and I did not change that part. To get train_g1.py to run, I temporarily removed (commented out) some unused code in val.py, such as DetectMultiBackend and the plotting code, but that does not affect the inference results. As for your P, R, and mAP all being zero, I suspect the target label format is being read incorrectly; you could also check whether the loss curve is decreasing normally, since the model may simply not have learned well.

image
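One quick way to check the label format is to draw the cached boxes back onto the event frames before training. A hypothetical sketch (the file names are made up), assuming labels follow the YOLO convention class, x_center, y_center, w, h normalized to [0, 1] and the cached frame is an H x W x 3 array:

```python
import numpy as np
import cv2

frame = np.load("sample_00001_events.npy")                     # hypothetical cached event frame, H x W x 3
labels = np.loadtxt("sample_00001_labels.txt").reshape(-1, 5)  # cls, xc, yc, w, h (normalized)

h, w = frame.shape[:2]
img = cv2.normalize(frame.astype(np.float32), None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

for cls, xc, yc, bw, bh in labels:
    # convert normalized center format back to pixel corner coordinates
    x1, y1 = int((xc - bw / 2) * w), int((yc - bh / 2) * h)
    x2, y2 = int((xc + bw / 2) * w), int((yc + bh / 2) * h)
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 1)

cv2.imwrite("label_check.png", img)  # the boxes should sit on the objects in the event frame
```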

108360215 commented 4 months ago

Thanks for your reply! I'll take another look at the val-related parts and my label format. Would you mind adding me on Discord? My username is diopang. I'm a master's student currently working on object detection with SNNs on event cameras.

jsckdon commented 4 months ago

Thanks for your reply! I'll take another look at the val-related parts and my label format. Would you mind adding me on Discord? My username is diopang. I'm a master's student currently working on object detection with SNNs on event cameras.

May I ask how the event data should be placed in the folders? I have been confused about this for some time.

jsckdon commented 4 months ago

results

I see that in your results the mAP starts to drop after training for a while. Is this a problem with the code or something else?

Orekishiro commented 4 months ago

results

I see that in your results the mAP starts to drop after training for a while. Is this a problem with the code or something else?

I suspect it overfit, probably due to my parameter settings. Later I switched to a Res34 backbone and got roughly 0.3+ mAP.

jsckdon commented 4 months ago

results

I see that in your results the mAP starts to drop after training for a while. Is this a problem with the code or something else? I suspect it overfit, probably due to my parameter settings. Later I switched to a Res34 backbone and got roughly 0.3+ mAP.

Oh, I see. May I ask how you arranged the event dataset? I recently started running the code on the event dataset, but I keep getting the dataset layout wrong.

Orekishiro commented 4 months ago

results

I see that in your results the mAP starts to drop after training for a while. Is this a problem with the code or something else? I suspect it overfit, probably due to my parameter settings. Later I switched to a Res34 backbone and got roughly 0.3+ mAP.

Oh, I see. May I ask how you arranged the event dataset? I recently started running the code on the event dataset, but I keep getting the dataset layout wrong.

The logic of the EMS-YOLO framework, as I understand it, is to first cache the event representations and labels with give_g1_data.py and then load them with datasets_g1T.py. I modified the dataset-loading part: I put the data into separate train/val/test folders, decide which split to load based on the mode argument, and rewrote the loading logic. The data loading in EMS-YOLO appears to be based on datasets/gen1_od_dataset.py from https://github.com/loiccordone/object-detection-with-spiking-neural-networks/. The difference in the YOLOv3 framework used by EMS is that all labels have to be pre-loaded for the adaptive anchor (autoanchor) computation.
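For what it's worth, a rough sketch of that split-by-folder loading (the directory layout and file names here are my own assumptions, not the repo's actual code):

```python
import glob
import os

import numpy as np
import torch
from torch.utils.data import Dataset


class Gen1CachedDataset(Dataset):
    """Loads pre-cached event representations and YOLO-style labels
    from <root>/<mode>/, where mode is 'train', 'val', or 'test'."""

    def __init__(self, root, mode="train"):
        self.samples = sorted(glob.glob(os.path.join(root, mode, "*_events.npy")))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        ev_path = self.samples[idx]
        events = np.load(ev_path)                                        # e.g. (T, C, H, W)
        labels = np.load(ev_path.replace("_events.npy", "_labels.npy"))  # (N, 5): cls, xc, yc, w, h
        return torch.from_numpy(events).float(), torch.from_numpy(labels).float()


# Labels for all training samples can be read once up front (as YOLOv3 does),
# so that the k-means autoanchor step can run before training starts.
```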

jsckdon commented 4 months ago

results

I see that in your results the mAP starts to drop after training for a while. Is this a problem with the code or something else? I suspect it overfit, probably due to my parameter settings. Later I switched to a Res34 backbone and got roughly 0.3+ mAP.

Oh, I see. May I ask how you arranged the event dataset? I recently started running the code on the event dataset, but I keep getting the dataset layout wrong.

The logic of the EMS-YOLO framework, as I understand it, is to first cache the event representations and labels with give_g1_data.py and then load them with datasets_g1T.py. I modified the dataset-loading part: I put the data into separate train/val/test folders, decide which split to load based on the mode argument, and rewrote the loading logic. The data loading in EMS-YOLO appears to be based on datasets/gen1_od_dataset.py from https://github.com/loiccordone/object-detection-with-spiking-neural-networks/. The difference in the YOLOv3 framework used by EMS is that all labels have to be pre-loaded for the adaptive anchor (autoanchor) computation.

Thank you very much for your reply!

108360215 commented 4 months ago

@Orekishiro So you first use give_g1_data.py to generate the .npy files and then use datasets_g1T to create_dataloader, right?

Orekishiro commented 4 months ago

@Orekishiro So you first use give_g1_data.py to generate the .npy files and then use datasets_g1T to create_dataloader, right?

No, I didn't use the original loading code. I first find the label timestamps, crop the corresponding segment of events, store it as numpy, and generate the event representation at the same time; then I rewrote a Dataset that reads those numpy files. The logic is essentially the same as the original EMS. You can also refer to https://github.com/uzh-rpg/RVT — RVT provides h5 files for the Gen1 and 1Mpx datasets (including the raw events and a 20-channel event representation). Logically it's no different from reading the .dat files, but h5 takes up less disk space.
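A minimal sketch of that pipeline, assuming the events have already been decoded into a structured numpy array with fields t, x, y, p (polarity stored as 0/1); the actual Gen1 .dat decoding is omitted, and the function names are made up:

```python
import numpy as np


def slice_events(events, t_label, window_us=100_000):
    """Keep the events in the `window_us` microseconds preceding a label timestamp."""
    mask = (events["t"] > t_label - window_us) & (events["t"] <= t_label)
    return events[mask]


def to_histogram(events, height=240, width=304, bins=5):
    """Simple (2*bins, H, W) event-count representation: one polarity pair per time bin."""
    rep = np.zeros((2 * bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return rep
    t0, t1 = events["t"][0], events["t"][-1]
    b = np.clip(((events["t"] - t0) / max(t1 - t0, 1) * bins).astype(int), 0, bins - 1)
    # accumulate event counts into the (channel, y, x) cells
    np.add.at(rep, (2 * b + events["p"], events["y"], events["x"]), 1.0)
    return rep


# rep = to_histogram(slice_events(all_events, label_timestamp))
# np.save("sample_00001_events.npy", rep)   # cached once, reused at training time
```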

108360215 commented 4 months ago

@Orekishiro Have you checked the boxes in the corresponding img and label? After visualizing them I found the boxes are all offset. event_label

108360215 commented 4 months ago

But if I shift the label boxes toward the top-left corner, then during training the boxes also shift toward the top-left and the objects are missed.

Orekishiro commented 4 months ago

But if I shift the label boxes toward the top-left corner, then during training the boxes also shift toward the top-left and the objects are missed.

I have visualized them too, and the boxes line up for me. It may be a box-format issue: check whether your boxes use the top-left corner (x1, y1, w, h) or the box center (xc, yc, w, h). In your result the top-left corner seems to land exactly at the object center, so the problem may be there.

image
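A small helper for that check, in case the cached labels are still in top-left (x1, y1, w, h) form:

```python
import numpy as np


def tlwh_to_cxcywh(boxes):
    """(x1, y1, w, h) -> (xc, yc, w, h); boxes is an (N, 4) array."""
    out = boxes.copy().astype(np.float32)
    out[:, 0] += out[:, 2] / 2  # xc = x1 + w/2
    out[:, 1] += out[:, 3] / 2  # yc = y1 + h/2
    return out


# As far as I can tell, Gen1 annotations store x, y as the top-left corner, so a
# visualized "center" sitting on the object's top-left usually means this
# conversion was skipped before writing the YOLO-style labels.
```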

jsckdon commented 4 months ago

@Orekishiro Have you ever run into this tensor problem?

File "E:\EMS-YOLO-main\models\yolo.py", line 137, in forward
    input[i] = x
RuntimeError: expand(torch.cuda.FloatTensor{[2, 5, 3, 320, 320]}, size=[2, 5, 3, 320]): the number of sizes provided (4) must be greater or equal to the number of dimensions in the tensor (5)

I printed the shapes of x and input; at first they are normal, something like torch.Size([1, 3, 256, 256]) and torch.Size([3, 1, 3, 256, 256]), but as soon as training starts the error is thrown, and at that point the tensors become torch.Size([2, 5, 3, 320, 320]) and torch.Size([3, 2, 5, 3, 320]).

108360215 commented 4 months ago

@Orekishiro Thanks for your reply! May I ask whether you modified val.py? I found that quite a few parts seem to be missing.

Orekishiro commented 4 months ago

@Orekishiro Have you ever run into this tensor problem?

File "E:\EMS-YOLO-main\models\yolo.py", line 137, in forward
    input[i] = x
RuntimeError: expand(torch.cuda.FloatTensor{[2, 5, 3, 320, 320]}, size=[2, 5, 3, 320]): the number of sizes provided (4) must be greater or equal to the number of dimensions in the tensor (5)

I printed the shapes of x and input; at first they are normal, something like torch.Size([1, 3, 256, 256]) and torch.Size([3, 1, 3, 256, 256]), but as soon as training starts the error is thrown, and at that point the tensors become torch.Size([2, 5, 3, 320, 320]) and torch.Size([3, 2, 5, 3, 320]).

You can check whether the time_window parameter is set to the same value in yolo.py and common.py. In yolo.py's forward function, the input is replicated along the time dimension according to time_window, so its shape becomes (time_window, batch_size, C, H, W), which then matches the mem_update function in common.py. However, datasets_g1T.py actually returns a shape like the one in the image below, so I'm not sure how the authors ultimately trained on the Gen1 dataset; the uploaded yolo.py may be the version used for the COCO dataset.

image
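For what it's worth, the replication described above amounts to something like the sketch below (my own rewording, not the repo's exact forward code); if the dataloader already returns a time dimension, expanding again in yolo.py can produce the kind of 5-D shape mismatch reported above:

```python
import torch

time_window = 5  # should match the time_window used by mem_update in common.py


def replicate_over_time(x, time_window):
    """(batch, C, H, W) -> (time_window, batch, C, H, W) by repeating the same frame."""
    return x.unsqueeze(0).repeat(time_window, 1, 1, 1, 1)


x_img = torch.randn(2, 3, 320, 320)                   # static-image input (COCO-style)
print(replicate_over_time(x_img, time_window).shape)  # torch.Size([5, 2, 3, 320, 320])

# A Gen1 batch from datasets_g1T.py already carries a time dimension, so it should be
# permuted or assigned directly rather than being expanded a second time.
```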

Orekishiro commented 4 months ago

@Orekishiro Thanks for your reply! May I ask whether you modified val.py? I found that quite a few parts seem to be missing.

As far as I remember, I just commented out the import of DetectMultiBackend; I don't think I changed much else, but I don't remember the details.

jsckdon commented 4 months ago

DetectMultiBackend

Thank you very much for your reply, I'll keep experimenting.

jsckdon commented 4 months ago

@Orekishiro May I ask which initial weight file you used?

Orekishiro commented 4 months ago

@Orekishiro May I ask which initial weight file you used?

I ran Res10 without pretrained weights, and the later Res34 run also had none; loading COCO-pretrained weights might improve performance a bit.

jsckdon commented 4 months ago

@Orekishiro May I ask which initial weight file you used?

I ran Res10 without pretrained weights, and the later Res34 run also had none; loading COCO-pretrained weights might improve performance a bit.

Got it, thank you for your reply!

108360215 commented 4 months ago

@Orekishiro Thanks for your reply! May I ask whether you modified val.py? I found that quite a few parts seem to be missing.

As far as I remember, I just commented out the import of DetectMultiBackend; I don't think I changed much else, but I don't remember the details.

OK, understood! But I tried it: both my label format and training data come from the create_dataloader built on the output of give_g1_data.py, so they should match, yet for some reason mAP, P, and R are all 0. Online it's said this could be a CUDA/PyTorch version problem; may I ask which versions of these two you are using? Thanks.
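For comparing environments, the relevant versions can be printed directly:

```python
import torch

print(torch.__version__)               # PyTorch version
print(torch.version.cuda)              # CUDA version PyTorch was built against
print(torch.backends.cudnn.version())  # cuDNN version
print(torch.cuda.is_available())       # whether the GPU is actually visible
```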

108360215 commented 4 months ago

@Orekishiro Regarding def non_max_suppression, have you noticed that the output it returns is identical to the labels?

108360215 commented 3 weeks ago

@Orekishiro Hello! Are you currently running an SNN object detection project?