Some questions about sound_event_detection

chenxie95 / deeplearning_course_sjtu


Open Shentl opened 2 years ago

Shentl commented 2 years ago

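1. In sound_event_detection, does "implement the CRNN model according to the model framework" refer to the figure in the provided PDF?
2. If I follow that figure, it uses linear softmax, which requires inputs greater than 0, otherwise the BCE loss raises an error. Does that mean the Triple Threshold from "Towards duration robust weakly supervised sound event detection" is needed? A standard CRNN should be able to use softmax directly.
3. If I add Triple Threshold, ReLU, or something similar myself, does that still count as "implementing the CRNN model according to the model framework"?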

cantabile-kwok commented 2 years ago

I think you need to apply sigmoid first and then linear softmax.

Shentl commented 2 years ago

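Mainly, I am hesitant to add modules that are not in the figure. It comes back to point 3 above: if I add something myself, does it still count as "implementing the CRNN model according to the model framework"?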

wsntxxn commented 2 years ago

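1. Yes, it is that figure.
2. Linear softmax is a pooling over the frame-level outputs, so frame-level predictions must be available before linear softmax, which means applying sigmoid first.
3. Triple threshold is post-processing applied to the output and has nothing to do with the model. As point 2 explains, the model framework already implies applying sigmoid first. Implement this model first; afterwards you can modify it, for example by adding ReLU.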
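To make the sigmoid-then-pooling order concrete, here is a minimal sketch in plain Python (not the course code). It assumes the usual linear softmax definition, clip probability = sum of squared frame probabilities divided by the sum of frame probabilities, per class. The function names and the toy logits are made up for illustration; an actual model would do the same on tensors.

```python
import math

def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear_softmax_pool(frames, eps=1e-7):
    """Pool T frames x C classes into C clip-level values.

    For each class c: sum_t y_t(c)^2 / sum_t y_t(c).
    The result is guaranteed to lie in [0, 1] only if every y_t(c) does.
    """
    num_classes = len(frames[0])
    pooled = []
    for c in range(num_classes):
        column = [frame[c] for frame in frames]
        pooled.append(sum(v * v for v in column) / (sum(column) + eps))
    return pooled

# Hypothetical frame-level logits: 3 frames, 2 classes.
logits = [[2.0, -3.0], [1.5, -2.5], [-1.0, -3.5]]

# Sigmoid first, then pooling: valid clip-level probabilities.
frame_probs = [[sigmoid(v) for v in frame] for frame in logits]
clip_probs = linear_softmax_pool(frame_probs)

# Pooling raw logits instead: values fall outside [0, 1],
# which is what makes a BCE loss fail.
raw_pooled = linear_softmax_pool(logits)

print(clip_probs)
print(raw_pooled)
```

With sigmoid applied first, each pooled value is a weighted average of probabilities and stays in [0, 1]; pooling the raw logits does not have that guarantee.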

wsntxxn commented 2 years ago

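Additional note: the code already includes median-filter post-processing. If you are only implementing this model, you do not need to change the post-processing method.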
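For readers unfamiliar with median-filter post-processing, a rough sketch of the idea follows. This is not the repository's implementation: the function name, the 0.5 threshold, the window size of 3, and the edge-padding choice are all assumptions made for illustration.

```python
from statistics import median

def median_filter_1d(values, window=3):
    """Smooth a 1-D sequence with a sliding median (odd window size).

    Edges are padded by repeating the first and last value (an assumption;
    other padding choices are possible).
    """
    if not values:
        return []
    half = window // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(values))]

# Hypothetical frame-level probabilities for one event class.
probs = [0.1, 0.8, 0.2, 0.3, 0.9, 0.7, 0.6, 0.4, 0.8, 0.9, 0.2, 0.1]

# Binarize with an assumed threshold, then smooth the decisions.
binary = [1 if p > 0.5 else 0 for p in probs]
smoothed = median_filter_1d(binary, window=3)

print(binary)    # [0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
print(smoothed)  # [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
```

The filter removes the isolated one-frame detection and fills the one-frame gap, and it operates only on the model's output, which is why it can stay unchanged while the model itself is being implemented.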