huawei-noah / Efficient-Computing

Efficient computing methods developed by Huawei Noah's Ark Lab

[Gold-Yolo] Inject module #105

Closed lkikinn closed 10 months ago

lkikinn commented 1 year ago

Why is the Inject Module divided into two parts?

lose4578 commented 1 year ago

The split is the result of a trade-off between precision and speed.

lkikinn commented 1 year ago

As I understand it, some people say the act part can be viewed as an attention weight, and the module as a whole as a simple attention mechanism, like a * w + b. I don't know if that's right.

lose4578 commented 1 year ago

Yes, the implementation of inject is a variant of attention. The specific operators used to implement it can of course be changed to suit the actual situation; it is not limited to the attention structure.
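
For readers following along, the a * w + b reading discussed above can be sketched in a few lines of plain Python. This is only an illustration, not the repository's code: the function names `inject` and `hard_sigmoid` are hypothetical, features are flat lists instead of tensors, the projections that would produce each input are omitted, and the use of a hard sigmoid as the gate is an assumption.

```python
def hard_sigmoid(x):
    """Piecewise-linear sigmoid approximation: relu6(x + 3) / 6 (assumed gate)."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def inject(local_feat, global_act, global_embed):
    """Fuse features element-wise as a * w + b (hypothetical helper).

    local_feat   -> a: features from the local branch
    global_act   -> w: gate values from the global branch, before activation
    global_embed -> b: additive features from the global branch
    """
    if not (len(local_feat) == len(global_act) == len(global_embed)):
        raise ValueError("all inputs must have the same length")
    # The "act" part: squash the global signal into [0, 1] attention-like weights.
    weights = [hard_sigmoid(g) for g in global_act]
    # Scale the local features by the weights, then add the global features.
    return [a * w + b for a, w, b in zip(local_feat, weights, global_embed)]


print(inject([2.0, 4.0, -1.0], [3.0, -3.0, 0.0], [0.5, 0.5, 0.5]))
# -> [2.5, 0.5, 0.0]
```

A gate of 1 passes the local value through unchanged, a gate of 0 suppresses it, and the global embedding is added in either case. Swapping `hard_sigmoid` for another operator changes the cost and behaviour of the gate without changing the overall a * w + b shape.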