state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

Inefficiency in causal-conv1d during token generation? #238

Open llmexperiment opened 8 months ago

llmexperiment commented 8 months ago

Hi all, @tridao @albertfgu: do we have to use causal-conv1d during token generation? When generating one token at a time, the convolution essentially reduces to a dot product between the kernel weights and the 4 most recent inputs.
We can still use causal-conv1d for the conv1d during token generation, but it does unnecessary computation.
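
To make the claim concrete, here is a minimal pure-Python sketch for a single channel. It compares a full causal convolution over the whole sequence with a per-token update that keeps only the last K inputs and takes one dot product. The function names (`causal_conv1d_full`, `conv_step`) and the example numbers are hypothetical and chosen for illustration; this is not the causal-conv1d package's API or Mamba's actual code.

```python
def causal_conv1d_full(x, w, b=0.0):
    """Causal 1-D convolution of one channel over a whole sequence.

    x: list of inputs, w: kernel of width K (w[-1] multiplies the current input).
    The sequence is left-padded with K-1 zeros so y[t] depends only on x[:t+1].
    """
    K = len(w)
    padded = [0.0] * (K - 1) + list(x)
    return [b + sum(w[k] * padded[t + k] for k in range(K)) for t in range(len(x))]


def conv_step(state, x_t, w, b=0.0):
    """Single-token update: shift the rolling window and take one dot product.

    state: the previous K inputs, oldest first. Returns (y_t, new_state).
    """
    new_state = state[1:] + [x_t]  # drop the oldest input, append the newest
    y_t = b + sum(wk * sk for wk, sk in zip(w, new_state))
    return y_t, new_state


# Illustrative values only (kernel width 4, as in the question above).
x = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0]
w = [0.1, -0.2, 0.3, 0.4]

y_full = causal_conv1d_full(x, w)

state = [0.0] * len(w)  # empty history behaves like zero padding
y_steps = []
for x_t in x:
    y_t, state = conv_step(state, x_t, w)
    y_steps.append(y_t)

print(all(abs(a - c) < 1e-12 for a, c in zip(y_full, y_steps)))
```

Under these assumptions the two paths produce the same outputs, and the per-token path costs K multiply-adds per channel plus a window shift, regardless of how many tokens have been generated so far.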

tridao commented 8 months ago

causal-conv1d is optional, see here.

llmexperiment commented 8 months ago

causal-conv1d is optional, see here.

Thank you @tridao !

Here is my understanding of causal-conv1d; please let me know whether these points are correct: