Open HHHKKKHHH opened 8 months ago
I noticed that the convolution used in the original paper is a causal convolution, but I don't see a causal convolution implemented in this project. model.py uses an ordinary grouped convolution. Is this correct?
That part is here: https://github.com/johnma2006/mamba-minimal/blob/master/model.py#L185
For reference: https://github.com/Dao-AILab/causal-conv1d
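To illustrate how an ordinary (grouped) convolution can still be causal, here is a minimal pure-Python sketch. It assumes the common idiom of zero-padding by `k - 1` on both sides and then keeping only the first `L` outputs; whether the linked line does exactly this should be checked against the repo. The function names and the kernel values are hypothetical, and only one channel is shown, since a depthwise convolution treats every channel independently.

```python
def conv1d_padded(x, w):
    """Cross-correlate x with w after zero-padding len(w) - 1 on both sides.

    This is meant to mimic, for a single channel, what a 1D conv layer
    with padding = k - 1 would compute (an assumption for illustration).
    """
    k = len(w)
    padded = [0.0] * (k - 1) + list(x) + [0.0] * (k - 1)
    return [sum(w[j] * padded[t + j] for j in range(k))
            for t in range(len(padded) - k + 1)]


def causal_conv1d(x, w):
    """Keep only the first len(x) outputs, so output t sees x[t-k+1 .. t]."""
    return conv1d_padded(x, w)[:len(x)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [0.5, 0.25, 0.25]  # hypothetical kernel of width 3

y = causal_conv1d(x, w)

# Causality check: changing inputs at positions 3 and 4 must leave
# outputs 0..2 untouched.
x_future = x[:3] + [100.0, -100.0]
y_future = causal_conv1d(x_future, w)
assert y[:3] == y_future[:3]
```

The padded convolution yields `L + k - 1` outputs; the last `k - 1` of them are the ones that would look past the end of the sequence, so dropping them leaves a causal result.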