Open h-zhao1997 opened 2 months ago
Yes, this is a good idea. The conv1d implementation actually already supports taking in initial states and returning final states. We just haven't had time to wire everything together.
@tridao Thank you very much for your reply! I'm really looking forward to seeing this feature integrated!
Thank you for your outstanding work! I'm curious if you've thought about including an additional parameter in the mamba_split_conv1d_scan_combined function to accept an initial_conv_state. This could open up some intriguing applications, like treating initial_conv_state as a trainable parameter. I've observed that Mamba2 has already implemented the ability to pass initial_states for the SSM layer. In your opinion, would it be beneficial to adopt a similar strategy for the 1D convolution layer?
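To make the request concrete, here is a minimal pure-Python sketch of what an initial conv state means for a causal depthwise 1D convolution on a single channel. It is only a reference illustration, not the library's kernel: the function name `causal_conv1d_ref` and its arguments are hypothetical, and the state is assumed to hold the `K - 1` inputs that precede the current chunk (zeros when no state is given).

```python
def causal_conv1d_ref(x, weight, initial_state=None):
    """Reference causal 1D convolution for one channel (hypothetical helper).

    x:             list of L input values
    weight:        list of K filter taps; weight[-1] multiplies the current input
    initial_state: list of K - 1 values treated as the inputs preceding x
                   (assumed to default to zeros, i.e. ordinary zero padding)
    Returns (y, final_state), where final_state holds the last K - 1 inputs seen.
    """
    k = len(weight)
    if initial_state is None:
        initial_state = [0.0] * (k - 1)
    if len(initial_state) != k - 1:
        raise ValueError("initial_state must have len(weight) - 1 entries")

    # Prepend the state so every output position sees K inputs of history.
    padded = list(initial_state) + list(x)
    y = [
        sum(w * padded[t + j] for j, w in enumerate(weight))
        for t in range(len(x))
    ]
    # The final state is the trailing K - 1 entries of the padded sequence.
    final_state = padded[len(padded) - (k - 1):]
    return y, final_state


weight = [0.5, -1.0, 2.0, 1.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Process the whole sequence at once.
y_full, _ = causal_conv1d_ref(x, weight)

# Process it in two chunks, carrying the conv state across the boundary.
y_a, state = causal_conv1d_ref(x[:3], weight)
y_b, _ = causal_conv1d_ref(x[3:], weight, initial_state=state)
```

With the state carried over, `y_a + y_b` matches `y_full`, which is the property that makes chunked processing possible. Treating `initial_state` as a trainable parameter, as suggested above, would amount to learning those `K - 1` leading values instead of fixing them to zero.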