i404788 / s5-pytorch

PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
Mozilla Public License 2.0

Using S5 as an encoder #8

Closed: stevenwong closed this issue 12 hours ago

stevenwong commented 1 month ago

Hi there,

Thanks for publishing this library!

This is a question rather than an issue. I'm using $x_{t-k}, \ldots, x_t$ to predict $f(x_{t+2}, \ldots, x_{t+6})$. As an example, $f(\cdot)$ can be the mean of $x$ over $t+2$ to $t+6$. If I were to use an LSTM or a transformer, I would simply use it as an encoder: pass the sequence through, get the encoded representation, then pass it to a FF layer to predict y. How would you do this with S5?
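For reference, this is the encoder pattern being described, sketched with an LSTM; the layer sizes and the scalar head are illustrative assumptions, not part of this repo:

```python
import torch
import torch.nn as nn

class LSTMEncoderHead(nn.Module):
    """Illustrative pattern: encode x_{t-k..t}, predict a scalar y (e.g. the mean over t+2..t+6)."""
    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.encoder = nn.LSTM(d_in, d_hidden, batch_first=True)
        self.head = nn.Linear(d_hidden, 1)   # FF layer that predicts y

    def forward(self, x):                    # x: [batch, seq_len, d_in]
        out, _ = self.encoder(x)             # out: [batch, seq_len, d_hidden]
        return self.head(out[:, -1])         # last latent representation -> prediction
```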

Rather than passing the last latent representation to a FF layer, I could also get the SSM to predict $x_{t+6}$ and just let the FF layer fill in the blanks, since it's just an extrapolation anyway. How would you vary the sample frequency in the forward pass?

[Edit] Nvm. I think this functionality is exposed in S5.forward() [/Edit]

i404788 commented 1 month ago

Hey,

Yes, you can use the S5 module to get just the SSM and use it the same way as a transformer or LSTM/RNN (with return_state=True). Like an LSTM, S5 runs in only one direction (counting up in index along axis=-2), so if you want to predict a lower t it may help to reverse your sequence. If you want bidirectional processing you can set bidir=True (though state carrying will no longer work).
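A minimal sketch of that pattern, assuming the S5 constructor takes the model width and state width as in the README and that forward returns a tensor of shape [batch, seq_len, width]; the head and sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn
from s5 import S5

class S5EncoderHead(nn.Module):
    def __init__(self, d_model, d_state):
        super().__init__()
        self.encoder = S5(d_model, d_state)   # one-directional SSM over axis=-2
        self.head = nn.Linear(d_model, 1)     # FF layer that predicts y

    def forward(self, x):                     # x: [batch, seq_len, d_model]
        h = self.encoder(x)                   # [batch, seq_len, d_model]
        return self.head(h[:, -1])            # last latent representation, as with an LSTM

model = S5EncoderHead(d_model=32, d_state=32)
y_hat = model(torch.randn(2, 64, 32))         # [2, 1]
```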

For varying the sample frequency, you can check this issue for reference: https://github.com/i404788/s5-pytorch/issues/7. Note that it will take significantly more memory than the default constant step size.
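A rough sketch of what that usage could look like, assuming forward accepts a step_scale argument (a scalar by default, or a per-timestep tensor); the argument name and tensor shape here are assumptions, so check issue #7 and the source for the actual interface:

```python
import torch
from s5 import S5

model = S5(32, 32)
x = torch.randn(2, 64, 32)

# Constant step size (default behaviour, cheapest in memory)
y = model(x, step_scale=1.0)

# Assumed per-timestep step sizes, e.g. from irregular sampling intervals;
# this is the case that takes significantly more memory than the constant one.
# The shape [batch, seq_len] is an assumption.
dt = torch.rand(2, 64)
y = model(x, step_scale=dt)
```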