state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

Vec2Vec with Mamba? #276

Open stanleyshly opened 5 months ago

stanleyshly commented 5 months ago

I'm looking to accomplish some sort of vector-to-vector task with Mamba. Does any encoder-decoder architecture exist, or are there alternative approaches to this task using Mamba? Or is concatenating the output onto the input the only option?

tridao commented 5 months ago

There's no encoder-decoder afaik, you can try concatenation.
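For readers unfamiliar with the concatenation approach tridao mentions: with a decoder-only model like Mamba, a seq2seq task is usually trained by joining source and target into one sequence and masking the loss on the source positions. A minimal sketch (the token ids and `SEP` separator below are hypothetical placeholders, not part of the mamba package):

```python
SEP = 0  # hypothetical separator token id

def build_training_example(source_ids, target_ids):
    """Concatenate source and target into a single decoder-only sequence,
    returning the input ids and a mask selecting target positions."""
    input_ids = source_ids + [SEP] + target_ids
    # Source positions (and the separator) are excluded from the loss.
    loss_mask = [0] * (len(source_ids) + 1) + [1] * len(target_ids)
    return input_ids, loss_mask

ids, mask = build_training_example([11, 12, 13], [21, 22])
# ids  -> [11, 12, 13, 0, 21, 22]
# mask -> [0, 0, 0, 0, 1, 1]
```

The model then sees the whole concatenated sequence but is only trained to predict the target tokens, which is the downside stanleyshly raises next: the combined sequence length grows with both modalities.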

stanleyshly commented 5 months ago

That is unfortunate. I'm trying to do text-to-speech generation, so using concatenation would be less than ideal since the speech token sequences would be quite long. Would this be an issue?

I was thinking about using Mamba as a feature extractor and just using the feature embeddings in another model, but this still seems less than ideal.

Has there been any work done on non-autoregressive tasks with Mamba?

tridao commented 5 months ago

There is some work on "bidirectional" Mamba; you can search for it.
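The bidirectional variants in the literature (e.g. for vision) typically run one causal pass forward and one on the reversed sequence, then combine the two so every position attends to the whole input. A minimal sketch of that pattern, with the Mamba block replaced by a cumulative-mean stand-in (a real implementation would call an actual Mamba layer where `causal_op` appears):

```python
import numpy as np

def causal_op(x):
    # Stand-in for a causal SSM pass: each position sees only its prefix.
    # Here a running mean; in practice this would be a Mamba block.
    return np.cumsum(x, axis=0) / np.arange(1, len(x) + 1)[:, None]

def bidirectional(x):
    """Combine a forward causal pass with a backward pass on the
    flipped sequence, so every position sees the full input."""
    fwd = causal_op(x)
    bwd = np.flip(causal_op(np.flip(x, axis=0)), axis=0)
    return fwd + bwd

x = np.random.randn(8, 4)  # (seq_len, d_model)
y = bidirectional(x)       # same shape as x
```

Because the two directions are combined, the result is equivariant to reversing the sequence, which is what makes the block non-causal.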

stanleyshly commented 5 months ago

I see. Seems like bidirectional Mamba has mostly been applied to images.

Is it possible to extract the embeddings and tokenize them, using Mamba as a feature extractor?
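One common way to realize the "extract embeddings and tokenize them" idea, regardless of which backbone produces the hidden states, is to pool the per-position states and then discretize them with a vector-quantization codebook (nearest-neighbor lookup). A hedged sketch with random placeholder data standing in for Mamba hidden states:

```python
import numpy as np

def mean_pool(hidden_states):
    # Pool per-position hidden states (seq_len, d_model) into one embedding.
    return hidden_states.mean(axis=0)

def quantize(embeddings, codebook):
    """Map each embedding to the index of its nearest codebook vector,
    turning continuous features into discrete tokens."""
    # (n, 1, d) - (1, k, d) -> (n, k) squared distances
    dists = ((embeddings[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 8))  # 16 discrete "tokens", d_model = 8
feats = rng.standard_normal((5, 8))      # 5 pooled embeddings (placeholder)
tokens = quantize(feats, codebook)       # shape (5,), values in [0, 16)
```

In practice the codebook would be learned (as in VQ-VAE-style speech tokenizers) rather than random, but the lookup step is the same.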