X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model
MIT License
397 stars, 31 forks

Is it possible to handle speech front-end signal processing tasks? #88

Open zuowanbushiwo opened 1 month ago

zuowanbushiwo commented 1 month ago

🚀 The feature, motivation and pitch

For example: adaptive noise suppression, acoustic echo cancellation, and speech separation. Thanks!

Alternatives

No response

Additional context

No response

zszheng147 commented 1 month ago

We may support these tasks in the near future, but perhaps there are already off-the-shelf repositories that support them, such as Asteroid. You can check those first before we proceed with these tasks.

zuowanbushiwo commented 1 month ago

@zszheng147 Thanks! I think an LLM may give better results.

zszheng147 commented 1 month ago

Is there any theoretical support for your hypothesis? Could you share it with us? Perhaps you can take a look at our Spatial Audio Understanding. We believe an LLM can implicitly perform source separation, but that would require constructing a carefully designed instruction-tuning dataset.