X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model
MIT License

data filter mechanism #126

Open fclearner opened 1 month ago

fclearner commented 1 month ago

🚀 The feature, motivation and pitch

Hello there. I believe we need a data filtering mechanism to handle excessively long or short samples. Could you share a simple way to modify the code to do this?

fclearner commented 1 month ago

Maybe try the data filtering code from WeNet.

ddlBoJack commented 1 month ago

We only filter out audio longer than 30 s.
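
For readers looking for a starting point, here is a minimal sketch of a duration-based filter that drops both very short and very long samples. It is not the SLAM-LLM or WeNet implementation. The function name `filter_by_duration`, the field names `num_samples` and `sample_rate`, and the 0.5 s lower bound are all assumptions made for illustration; only the 30 s upper bound comes from this thread.

```python
from typing import Any, Dict, Iterable, Iterator


def filter_by_duration(
    samples: Iterable[Dict[str, Any]],
    min_seconds: float = 0.5,   # assumed lower bound, not stated in the thread
    max_seconds: float = 30.0,  # the 30 s limit mentioned in the thread
) -> Iterator[Dict[str, Any]]:
    """Yield only samples whose duration lies in [min_seconds, max_seconds]."""
    for sample in samples:
        # "num_samples" and "sample_rate" are hypothetical field names;
        # adapt them to whatever metadata your dataset actually stores.
        duration = sample["num_samples"] / sample["sample_rate"]
        if min_seconds <= duration <= max_seconds:
            yield sample


if __name__ == "__main__":
    data = [
        {"key": "too_short", "num_samples": 1_600, "sample_rate": 16_000},   # 0.1 s
        {"key": "ok", "num_samples": 80_000, "sample_rate": 16_000},         # 5 s
        {"key": "too_long", "num_samples": 640_000, "sample_rate": 16_000},  # 40 s
    ]
    print([s["key"] for s in filter_by_duration(data)])
```

Because the function is a generator, it can wrap any iterable of sample metadata before the audio itself is loaded, so rejected samples cost only a metadata lookup.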