cmusphinx / pocketsphinx

A small speech recognizer

Fast Language Detection without Transcribing #368

Closed fire17 closed 1 year ago

fire17 commented 1 year ago

Hi there :) I would like to know whether pocketsphinx can quickly detect which language is spoken in an audio file, without transcribing it. Something like:

lang = pocketsphinx.detect_lang(audiofile_path)
print(lang) # Output: "iw-IL"

I am getting audio recordings from international users, and I don't know in advance which language they speak, so I would like a quick way to detect it.

Please let me know if pocketsphinx does this or if you have any other idea that I could try. Thanks a lot and all the best!

dhdaines commented 1 year ago

No, this functionality does not exist in pocketsphinx. Perhaps you know of another library that is designed for this? Probably there is a model on HuggingFace which can help...
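
For illustration only, here is a minimal sketch of the HuggingFace route, not something pocketsphinx provides. It assumes the `transformers` library with its `audio-classification` pipeline, and it assumes `facebook/mms-lid-126` as the spoken-language-identification model (any similar model could be swapped in). The helper names `detect_lang` and `load_hf_classifier` are hypothetical. The labels come from the model, so they may be codes such as `heb` rather than a locale like `iw-IL`.

```python
from typing import Callable, Dict, List


def load_hf_classifier(model_name: str = "facebook/mms-lid-126"):
    """Build a HuggingFace audio-classification pipeline.

    Assumption: `transformers` (plus a backend such as PyTorch, and ffmpeg
    for decoding audio files) is installed. The model name is an assumed
    choice, not something recommended in this thread.
    """
    # Imported lazily so the rest of this sketch works without transformers.
    from transformers import pipeline

    return pipeline("audio-classification", model=model_name)


def detect_lang(audio_path: str, classifier: Callable[[str], List[Dict]]) -> str:
    """Return the highest-scoring language label for one audio file.

    `classifier` is any callable that maps a file path to a list of
    {"label": ..., "score": ...} dicts, which is the shape the
    audio-classification pipeline returns for a single input.
    """
    results = classifier(audio_path)
    if not results:
        raise ValueError(f"classifier returned no predictions for {audio_path!r}")
    # Pick the prediction with the largest score.
    best = max(results, key=lambda r: r["score"])
    return best["label"]


# Example usage (downloads the model on first run):
#   classifier = load_hf_classifier()
#   print(detect_lang("recording.wav", classifier))
```

Passing the classifier in as an argument means the model is loaded once and reused across many recordings, which matters more for speed than anything else in this sketch.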