abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0

return info object to obtain language in faster-whisper #75

Closed: fmorett closed this issue 9 months ago

fmorett commented 1 year ago

Yes, I know. It would be great, though, to get some info from the underlying models.

For us, the main reason would be to get the automatically recognized language out of the model.

abdeladim-s commented 1 year ago

Yes, I get your point. It would be great to have, but the problem is that not all models provide this underlying info.

If you just need the faster-whisper info and you are using it from Python, you always have the option to call the model's own transcribe function directly and customize it to your needs, without breaking the entire API, like so:

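from subsai import SubsAI

subs_ai = SubsAI()
file = "../assets/audio/test0.mp3"
model = subs_ai.create_model('guillaumekln/faster-whisper', {'model_type': 'base'})
# Call faster-whisper's own transcribe() directly instead of going through SubsAI
segments, info = model.model.transcribe(file, **model.transcribe_configs)
# info carries the metadata that faster-whisper reports for this run
print(info)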

You can then get the automatically recognized language from the info object.

Have you tried this?
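
To make the last step concrete, here is a minimal sketch of reading the detected language off the info object. It assumes the object exposes `language` and `language_probability` attributes, as faster-whisper's transcription info does in the versions I am aware of; if your version differs, the helper simply returns None for the missing field. The names `summarize_info` and `detect_language` are hypothetical helpers for illustration, not part of subsai.

```python
def summarize_info(info):
    """Return (language, probability) from a faster-whisper info object.

    Assumption: the object has `language` and `language_probability`
    attributes. Missing attributes come back as None instead of raising.
    """
    language = getattr(info, "language", None)
    probability = getattr(info, "language_probability", None)
    return language, probability


def detect_language(media_file, model_type="base"):
    """Hypothetical wrapper: run faster-whisper via subsai and report the language."""
    # Imported lazily so summarize_info() stays usable without subsai installed.
    from subsai import SubsAI

    subs_ai = SubsAI()
    model = subs_ai.create_model(
        "guillaumekln/faster-whisper", {"model_type": model_type}
    )
    # Same direct call as in the snippet above: bypass the SubsAI wrapper
    # and use the underlying model's own transcribe().
    segments, info = model.model.transcribe(media_file, **model.transcribe_configs)
    return summarize_info(info)
```

Calling `detect_language("../assets/audio/test0.mp3")` should then give you the language code together with its probability, without touching the rest of the SubsAI API.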