NavodPeiris / speechlib

speechlib is a library that performs speaker diarization, transcription, and speaker recognition on an audio file to create transcripts with actual speaker names.
MIT License

Keep an old branch without faster whisper for AMD, Intel, and other users? #31

Closed: tomich closed 1 month ago

tomich commented 1 month ago

On my AMD, Intel, and Raspberry Pi GPUs I can run GPU-accelerated whisper and GPU-accelerated pyannote. On AMD specifically, this works by installing pytorch-rocm and setting whisper and pyannote to use the cuda device.
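The AMD setup described above relies on ROCm builds of PyTorch exposing AMD GPUs through the regular torch.cuda API. A minimal sketch of that device selection follows; pick_device and backend_name are hypothetical helper names, not part of speechlib, whisper, or pyannote, and the code falls back to CPU when PyTorch is missing.

```python
# Sketch: choose a torch device string that also works on ROCm builds of PyTorch.
# pick_device() and backend_name() are hypothetical helpers for illustration.


def pick_device() -> str:
    """Return "cuda" when a GPU is visible to PyTorch, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU is available.
        return "cpu"
    # ROCm builds report AMD GPUs through torch.cuda, so the device
    # string stays "cuda" even on AMD hardware.
    return "cuda" if torch.cuda.is_available() else "cpu"


def backend_name() -> str:
    """Describe which GPU backend the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "none"
    # torch.version.hip is set on ROCm builds and is None otherwise.
    if getattr(torch.version, "hip", None):
        return "rocm"
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu"


device = pick_device()
print(f"device={device}, backend={backend_name()}")
```

The resulting string can then be passed wherever a device is accepted, for example whisper.load_model("medium", device=device).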

But I cannot use faster whisper, because ctranslate2 does not work on AMD.

Would it be possible to have a branch without faster whisper for AMD (and other) users?

NavodPeiris commented 1 month ago

You can install version 1.0.7, which predates the addition of faster whisper.

pip install speechlib==1.0.7
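To confirm that the pinned release is the one actually installed in the active environment, a small check with the standard library can help. This is only a sketch: installed_speechlib_version is a hypothetical helper name, and the expected version string is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = "1.0.7"  # the release pinned by the pip command above


def installed_speechlib_version():
    """Return the installed speechlib version string, or None if it is absent."""
    try:
        return version("speechlib")
    except PackageNotFoundError:
        return None


found = installed_speechlib_version()
if found is None:
    print("speechlib is not installed in this environment")
elif found != EXPECTED:
    print(f"speechlib {found} is installed, expected {EXPECTED}")
else:
    print(f"speechlib {found} is installed")
```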

Usage of the old API:

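from speechlib import Transcriptor

file = "obama_zach.wav"    # audio file to transcribe
voices_folder = "voices"   # folder of speaker voice samples used for speaker recognition
language = "english"       # language name, not a language code
log_folder = "logs"        # folder where the transcript is stored
modelSize = "medium"       # whisper model size

transcriptor = Transcriptor(file, log_folder, language, modelSize, voices_folder)

res = transcriptor.transcribe()

print("res", res)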

Note: that version has no quantization, and the language is specified by its name (for example "english") rather than by its language code.
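If existing scripts already pass language codes, a small translation step can bridge the difference noted above. The sketch below is an assumption-heavy illustration: language_name and CODE_TO_NAME are hypothetical, the table covers only a few languages, and it assumes the old API expects lowercase names like the "english" used in the example.

```python
# Hypothetical helper: translate a language code into the lowercase language
# name that the old API example uses. The table is deliberately partial.
CODE_TO_NAME = {
    "en": "english",
    "es": "spanish",
    "fr": "french",
    "de": "german",
}


def language_name(code_or_name: str) -> str:
    """Accept either a known code or a known name and return the name."""
    key = code_or_name.strip().lower()
    if key in CODE_TO_NAME:
        return CODE_TO_NAME[key]
    if key in CODE_TO_NAME.values():
        return key
    raise ValueError(f"unknown language: {code_or_name!r}")


print(language_name("en"))
```

The returned name could then be passed as the language argument when constructing Transcriptor in version 1.0.7.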

tomich commented 1 month ago

Thanks