Open Tortoise17 opened 1 year ago
Greetings, my fellow Slovak-language user. To use that model, you just have to add it to the whisperx/alignment.py file before you install the package locally.
In other (English) words:
git clone https://github.com/m-bain/whisperX.git
Edit whisperX/whisperx/alignment.py and add your model to DEFAULT_ALIGN_MODELS_HF: "sk": "infinitejoy/wav2vec2-large-xls-r-300m-slovak"
pip install -e whisperX
apt update && apt install ffmpeg (use sudo if required)
pip install setuptools-rust
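For illustration, the dictionary edit in the steps above can be sketched as follows. This is a minimal stand-alone sketch, not the actual contents of whisperx/alignment.py: the existing entries are elided, and `resolve_align_model` is a hypothetical helper that only mimics the lookup to show why an unregistered language code fails.

```python
# Stand-alone sketch. In the real file you only add the "sk" line to the
# existing DEFAULT_ALIGN_MODELS_HF dictionary and leave other entries alone.
DEFAULT_ALIGN_MODELS_HF = {
    # ... existing language entries ...
    "sk": "infinitejoy/wav2vec2-large-xls-r-300m-slovak",
}


def resolve_align_model(language_code: str) -> str:
    """Hypothetical helper: return the Hugging Face model name for a language."""
    try:
        return DEFAULT_ALIGN_MODELS_HF[language_code]
    except KeyError:
        # Mimics the failure you would see for a language code that is not registered.
        raise ValueError(
            f"No default alignment model for language: {language_code}"
        ) from None


print(resolve_align_model("sk"))
```

If the "sk" key is missing, the lookup has nothing to return, which is why the entry has to be added before the package is installed.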
import whisperx
device = "cuda"
audio_file = "dobre_rano_prvym_rozsudkom_sa_nic_nekonci_imrecze_moze_skoncit_aj_za_mrezami_10_2_2023.mp3"
# Transcribe with the original Whisper using the large model
model = whisperx.load_model("large", device)
result = model.transcribe(audio_file, verbose=False) # quickly get some popcorn, it will take ~10 min
# Print me some segments so we can prove in a minute if the model works
for segment in result["segments"]:
    print(segment)
# Now comes the important part. Set the language code to 'sk' as used in the whisperx/alignment.py file (you would get an error if the language is not known)
model_a, metadata = whisperx.load_align_model(language_code="sk", device=device)
# Run the alignment. This one is rather fast, no time to get additional chips
result_aligned = whisperx.align(result["segments"], model_a, metadata, audio_file, device)
# Format the output to similar result as for whisper segments
for segment in result_aligned["segments"]:
    print(f"'start': {segment['start']}, 'end': {segment['end']}, 'text': {segment['text']}")
I am not suggesting this is a good wav2vec2 model to use; these are only instructions for how to use it in WhisperX. Try it yourself and you will see.
@Tortoise17 any results? Was this a good wav2vec2 model? I will add it to the defaults if you found it successful.
Dear friends, can this model be used for Slovak alignment?
https://huggingface.co/infinitejoy/wav2vec2-large-xls-r-300m-slovak
The confusing point for me is the labels: where do the labels come from if this model is used? Please guide me if you can.
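On the labels question: as far as I can tell, for Hugging Face wav2vec2 models WhisperX does not need a separate label file. It takes the labels from the model's own tokenizer vocabulary, that is, the symbol-to-id mapping shipped with the model. The sketch below shows the idea on a toy vocabulary. The vocabulary contents and the `build_align_dictionary` name are assumptions made for illustration, not the real vocabulary of the Slovak model or the library's code.

```python
# Toy stand-in for what a wav2vec2 CTC tokenizer's get_vocab() returns:
# a mapping from output symbol to class index. The values here are made up.
toy_vocab = {"<pad>": 0, "|": 1, "A": 2, "Á": 3, "Č": 4, "O": 5, "R": 6}


def build_align_dictionary(vocab: dict[str, int]) -> dict[str, int]:
    """Lower-case every symbol so transcript characters can be looked up."""
    return {symbol.lower(): index for symbol, index in vocab.items()}


align_dictionary = build_align_dictionary(toy_vocab)

# Map a lower-cased transcript to label indices. In this toy vocabulary "|"
# marks a word boundary; characters the model does not know are skipped.
text = "čo ráno"
indices = [
    align_dictionary[ch]
    for ch in text.replace(" ", "|")
    if ch in align_dictionary
]
print(indices)
```

With the real model, the same mapping would presumably come from `Wav2Vec2Processor.from_pretrained(...).tokenizer.get_vocab()` in the transformers library, so a model is usable for alignment only if its vocabulary covers the characters of your language.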