jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License

ValueError: This model doesn't have language tokens so it can't perform lang id #229

Closed: mytricker0 closed this issue 1 year ago

mytricker0 commented 1 year ago

I get this error when running this:

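import torch
import stable_whisper
from moviepy.editor import VideoFileClip  # moviepy 1.x import path

devices = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = stable_whisper.load_model('base', device=devices)
video = VideoFileClip(input)  # `input` holds the path to the source video, defined elsewhere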

# Extract the audio
audio = video.audio

# Write the audio to an MP3 file
audio.write_audiofile("output_audio.mp3")

# Close the audio file (if you're done with it)
audio.close()

result = model.transcribe('output_audio.mp3')
result.split_by_length(max_words=words)  # `words` (max words per segment) is defined elsewhere
result.to_srt_vtt('audio.srt')

mytricker0 commented 1 year ago

Actually, I just had to run:

pip install -U git+https://github.com/jianfch/stable-ts.git

Then I got this error:

OSError: libcudart.so.11.0: cannot open shared object file: No such file or directory

which was solved by running:

conda install cudatoolkit

That led to this error:

undefined symbol: _ZN2at4_ops6conv1d4callERKNS_6TensorES4_RKN3c108optionalIS2_EENS5_8ArrayRefIlEESB_SB_l

which was solved by running:

pip install -U torch torchaudio --no-cache-dir
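For anyone hitting the same chain of errors, here is a minimal diagnostic sketch, not something from the stable-ts docs. It assumes the undefined-symbol error came from torch and torchaudio builds that did not match each other, which is a common cause but is not confirmed in this thread. The helper names `installed_versions` and `can_load_library` are made up for illustration; the library name is copied from the OSError above.

```python
import ctypes
from importlib import metadata


def installed_versions(packages=("torch", "torchaudio", "stable-ts")):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def can_load_library(name):
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Compare the torch and torchaudio versions by eye; they should come
    # from the same release pair.
    for package, version in installed_versions().items():
        print(f"{package}: {version or 'not installed'}")
    # Library name taken from the OSError reported above.
    print("libcudart.so.11.0 loadable:", can_load_library("libcudart.so.11.0"))
```

Running this before and after each install step shows whether the CUDA runtime is visible to the loader and whether the installed torch and torchaudio versions look like a matching pair.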