dmarx / video-killed-the-radio-star

Notebook and tools for end-to-end automation of music video production with generative AI
https://colab.research.google.com/github/dmarx/video-killed-the-radio-star/blob/main/Video_Killed_The_Radio_Star_Defusion.ipynb#scrollTo=oPbeyWtesAoh
MIT License
196 stars, 35 forks

Getting a ConfigAttributeError when running "Infer speech from audio" in Colab #112

Open garphillips opened 1 year ago

garphillips commented 1 year ago

I've followed all the steps up to that point: I created a key in Hugging Face, added a YouTube link, gave it a name, and specified all the settings in "Infer speech from audio". At runtime I then get the following error:

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
 in
    237     #return tiny2large, large2tiny, whispers_tokens
    238
--> 239 token_large_index_segmentations = whisper_transmit_meta_across_alignment(
    240     whispers,
    241     large2tiny,

/usr/local/lib/python3.8/dist-packages/vktrs/asr.py in whisper_transmit_meta_across_alignment(whispers, large2tiny, whispers_tokens)
     93         rec_large = {'token':whispers_tokens['large'][i]}
     94         for j in result:
---> 95             rec_tiny = token_tinyindex_segmentations[j]
     96             if not rec_large.get('start'):
     97                 rec_large['start'] = rec_tiny['start']

KeyError: 16
```
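
As far as the traceback shows, the failing line is a dictionary-style lookup, so the error means index 16 is not a key in `token_tinyindex_segmentations`. Below is a minimal sketch of that failure mode, using made-up data. The dictionary contents and the `lookup_segments` helper are hypothetical and not part of vktrs; they only illustrate how a missing index can be detected instead of raising.

```python
# Hypothetical stand-in for the mapping used in vktrs/asr.py.
# The real structure is built elsewhere and may differ.
token_tinyindex_segmentations = {
    14: {"start": 3.2, "end": 3.6},
    15: {"start": 3.6, "end": 4.0},
}


def lookup_segments(indices, segmentations):
    """Split indices into records that exist and indices that are missing."""
    found, missing = [], []
    for j in indices:
        rec = segmentations.get(j)  # .get returns None instead of raising KeyError
        if rec is None:
            missing.append(j)
        else:
            found.append(rec)
    return found, missing


# A direct lookup such as token_tinyindex_segmentations[16] would raise KeyError: 16.
found, missing = lookup_segments([15, 16], token_tinyindex_segmentations)
print("missing indices:", missing)
```

Printing the missing indices this way would at least show whether the alignment refers to tokens that were never segmented, which may help narrow down whether the input audio or the settings are involved.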

Can someone tell me what I'm doing wrong?

garphillips commented 1 year ago

@dmarx do you have any recommendations?