Wordcab / wordcab-transcribe

💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.
https://wordcab.github.io/wordcab-transcribe/
MIT License

Failed inference with YouTube #314

Open WenqiLiao opened 6 days ago

WenqiLiao commented 6 days ago

When I test the API locally by fetching a YouTube video by URL and include vocab in the request body, such as:

{
  "compression_ratio_threshold": 2.4,
  "condition_on_previous_text": true,
  "diarization": false,
  "internal_vad": false,
  "log_prob_threshold": -1,
  "no_speech_threshold": 0.6,
  "num_speakers": -1,
  "repetition_penalty": 1.2,
  "source_lang": "en",
  "timestamps": "s",
  "vocab": [
    "custom company name",
    "custom product name",
    "custom co-worker name"
  ],
  "word_timestamps": false
}

the server throws the following errors:

yt_dlp.utils.DownloadError: ERROR: unable to download video data: HTTP Error 403: Forbidden

or

OverflowError: out of range integral type conversion attempted

aleksandr-smechov commented 5 days ago

@WenqiLiao does vocabulary work for you when using the YouTube endpoint? This one needs to work both with and without vocab.
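One way to narrow this down is to send the same request twice, once with `vocab` and once without, and compare the results. The sketch below only builds the two requests using the standard library. The base URL, port, endpoint path, and the `url` query parameter are assumptions that may not match your deployment, and `build_request` is a hypothetical helper, not part of wordcab-transcribe.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: local server address and endpoint path; adjust to your setup.
BASE_URL = "http://localhost:5001/api/v1/youtube"

# Request body from the report above, minus the vocab field.
BASE_BODY = {
    "compression_ratio_threshold": 2.4,
    "condition_on_previous_text": True,
    "diarization": False,
    "internal_vad": False,
    "log_prob_threshold": -1,
    "no_speech_threshold": 0.6,
    "num_speakers": -1,
    "repetition_penalty": 1.2,
    "source_lang": "en",
    "timestamps": "s",
    "word_timestamps": False,
}


def build_request(video_url: str, vocab: Optional[List[str]] = None) -> Request:
    """Build a POST request, adding vocab to the body only when it is given."""
    body = dict(BASE_BODY)
    if vocab:
        body["vocab"] = vocab
    # Assumption: the video URL is passed as a `url` query parameter.
    query = urlencode({"url": video_url})
    return Request(
        f"{BASE_URL}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


VIDEO = "https://www.youtube.com/watch?v=VIDEO_ID"  # placeholder video ID
with_vocab = build_request(VIDEO, vocab=["custom company name"])
without_vocab = build_request(VIDEO)
```

Sending each request with `urllib.request.urlopen` and comparing the outcomes should show whether the failure depends on `vocab` at all. If both fail with the 403, the download step is the more likely cause.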

hobodrifterdavid commented 5 days ago

YouTube has added more protections against downloading media in the last few weeks. I would guess this is related; try checking the yt_dlp repo issues.

aleksandr-smechov commented 4 days ago

Thanks for the tip @hobodrifterdavid