pluja / whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!
https://whishper.net
GNU Affero General Public License v3.0

YouTube transcription not possible: Bot protection #112

Open liferadioat opened 1 month ago

liferadioat commented 1 month ago

Description

I am trying to transcribe a YouTube video by URL, but the download fails.

Environment

Logs and Configuration

Docker Compose Logs

Run the following command in the project folder, force the error, and paste the logs below: docker compose logs -f --tail 50

12:27PM ERR Error downloading media error="[youtube] KUdcTGQvhKI: Sign in to confirm you’re not a bot. This helps protect our community. Learn more"
12:27PM ERR Error transcribing error="[youtube] KUdcTGQvhKI: Sign in to confirm you’re not a bot. This helps protect our community. Learn more"

Docker Compose File

version: "3.9"

services:
  mongo:
    image: mongo
    env_file:
      - .env
    restart: unless-stopped
    volumes:
      - ./whishper_data/db_data:/data/db
      - ./whishper_data/db_data/logs/:/var/log/mongodb/
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${DB_USER:-whishper}
      MONGO_INITDB_ROOT_PASSWORD: ${DB_PASS:-whishper}
    expose:
      - 27017
    command: ['--logpath', '/var/log/mongodb/mongod.log']

  translate:
    container_name: whisper-libretranslate
    image: libretranslate/libretranslate:latest-cuda
    restart: unless-stopped
    volumes:
      - ./whishper_data/libretranslate/data:/home/libretranslate/.local/share
      - ./whishper_data/libretranslate/cache:/home/libretranslate/.local/cache
    env_file:
      - .env
    user: root
    tty: true
    environment:
      LT_DISABLE_WEB_UI: True
      LT_LOAD_ONLY: ${LT_LOAD_ONLY:-en,fr,es}
      LT_UPDATE_MODELS: True
    expose:
      - 5000
    networks:
      default:
        aliases:
          - translate
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

  whishper:
    pull_policy: always
    image: pluja/whishper:${WHISHPER_VERSION:-latest-gpu}
    env_file:
      - .env
    volumes:
      - ./whishper_data/uploads:/app/uploads
      - ./whishper_data/logs:/var/log/whishper
    container_name: whishper
    restart: unless-stopped
    networks:
      default:
        aliases:
          - whishper
    ports:
      - 8082:80
    depends_on:
      - mongo
      - translate
    environment:
      PUBLIC_INTERNAL_API_HOST: "http://127.0.0.1:80"
      PUBLIC_TRANSLATION_API_HOST: ""
      PUBLIC_API_HOST: ${WHISHPER_HOST:-}
      PUBLIC_WHISHPER_PROFILE: gpu
      WHISPER_MODELS_DIR: /app/models
      UPLOAD_DIR: /app/uploads
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

abdessalaam commented 1 month ago

I got this error too. Have you found a solution? "[youtube] BNIVH4cnT58: Sign in to confirm you’re not a bot. This helps protect our community. Learn more"

liferadioat commented 1 month ago

Unfortunately not. Still waiting for a response from the maintainer :)

Stinosko commented 3 weeks ago

This issue is the result of Google blocking third-party tools from downloading videos and has nothing to do with this project. The yt-dlp project has a discussion about it.

The short answer is to use a VPN or proxy to work around the block, and to limit the number of downloads you make so that you don't trigger the detection on Google's servers.
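For anyone who wants to try that outside of whishper first, here is a minimal Python sketch of both ideas: sending yt-dlp through a proxy and spacing downloads out. It only builds a standalone yt-dlp command line and is not how whishper invokes the downloader internally. The helper names (`build_ytdlp_command`, `throttled`), the proxy address and the sleep values are assumptions for illustration. The flags `--proxy`, `--sleep-interval` and `--max-sleep-interval` are regular yt-dlp options.

```python
import shlex
import time
from typing import Callable, Iterable, List, Optional


def build_ytdlp_command(url: str, proxy: Optional[str] = None,
                        min_sleep: int = 5, max_sleep: int = 30) -> List[str]:
    """Build a yt-dlp command that pauses before each download and
    optionally routes traffic through a proxy (hypothetical helper)."""
    cmd = ["yt-dlp",
           "--sleep-interval", str(min_sleep),       # minimum pause before each download
           "--max-sleep-interval", str(max_sleep)]   # upper bound for the randomized pause
    if proxy:
        cmd += ["--proxy", proxy]
    cmd.append(url)
    return cmd


def throttled(urls: Iterable[str], min_gap_seconds: float,
              handle: Callable[[str], None],
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> None:
    """Call handle(url) for each URL, leaving at least min_gap_seconds
    between the start of consecutive calls."""
    last_start = None
    for url in urls:
        if last_start is not None:
            wait = min_gap_seconds - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        handle(url)


if __name__ == "__main__":
    # The proxy address is a placeholder; replace it with your own VPN or proxy endpoint.
    cmd = build_ytdlp_command("https://www.youtube.com/watch?v=KUdcTGQvhKI",
                              proxy="socks5://127.0.0.1:1080")
    print(shlex.join(cmd))
```

If the printed command downloads the video when run by hand through the proxy, the block is tied to your IP address rather than to whishper. Neither step is guaranteed to get past the bot check.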