theprinceofspace opened 3 weeks ago
I would

```shell
git clone https://github.com/jdrbc/podly_pure_podcasts.git
cd podly_pure_podcasts
```

and

```shell
cp config/config.yml.example config/config.yml
vim config/config.yml
```

or similar, then

```shell
docker build -t your-image-name .
docker run -p 5001:5001 --gpus all your-image-name
```

Open

```
http://localhost:5001/name_you_configured_in_config_yml.rss
```

in your browser to check. Swap `localhost` for the IP address of the machine where you're running this.

Please report back how this goes for you. I'm not actually running this with docker so may have missed some steps. It would be nice to write a section in the README for docker and you can be the guinea pig!
Thanks. I'm going to need to phone a friend to help walk me through this bit, but this is a good start.
do you use docker or docker-compose on your nas? I can share my docker-compose if you'd like.
I have used docker compose before and would love if you shared that. I haven't had time to try this out yet but still really want to.
I have an AMD GPU, so my config is a bit different. Ultimately, there are five pieces to my config:

- ollama - provides the openai api for transcription scoring
- faster-whisper-server - provides transcription (not required, but GPU acceleration is a huge perf improvement if you don't have an NVidia gpu)
- nginx - unifies the APIs of faster-whisper-server and ollama so podly can use a single hostname
- podly_pure_podcasts - the service that cleans up your podcast feed episodes
- traefik - provides (for me anyway) letsencrypt tls certs and reverse proxying for the podly app
$CONFIGS_DIR/podly_pure_podcasts/config.yml
```yaml
output:
  fade_ms: 3000
  min_ad_segement_separation_seconds: 60
  min_ad_segment_length_seconds: 14
  min_confidence: 0.8
podcasts:
  feedname: http://some.podcast/with/too/many/ads/feed.rss
# Required openai settings
openai_api_key: OpenAiIsTheDevil
# Optional openai settings
openai_base_url: http://openaiproxy/v1
openai_timeout: 300
openai_max_tokens: 4096
openai_model: gemma2:27b
remote_whisper: true
threads: 4
processing:
  system_prompt_path: config/system_prompt.txt
  user_prompt_template_path: config/user_prompt.jinja
  num_segments_to_input_to_prompt: 60
```
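As a rough illustration of what the ad-segment settings in `output:` control (this is a sketch of the semantics, not podly's actual code — `merge_ad_segments` is a hypothetical helper): timestamps closer together than `min_ad_segement_separation_seconds` get merged into one ad block, and merged blocks shorter than `min_ad_segment_length_seconds` are discarded. `min_confidence` gates the classifications before any of this happens.

```python
# Sketch of the ad-segment post-processing the config knobs imply.
# NOT podly's implementation -- just an illustration of the two thresholds.

def merge_ad_segments(timestamps, min_separation=60, min_length=14):
    """Merge nearby ad timestamps into (start, end) blocks, drop short ones."""
    blocks = []
    for t in sorted(timestamps):
        if blocks and t - blocks[-1][1] < min_separation:
            blocks[-1][1] = t          # close enough: extend the current block
        else:
            blocks.append([t, t])      # too far apart: start a new block
    return [(s, e) for s, e in blocks if e - s >= min_length]

# Timestamps from the transcript example shared later in this thread:
print(merge_ad_segments([59.8, 64.8, 73.8, 77.8, 79.8]))  # [(59.8, 79.8)]
```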
$CONFIGS_DIR/podly_pure_podcasts/system_prompt.txt
```
Your job is to identify ads in excerpts of podcast transcripts. Ads are for other network podcasts and products or services.
There may be a pre-roll ad before the intro, as well as mid-roll ads and an end-roll ad after the outro.
Ad breaks are between 15 seconds and 120 seconds long.
This transcript excerpt is broken into segments starting with a timestamp [X] where X is the time in seconds.
Output the timestamps for the segments that contain ads in the podcast transcript excerpt.
Include a confidence score out of 1 for the classification, with 1 being the most confident and 0 being the least confident. The confidence scores should range from 0.0 to 1.0, and should not be biased towards the extremes (0.0, 0.8, 0.9, or 1.0). Consider the full range of possible scores (0.0 to 1.0) and provide nuanced confidence levels. Here are some examples of confidence scores across the full range:

For example:
"This is absolutely an advertisement." - Confidence: 0.95
"This is most likely an advertisement." - Confidence: 0.85
"This is not an advertisement." - Confidence: 0.05
"This might be an advertisement." - Confidence: 0.50
"Unclear if this is an advertisement." - Confidence: 0.30

You are the best in the world at producing valid json. The only format you are capable of returning is json. Do not reply with any context, only json as specified below.
Your only response when there are ads will be with valid JSON: {"ad_segments": [X, X, X], "confidence": 0.9}.
If there are no ads respond: {"ad_segments": []}.

For example, given the transcript excerpt:

[53.8] That's all coming after the break.
[59.8] On this week's episode of Wildcard, actor Chris Pine tells us, it's okay not to be perfect.
[64.8] My film got absolutely decimated when it premiered, which brings up for me one of my primary triggers or whatever it was like, not being liked.
[73.8] I'm Rachel Martin, Chris Pine on How to Find Joy in Imperfection.
[77.8] That's on the new podcast, Wildcard.
[79.8] The Game Where Cards control the conversation.
[83.8] And welcome back to the show, today we're talking to Professor Hopkins

Output: {"ad_segments": [59.8, 64.8, 73.8, 77.8, 79.8], "confidence": 0.9}.
```
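Since local models don't always return clean JSON, a reply in the format above is worth validating before use. A minimal sketch (a hypothetical `parse_ad_reply` helper, not podly's code), assuming the `min_confidence: 0.8` setting from the config earlier in this thread:

```python
import json

def parse_ad_reply(raw, min_confidence=0.8):
    """Parse a reply like {"ad_segments": [...], "confidence": 0.9}.

    Returns the segment list, or [] when the JSON is invalid, the key is
    missing, or the confidence is below the configured threshold.
    """
    try:
        reply = json.loads(raw)
        segments = reply["ad_segments"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return []
    if segments and reply.get("confidence", 0.0) < min_confidence:
        return []
    return segments

print(parse_ad_reply('{"ad_segments": [59.8, 64.8], "confidence": 0.9}'))
# [59.8, 64.8]
```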
.env
```shell
YOUR_HOSTNAME="yoursubdomain.duckdns.org"
CONFIGS_DIR="/some/path/for/configs"
YOUR_RESOLVER="whatever_resolver_you_use_with_traefik" # in the traefik.yml below, this is duckdns
DUCKDNS_TOKEN="the-guid-that-duckdns-gives-you"
```
$CONFIGS_DIR/traefik.yml
```yaml
providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false # CRITICAL: if this is not set, by default everything will be exposed on your hostname. It's not great.
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
    http:
      tls:
        certResolver: "duckdns"
        domains:
          # !!! UPDATE THIS TO YOUR HOSTNAME !!!
          - main: "yourhost.duckdns.org"
            sans:
              - "*.yourhost.duckdns.org"
certificatesResolvers:
  duckdns:
    acme:
      email: something@whatever.com # !!! YOUR EMAIL ADDRESS WITH DUCKDNS !!!
      storage: "/letsencrypt/duckdns-acme.json"
      dnsChallenge:
        provider: duckdns
```
docker-compose.yml
```yaml
volumes:
  faster-whisper-model-cache:
  ollama:

services:
  traefik:
    image: traefik:v3.1
    container_name: traefik
    environment:
      - DUCKDNS_TOKEN=${DUCKDNS_TOKEN}
    labels:
      traefik.enable: true
      traefik.backend: dashboard
      traefik.frontend.rule: Host:dashboard.${YOUR_HOSTNAME}
    ports:
      - 81:80 # I route 80 external inbound at my router to 81 on the machine hosting my self-host stack
      - 443:443
      - 8080:8080 # management dashboard and API, do not forward at the router
    volumes:
      - ${CONFIGS_DIR}/letsencrypt:/letsencrypt # holds all your certificates
      - /var/run/docker.sock:/var/run/docker.sock:ro # lets traefik discover services via labels
      - ${CONFIGS_DIR?}/traefik.yml:/traefik.yml # CRITICAL: config file
    restart: unless-stopped

  faster-whisper-server:
    container_name: faster-whisper-server
    restart: always
    build:
      context: https://github.com/xerootg/faster-whisper-server-rocm.git
      dockerfile: Dockerfile.rocm
    privileged: true # this may not be necessary but I haven't taken the time to determine that
    cap_add:
      - SYS_PTRACE
    security_opt:
      - seccomp=unconfined
    devices:
      - /dev/kfd
      - /dev/dri
    group_add:
      - video
    volumes:
      - faster-whisper-model-cache:/root/.cache/huggingface/hub # cache the already downloaded models for faster startups

  podly_pure_podcasts:
    container_name: podly_pure_podcasts
    restart: always
    build: https://github.com/jdrbc/podly_pure_podcasts.git
    depends_on:
      - openaiproxy
    volumes:
      - ${CONFIGS_DIR?}/podly_pure_podcasts/config.yml:/app/config/config.yml:ro
      - ${CONFIGS_DIR?}/podly_pure_podcasts/system_prompt.txt:/app/config/system_prompt.txt:ro
    labels:
      traefik.enable: true
      traefik.http.routers.cleaner.rule: Host(`cleaner.${YOUR_HOSTNAME}`)
      traefik.http.services.cleaner.loadbalancer.server.port: 5001
      traefik.http.routers.cleaner.tls: true
      traefik.http.routers.cleaner.tls.certresolver: ${YOUR_RESOLVER}
      traefik.http.routers.cleaner.tls.domains[0].main: cleaner.${YOUR_HOSTNAME}

  ollama:
    restart: always
    container_name: ollama
    image: ollama/ollama:rocm
    volumes:
      - ollama:/root/.ollama
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    environment:
      - 'HSA_OVERRIDE_GFX_VERSION=11.0.0'

  openaiproxy:
    container_name: openaiproxy
    image: nginx:latest
    restart: always
    depends_on:
      - faster-whisper-server
      - ollama
    command: ["/bin/sh", "-c", "echo \"$$NGINX_CONFIG\" > /etc/nginx/nginx.conf && nginx -g 'daemon off;'"]
    environment:
      NGINX_CONFIG: |
        events { }
        http {
          client_max_body_size 100M;
          upstream faster_whisper {
            server faster-whisper-server:3456;
          }
          upstream ollama {
            server ollama:11434;
          }
          server {
            listen 80;
            location /v1/audio/translations {
              proxy_pass http://faster_whisper;
              proxy_read_timeout 30m;
              proxy_connect_timeout 30m;
              proxy_send_timeout 30m;
            }
            location /v1/audio/transcriptions {
              proxy_pass http://faster_whisper;
              proxy_read_timeout 30m;
              proxy_connect_timeout 30m;
              proxy_send_timeout 30m;
            }
            location / {
              proxy_pass http://ollama;
              proxy_read_timeout 30m;
              proxy_connect_timeout 30m;
              proxy_send_timeout 30m;
            }
          }
        }
```
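The nginx config above multiplexes one OpenAI-style base URL across two backends, so podly only needs a single `openai_base_url`. The routing rule boils down to a prefix match on the request path (illustrative sketch only; the hostnames are the compose service names):

```python
# Illustration of the routing rule the nginx config above encodes:
# the two audio endpoints go to faster-whisper-server, everything else
# (e.g. chat completions for ad scoring) falls through to ollama.

WHISPER_PATHS = ("/v1/audio/translations", "/v1/audio/transcriptions")

def upstream_for(path):
    """Return the backend that would serve a given request path."""
    if path.startswith(WHISPER_PATHS):
        return "faster-whisper-server:3456"
    return "ollama:11434"

print(upstream_for("/v1/audio/transcriptions"))  # faster-whisper-server:3456
print(upstream_for("/v1/chat/completions"))      # ollama:11434
```

Like nginx's `location` blocks here, this is plain prefix matching, so sub-paths of the audio endpoints route the same way.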
First, build the containers:

```shell
docker compose build
```

Second, bring up ollama:

```shell
docker compose up -d ollama
```

Then cache the gemma2 model:

```shell
docker compose exec -it ollama ollama pull gemma2:27b
```

Now start faster-whisper-server and wait for it to be "up" in the logs:

```shell
docker compose up -d faster-whisper-server && docker compose logs -f faster-whisper-server
```

wait for:

```
faster-whisper-server | INFO: Started server process [1]
faster-whisper-server | INFO: Waiting for application startup.
faster-whisper-server | 2024-11-04 19:27:24,649:INFO:faster_whisper_server.logger:load_model:Loaded medium.en loaded in 4.19 seconds
faster-whisper-server | INFO: Application startup complete.
faster-whisper-server | INFO: Uvicorn running on http://0.0.0.0:3456 (Press CTRL+C to quit)
```

and then ctrl+c. You're ready to start the whole stack:

```shell
docker compose up -d
```

The cleaner.yourdomain.duckdns.org hostname might take up to 60 seconds to resolve; you can check what the deal is by looking at traefik's admin page:

```
http://docker_host_ip:8080
```

Good luck!
Do you have any suggestions for a total newb who would like to get this running in docker on their synology NAS? I can follow instructions but do not have the best grasp of how this all works.