Open costaht opened 2 hours ago
Hey there! Thanks for the report (:
I'd be surprised if this was the case, but it's possible! I have concurrency limited to two workers for indexing and two for downloading, for a total of four concurrent yt-dlp operations (see here). You can decrease this to one worker for each (two total) by setting YT_DLP_WORKER_CONCURRENCY=1, but this usually isn't necessary. I can't replicate this locally and I haven't received any other reports of this (despite this logic being unchanged for months), but I would be curious to see your output of ps aux | grep runner.
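For anyone who wants to try the lower limit, here is a minimal sketch of an env file carrying that setting. Only the variable name and value come from this thread; the file name pinchflat.env and the way it gets passed to the container are assumptions, so adapt them to however you already start Pinchflat.

```shell
# pinchflat.env (hypothetical file name) holding the one setting discussed above.
# Per the comment above: one indexing worker + one download worker,
# i.e. two concurrent yt-dlp operations instead of four.
YT_DLP_WORKER_CONCURRENCY=1

# Assumed usage, adjust to your own setup:
#   docker run --env-file ./pinchflat.env <your usual Pinchflat arguments>
```

The container has to be recreated for a changed environment variable to take effect.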
High CPU usage is often caused by ffmpeg assembling a video, especially if you are using the sponsorblock integration. Within reason this is both expected and unavoidable. Getting back to me with the result of that command above will help!
As for indexing, it is intentional and expected that indexing covers the entire channel and ignores filters (see here). I recognize this isn't strictly the most efficient approach, but as I see it the benefits outweigh the downsides. Besides, indexing itself should use fairly modest resources compared to actually downloading and assembling the videos.
> I have concurrency limited to two workers
I believe you. I said "all" because two is all I'm using for my tests, so I guess in this case we're both right haha
> I would be curious to see your output of ps aux | grep runner
:/app$ ps aux | grep runner
1000 236 0.0 0.0 6332 2176 pts/0 S+ 15:37 0:00 grep runner
:/app$ ps aux | grep pinch
1000 1 25.7 14.4 7124572 2300392 ? Ssl 15:35 0:04 /app/erts-14.2.5/bin/beam.smp -- -root /app -bindir /app/erts-14.2.5/bin -progname erl -- -home / -- -noshell -s elixir start_cli -mode embedded -setcookie WS725S7NPLJZHOWJR6AX7DOFXBG7VGXHHLEJC22GMMGAWF6I6UGQ==== -sname pinchflat -config /app/releases/2024.10.25/sys -boot /app/releases/2024.10.25/start -boot_var RELEASE_LIB /app/lib -- -extra --no-halt
1000 206 0.0 0.0 6932 3328 ? Ss 15:35 0:00 bash /app/lib/pinchflat-2024.10.25/priv/cmd_wrapper.sh /usr/local/bin/yt-dlp https://youtube.com/@letsdig18/videos --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/f26b3fedbb90b6c3a1a12cc4184739096855490d1c5c3236c1124714c7b8fe17.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 207 0.0 0.0 6932 3328 ? Ss 15:35 0:00 bash /app/lib/pinchflat-2024.10.25/priv/cmd_wrapper.sh /usr/local/bin/yt-dlp https://youtube.com/@Abom79 --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/a41d5747dac0b3dd42e066bc38953e15973ae50b122aa0af892c7931b44bc08b.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 208 18.8 0.4 74280 66120 ? S 15:35 0:01 python3 /usr/local/bin/yt-dlp https://youtube.com/@letsdig18/videos --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/f26b3fedbb90b6c3a1a12cc4184739096855490d1c5c3236c1124714c7b8fe17.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 209 0.0 0.0 6932 1312 ? S 15:35 0:00 bash /app/lib/pinchflat-2024.10.25/priv/cmd_wrapper.sh /usr/local/bin/yt-dlp https://youtube.com/@letsdig18/videos --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/f26b3fedbb90b6c3a1a12cc4184739096855490d1c5c3236c1124714c7b8fe17.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 210 0.0 0.0 6932 1444 ? S 15:35 0:00 bash /app/lib/pinchflat-2024.10.25/priv/cmd_wrapper.sh /usr/local/bin/yt-dlp https://youtube.com/@letsdig18/videos --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/f26b3fedbb90b6c3a1a12cc4184739096855490d1c5c3236c1124714c7b8fe17.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 211 18.1 0.4 74320 66484 ? S 15:35 0:01 python3 /usr/local/bin/yt-dlp https://youtube.com/@Abom79 --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/a41d5747dac0b3dd42e066bc38953e15973ae50b122aa0af892c7931b44bc08b.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 212 0.0 0.0 6932 1312 ? S 15:35 0:00 bash /app/lib/pinchflat-2024.10.25/priv/cmd_wrapper.sh /usr/local/bin/yt-dlp https://youtube.com/@Abom79 --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/a41d5747dac0b3dd42e066bc38953e15973ae50b122aa0af892c7931b44bc08b.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
1000 213 0.0 0.0 6932 1444 ? S 15:35 0:00 bash /app/lib/pinchflat-2024.10.25/priv/cmd_wrapper.sh /usr/local/bin/yt-dlp https://youtube.com/@Abom79 --simulate --skip-download --ignore-no-formats-error --no-warnings --print-to-file %(.{id,title,live_status,webpage_url,description,aspect_ratio,duration,upload_date,timestamp,playlist_index})j /tmp/pinchflat/data/a41d5747dac0b3dd42e066bc38953e15973ae50b122aa0af892c7931b44bc08b.json --windows-filenames --quiet --cache-dir /tmp/pinchflat/data/yt-dlp-cache
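Most lines in a listing like this are bash wrapper shells (cmd_wrapper.sh) rather than yt-dlp itself, so counting lines overstates the concurrency. A rough sketch for counting only the python3 yt-dlp processes follows; the helper name count_ytdlp is made up for illustration, and it assumes the usual `ps aux` column layout where the command is field 11.

```shell
# Count real yt-dlp interpreter processes in `ps aux`-style text read from stdin.
# Wrapper shells (bash .../cmd_wrapper.sh) and the grep process are skipped,
# because only lines whose command is python3 followed by a yt-dlp path match.
count_ytdlp() {
  awk '$11 ~ /python3$/ && $12 ~ /\/yt-dlp$/ { n++ } END { print n + 0 }'
}

# Usage inside the container:
#   ps aux | count_ytdlp
```

Applied to the listing above, only PIDs 208 and 211 match, one per channel being indexed.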
> High CPU usage is often caused by ffmpeg assembling a video
Actually these sources have Download Media disabled
> as I see it the benefits outweigh the downsides
Ok then, I'll keep going with my tests and I'll let you know if I find anything odd. It would be nice if we had a chat (IRC, Telegram or Discord) where we users could talk about our experiences and stop cluttering the repo with issues that are actually misunderstandings.
Describe the bug
I noticed that as soon as I start the container, Pinchflat starts indexing all the sources simultaneously, and it does this for the entire channel, which consumes a lot of CPU.
To Reproduce
Steps to reproduce the behavior:
ps -fe | grep pinchflat
Expected behavior
I believe Pinchflat should scrape only one channel at a time and only up to the Download Cutoff Date. That can be achieved by using --break-on-reject --dateafter now-3days
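To make the proposal concrete, here is a sketch that only builds and prints the suggested command, without running it. It is the reporter's idea, not how Pinchflat indexes today (the maintainer's reply says indexing intentionally covers the whole channel). The helper name build_index_cmd is hypothetical, the first two flags are copied from the ps output in this thread, and stopping early is only correct if the channel is listed newest-first, which is an assumption.

```shell
# Print (do not run) a yt-dlp indexing command that stops at the first
# video older than the cutoff. The cutoff defaults to the value from the issue.
build_index_cmd() {
  url=$1
  cutoff=${2:-now-3days}
  printf 'yt-dlp --simulate --skip-download --break-on-reject --dateafter %s %s\n' \
    "$cutoff" "$url"
}

# Usage:
#   build_index_cmd https://youtube.com/@Abom79
#   build_index_cmd https://youtube.com/@Abom79 now-1week
```

Check the flags against your installed yt-dlp version before relying on them, since option names have changed between releases.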