subdigital / kv-downloader

A utility that automates a workflow for downloading individual tracks from Karaoke Version.
MIT License

Consistent timeout error on a specific page #7

Open vbrilon opened 2 days ago

vbrilon commented 2 days ago

Trying to download all the tracks from here: https://www.karaoke-version.com/custombackingtrack/creedence-clearwater-revival/bad-moon-rising.html

The downloader consistently dies while downloading the next-to-last track in the list. It gets "Rhythm Electric Guitar", but then shows this error and exits before downloading "Lead Vocal".

Is there additional debug info I can give to help with this?


2024-11-21T06:25:11.922501Z DEBUG headless_chrome::browser::tab: Waiting for element with selector: ".begin-download"
thread 'main' panicked at src/tasks/download_song.rs:106:18:
Timed out waiting for download modal.: The event waited for never came
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2024-11-21T06:25:42.005548Z  INFO headless_chrome::browser: Dropping browser
2024-11-21T06:25:42.105280Z  INFO headless_chrome::browser::process: Killing Chrome. PID: 79759
2024-11-21T06:25:42.107916Z  WARN headless_chrome::browser::transport: Couldn't send browser an event: "TargetInfoChanged(TargetInfoChangedEvent { params: TargetInfoChangedEventParams { target_info: TargetInfo { target_id: \"8470417807B94375933B0FB44169CAA7\", Type: \"page\", title: \"Download your instrumental songs in MP3 format - Custom Backing Tracks - Karaoke Version\", url: \"https://www.karaoke-version.com/\", attached: false, opener_id: None, can_access_opener: false, opener_frame_id: None, browser_"
SendError { .. }
2024-11-21T06:25:42.107941Z  INFO headless_chrome::browser::transport: Shutting down message handling loop
2024-11-21T06:25:42.111276Z  INFO headless_chrome::browser::transport::web_socket_connection: Sending shutdown message to message handling loop
2024-11-21T06:25:42.113944Z  INFO headless_chrome::browser::transport: cleared listeners, I think
2024-11-21T06:25:42.114216Z  INFO headless_chrome::browser::tab: finished tab's event handling loop
2024-11-21T06:25:42.114522Z  INFO headless_chrome::browser::tab: finished tab's event handling loop
2024-11-21T06:25:42.115084Z  INFO headless_chrome::browser::tab: finished tab's event handling loop
2024-11-21T06:25:42.124676Z  WARN headless_chrome::browser::process: Failed to close temporary directory: No such file or directory (os error 2) at path "/var/folders/0q/w2h32dsn2vx2r8f_0czhw90r0000gn/T/rust-headless-chrome-profile0U3d58"
2024-11-21T06:25:42.124841Z  INFO headless_chrome::browser::transport: dropping transport
2024-11-21T06:25:42.124844Z  INFO headless_chrome::browser::transport::web_socket_connection: dropping websocket connection

vbrilon commented 2 days ago

I can reproduce this error consistently on pretty much every page now -- it seems that the site triggers some sort of throttling and the downloader times out maybe?

subdigital commented 2 days ago

I do get this type of thing from time to time and I'm not sure how to make it more robust. I don't think increasing the wait time helps.

A few things to try:

subdigital commented 2 days ago

I just bought that song and was able to run through it successfully, so there's nothing wrong with that particular song page.

vbrilon commented 16 hours ago

Thanks for trying that -- you're right, sometimes the songs do come through, but once it starts happening it keeps happening for a while, until I walk away for a few hours. Smells to me like they're throttling something on their end. I strongly suspect the actual processing + download of the file is where the slowdown happens, but I'll spend some time with it, do some network profiling, and see if I can figure out what's actually going on.
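
If it really is throttling, one thing that might be worth trying is retrying the modal wait with a growing pause instead of panicking on the first timeout. This is only a rough sketch using the standard library: `retry_with_backoff` is a hypothetical helper, not something in kv-downloader, and the closure stands in for whatever currently waits on `.begin-download`. Pauses this short won't help if the throttling lasts hours.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: runs `op` up to `max_attempts` times, doubling the
/// pause after each failure. Returns the first success, or the last error
/// once the attempts are used up.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "max_attempts must be at least 1");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

The caller would pass a closure that performs the wait and returns its `Result`, then decide what to do with the final error (skip the track, log it, or still bail out) rather than hitting the panic at `download_song.rs:106`.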

I also notice that as the problem gets worse, I end up with more tracks missing and/or tracks with .crdownload extensions, which feels to me like the timeout is happening while it's downloading.
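
Since Chrome keeps the `.crdownload` suffix on a file until the download finishes, a check along these lines could at least confirm whether the tool is moving on (or exiting) while a download is still in flight. This is a sketch under assumptions: `has_partial_downloads` and `wait_for_downloads` are hypothetical names, and I'm assuming the downloads land in a single known directory.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Returns true if `dir` contains any file whose extension is `crdownload`,
/// the suffix Chrome uses for downloads that are still in progress.
fn has_partial_downloads(dir: &Path) -> io::Result<bool> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|ext| ext.to_str()) == Some("crdownload") {
            return Ok(true);
        }
    }
    Ok(false)
}

/// Polls `dir` until no `.crdownload` files remain or `timeout` elapses.
/// Returns Ok(true) if the directory is clean, Ok(false) on timeout.
fn wait_for_downloads(dir: &Path, timeout: Duration, poll: Duration) -> io::Result<bool> {
    let deadline = Instant::now() + timeout;
    loop {
        if !has_partial_downloads(dir)? {
            return Ok(true);
        }
        if Instant::now() >= deadline {
            return Ok(false);
        }
        sleep(poll);
    }
}
```

Calling `wait_for_downloads` after each track, and again before the browser is dropped, would show whether the missing tracks line up with downloads that never finished.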