Closed goose-ws closed 10 months ago
Thanks for the issue. I can't replicate your problem with the channel you specified above; however, as it's been reported by multiple people it's safe to assume it's a real issue. If I were to guess, the formats available for a media item when it is first indexed by yt-dlp (called via the tubesync worker) are limited, resulting in no match. These formats are then updated to show the full list at some unknown time later. The two situations in which this appears to occur are live streams that have only just been converted to VODs, where the media hasn't finished transcoding all the formats yet, and older items where (total wild guess here with no evidence, but it seems logical) I'm assuming all but the most common formats drop out of some cache and then get restored when viewed.
I'll look into a command to refresh the metadata for items marked as having failed to find a match; that's probably a reasonably hacky fix for now. I did investigate pausing media and then marking them to refresh metadata after 24h, but the UI was overly confusing.
ID: 616 , Premium (1920x1080), fps:24.0, video:vp09.00.40.08 @2959.841k (matched)
@meeb would the issue here be that it is matching Premium quality that isn't able to be downloaded? I think I am running into this issue
Yes, I've no experience to date with some formats being marked as premium but that could well be it. If some 1080p formats are now being paywalled at YouTube then this would need specific handling to avoid them. Thanks for the suggestion. Luckily, if that is the case, just an "ignore formats with 'premium' labels" rule should be quite easy to add.
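A rough sketch of what such an "ignore premium formats" filter could look like. This is not TubeSync's actual code: it assumes each format is a dict and that the "Premium" label is exposed in a `format_note`-style field, and the sample format entries are made up for illustration.

```python
def drop_premium_formats(formats):
    """Return only the formats whose note does not mention 'premium'."""
    kept = []
    for fmt in formats:
        # Assumption: the label lives in "format_note"; it may be missing or None.
        note = str(fmt.get("format_note") or "")
        if "premium" in note.lower():
            continue
        kept.append(fmt)
    return kept


# Illustrative data only, loosely modelled on the matched format line above.
formats = [
    {"format_id": "616", "format_note": "Premium", "height": 1080},
    {"format_id": "248", "format_note": "1080p", "height": 1080},
    {"format_id": "140", "format_note": None},
]
print([f["format_id"] for f in drop_premium_formats(formats)])  # ['248', '140']
```

Filtering before matching, rather than after, would let the matcher fall through to the next best non-premium format instead of reporting no match.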
I have the same issue - so far I've tried 3 random videos and none of them downloaded automatically, all failing with the described issue - one example would be this video https://www.youtube.com/watch?v=FAzonLpYhRA&list=PLi_p9-Lox3pc4Krym0t1fI1LqtrgW32pd&index=3&t=1s
I doubt that it has anything to do with VOD - this video is 4 years old.
TubeSync version 0.13.2 yt-dlp version 2023.11.16 FFmpeg version N-112750-g6d60cc7baf-20231114.
What bugs me from the logs is this line:
2023-11-29 14:47:44,614 [tubesync/INFO] Media: Download / FAzonLpYhRA has no published date, marking to be skipped
What is this "has no published date"?
"has no published date" means the upload_date isn't set or has an invalid value in the media item's metadata returned from YouTube. There appears to be some variance in the metadata causing it to fail to parse for some people. I've not experienced this myself, so I'm relying on people testing things to find out what the problem is. See https://github.com/meeb/tubesync/issues/183 etc.
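To illustrate the kind of check involved, here is a minimal sketch, assuming the metadata is a dict and that `upload_date` is the `YYYYMMDD` string yt-dlp normally reports. `parse_upload_date` is a hypothetical helper name, not TubeSync's actual function.

```python
from datetime import datetime


def parse_upload_date(metadata):
    """Return a date parsed from metadata['upload_date'] (YYYYMMDD), or None."""
    raw = metadata.get("upload_date")
    # Missing, non-string, or oddly shaped values count as "no published date".
    if not isinstance(raw, str) or len(raw) != 8 or not raw.isdigit():
        return None
    try:
        return datetime.strptime(raw, "%Y%m%d").date()
    except ValueError:
        # Eight digits that are not a real calendar date, e.g. 20230230.
        return None


print(parse_upload_date({"upload_date": "20231129"}))  # 2023-11-29
print(parse_upload_date({}))                           # None
```

Any item for which this returns None would end up on the "has no published date, marking to be skipped" path.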
I have this same exact issue on every youtube channel.
@mitchross updated to :latest? Check the version in your web UI footer; it should read 0.13.3. If not, update it.
On the latest version 0.13.3
Anything in your container logs? That looks like a complete failure to download any metadata at all.
I have to force the codecs above.
The defaults do not work!
This never used to be an issue.
Then every time I add a source I have to force reset all tasks.
I've set this app up again over 10 times now. There is a bug somewhere in this app.
@mitchross Those are not your container logs, just some random screenshots. Container logs are the output from docker logs tubesync, for example. Look for actual errors in the logs (don't share the whole massive log, just the errors). Your descriptions and screenshots are not helpful in diagnosing what is wrong with your setup. Specifically choosing a codec while the fallback is set to "get next best" does nothing: if your selected codec is unavailable it'll choose the next best codec anyway, so that's not what is "fixing" your downloads. Most likely it's just the task resetting that fixes it. You can't debug this without seeing the actual error in the logs.
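For anyone unsure how to pull just the errors out of a large log, a small sketch follows. It assumes the `[tubesync/LEVEL]` prefix seen in the logs in this thread and the usual indented Python traceback layout; `extract_errors` is a hypothetical helper, not part of TubeSync.

```python
import re

# Matches TubeSync-style level prefixes such as "[tubesync/ERROR]".
ERROR_LEVEL = re.compile(r"\[tubesync/(ERROR|CRITICAL)\]")


def extract_errors(lines):
    """Keep ERROR-level lines plus complete Python tracebacks."""
    kept = []
    in_traceback = False
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith("Traceback (most recent call last):"):
            in_traceback = True
            kept.append(text)
        elif in_traceback:
            kept.append(text)
            # The traceback ends at the first non-indented line (the exception).
            if text and not text[0].isspace():
                in_traceback = False
        elif ERROR_LEVEL.search(text):
            kept.append(text)
    return kept
```

One way to use it would be to save the output of docker logs tubesync to a file and pass the open file object to `extract_errors`, then share only what it returns.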
Yeah, I know how docker logs work. I am a developer myself. The reason I'm showing the screenshots is that they are the steps that work to fix the issue at times for me.
2023-12-08 13:59:08,111 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 13:59:08,292 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 13:59:08,465 [tubesync/WARNING] [youtube:tab] Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U. Giving up after 3 retries
2023-12-08 13:59:08,626 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,626 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 13:59:08,708 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,708 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 13:59:08,803 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,804 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 13:59:08,883 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,884 [tubesync/ERROR] ERROR: [youtube:tab] @mitchross2852: Unable to download API page: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
Rescheduling Index media from source "mitchross2852"
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/background_task/tasks.py", line 43, in bg_runner
func(*args, **kwargs)
File "/app/sync/tasks.py", line 168, in index_source_task
videos = source.index_media()
^^^^^^^^^^^^^^^^^^^^
File "/app/sync/models.py", line 561, in index_media
response = indexer(self.index_url)
^^^^^^^^^^^^^^^^^^^^^^^
File "/app/sync/youtube.py", line 60, in get_media_info
raise YouTubeError(f'Failed to extract_info for "{url}": No metadata was '
sync.youtube.YouTubeError: Failed to extract_info for "https://www.youtube.com/c/@mitchross2852/videos": No metadata was returned by youtube-dl, check for error messages in the logs above. This task will be retried later with an exponential backoff.
Rescheduling task Index media from source "mitchross2852" for 0:01:26 later at 2023-12-08 14:00:34.889700+00:00
Rescheduling Index media from source "whistlindiesel"
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/background_task/tasks.py", line 43, in bg_runner
func(*args, **kwargs)
File "/app/sync/tasks.py", line 168, in index_source_task
videos = source.index_media()
^^^^^^^^^^^^^^^^^^^^
File "/app/sync/models.py", line 561, in index_media
response = indexer(self.index_url)
^^^^^^^^^^^^^^^^^^^^^^^
File "/app/sync/youtube.py", line 60, in get_media_info
raise YouTubeError(f'Failed to extract_info for "{url}": No metadata was '
sync.youtube.YouTubeError: Failed to extract_info for "https://www.youtube.com/c/@whistlindiesel/videos": No metadata was returned by youtube-dl, check for error messages in the logs above. This task will be retried later with an exponential backoff.
Rescheduling task Index media from source "whistlindiesel" for 0:04:21 later at 2023-12-08 14:18:00.999037+00:00
2023-12-08 14:15:39,052 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,254 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,431 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 14:15:39,617 [tubesync/WARNING] [youtube:tab] Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U. Giving up after 3 retries
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 14:15:40,033 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:40,035 [tubesync/ERROR] ERROR: [youtube:tab] @mitchross2852: Unable to download API page: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
Rescheduling Index media from source "mitchross2852"
Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/background_task/tasks.py", line 43, in bg_runner
func(*args, **kwargs)
File "/app/sync/tasks.py", line 168, in index_source_task
videos = source.index_media()
^^^^^^^^^^^^^^^^^^^^
File "/app/sync/models.py", line 561, in index_media
response = indexer(self.index_url)
^^^^^^^^^^^^^^^^^^^^^^^
File "/app/sync/youtube.py", line 60, in get_media_info
raise YouTubeError(f'Failed to extract_info for "{url}": No metadata was '
sync.youtube.YouTubeError: Failed to extract_info for "https://www.youtube.com/c/@mitchross2852/videos": No metadata was returned by youtube-dl, check for error messages in the logs above. This task will be retried later with an exponential backoff.
Rescheduling task Index media from source "mitchross2852" for 0:21:41 later at 2023-12-08 14:37:21.044601+00:00
Logs from 12/7/2023, 2:37:01 PM
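As a side note on the "Rescheduling task ... for 0:01:26 later" lines above: the three logged delays are consistent with a retry delay of attempts ** 4 + 5 seconds. That formula is an assumption about the background task queue's retry policy, not something confirmed in this thread, and the attempt counts (3, 4, 6) are inferred from the delays rather than read from the logs.

```python
from datetime import timedelta


def retry_delay(attempts):
    """Delay before the next retry, assuming attempts ** 4 + 5 seconds."""
    return timedelta(seconds=attempts ** 4 + 5)


for attempts in (3, 4, 6):
    print(attempts, retry_delay(attempts))
# 3 0:01:26
# 4 0:04:21
# 6 0:21:41
```

If the assumption holds, a source that keeps returning 404s is retried at rapidly growing intervals, which is why a manual task reset can appear to "fix" things sooner than waiting would.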
You could always try being less confrontational while getting free support for some open source software.
Anyway, thanks to the container logs your issue is nothing to do with codecs. The issue is you've added channels which aren't actually channels but handles or username aliases. Internally at YouTube some "channels" that are usernames aren't actually channels but are meta-groups of other channels and playlists (sometimes) or just aliases to internal channel IDs. This is evident because your usage of channel names 404s when TubeSync attempts to access them as channels (e.g. https://www.youtube.com/c/@whistlindiesel/videos returns a 404). While https://www.youtube.com/@whistlindiesel/videos (removing the /c/) works, this doesn't really seem to be a channel either but an alias to one.
The current suggested workaround is to use a tool like:
https://www.streamweasels.com/tools/youtube-channel-id-and-user-id-convertor/
to convert the @handles / @usernames into channel IDs, then "add channel by ID" in TubeSync using the ID. For example the ID for @whistlindiesel is UCdqp0KK_Io7TwK5cJMBvB0Q (https://www.youtube.com/channel/UCdqp0KK_Io7TwK5cJMBvB0Q).
Convert your @handle "channels" to IDs and add those; they should work fine regardless of your codec choice. While it is technically possible to locate the IDs automatically, and there is an experimental private branch with the feature, YouTube kept changing stuff while I was building it, so it's not stable or reliable enough to be useful at the moment.
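To make the URL difference concrete, here is a small sketch that builds the index URL from either kind of key. It is illustrative only and not how TubeSync builds its URLs; `channel_videos_url` is a hypothetical helper, and the channel ID pattern ("UC" plus 22 URL-safe characters) is an assumption based on how IDs like the one above look. It does not resolve a handle to an ID.

```python
import re

# Assumed shape of a channel ID: "UC" followed by 22 URL-safe characters.
CHANNEL_ID = re.compile(r"^UC[A-Za-z0-9_-]{22}$")


def channel_videos_url(key):
    """Build a /videos index URL from a channel ID or an @handle."""
    key = key.strip()
    if CHANNEL_ID.match(key):
        return f"https://www.youtube.com/channel/{key}/videos"
    if key.startswith("@"):
        # Handles sit at the site root; prefixing them with /c/ is what 404s.
        return f"https://www.youtube.com/{key}/videos"
    raise ValueError(f"not a channel ID or @handle: {key!r}")


print(channel_videos_url("UCdqp0KK_Io7TwK5cJMBvB0Q"))
print(channel_videos_url("@whistlindiesel"))
```

The first call yields the /channel/ form that "add channel by ID" corresponds to; the second yields the /@handle form that was noted above as working when the /c/ is removed.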
You came at me hard, being confrontational. I gave it back to you.
I'll find another project that works out of the box without having to do all these workarounds.
I wasn't confrontational at all; I just asked for the information needed to give you assistance and was verbose when you supplied something other than what I asked for. I mean, I'm not entirely sure what else you are expecting here? You're welcome, I guess? The issue with channel IDs is YouTube itself changing, but sure, feel free to not use the free software and ignore the free assistance that will make it work for you.
While I'm the original author of this issue, I really don't have a stake in any of the comments below my own, I built my own tool to get around this issue for the time being. That said, as an uninvolved third party, @meeb does not appear confrontational in any of his responses. I mention this as a reality check to reconsider your stance of hostility, he comes across as just trying to be helpful.
Cheers, @goose-ws - I've created a stand-alone issue which tracks your original report better now it's confirmed what your original issue was caused by. Even if you don't use TubeSync further your report was useful, thanks. https://github.com/meeb/tubesync/issues/446 for reference.
Thanks, good to know @meeb. My tool is just a script with some hard coded variables and very few sanity checks, so I'll probably switch back to your more polished tool in the soon-ish future. Thanks for sharing your hard work with us all, and offering support too. Cheers.
Coming from my experience on Tube-Archivist, I suspect this issue could be related to rate-limits, as changing the IP address would often allow the download to go through. Those affected by this issue should confirm this hypothesis.
As taken from the Docker logs of a user above, special attention to the 404 errors:
2023-12-08 14:15:39,052 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,254 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,431 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 14:15:39,617 [tubesync/WARNING] [youtube:tab] Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U. Giving up after 3 retries
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
So I've checked out issues #341, #336, and #287; however, I'm not able to easily find a solution to what I believe is a bug.
Version info up front: TubeSync: 0.12.1 yt-dlp: 2023.07.06 FFmpeg: N-111432-g374184a4dc-20230714
Running in Docker, happy to provide a copy of the compose if wanted, but I don't think it's relevant.
I have a channel set up in the following way:
When I add this task, 10 videos fail:
https://www.youtube.com/watch?v=ZEffdT_eRC8 https://www.youtube.com/watch?v=F4we73GHH9k https://www.youtube.com/watch?v=1pKdwyW7exk https://www.youtube.com/watch?v=Jck0nkixdiE https://www.youtube.com/watch?v=0rryLHzfh1Y https://www.youtube.com/watch?v=OyuL5biOQ94 https://www.youtube.com/watch?v=QAFfS5Ev15I https://www.youtube.com/watch?v=K-QZtk1bhWI https://www.youtube.com/watch?v=81GDlmDa-uQ https://www.youtube.com/watch?v=fIoUpuUbFgg
All of them with the error message:
Media cannot be downloaded because it has no formats which match the source requirements.
To narrow in on one example -- https://www.youtube.com/watch?v=fIoUpuUbFgg -- specifically:
I get the following available formats:
And TubeSync gives me the following Matched formats:
Manually marking this media to be skipped and then unmarking it, as described in #336, does trigger the media to download correctly. But I am already on the latest docker image, with the most recent version of yt-dlp.
Is it just a yt-dlp issue that I have to live with until they release a new version, or could there be a workaround?
Edit: If there's a way to do this via docker exec, perhaps a workaround could be a command line execution that manually skips and then unskips all media which has failed due to no match? Not the ideal solution, but it could help for now? I'm not sure if there's a command baked into /app/manage.py that could do that.