meeb / tubesync

Syncs YouTube channels and playlists to a locally hosted media server
GNU Affero General Public License v3.0

Media cannot be downloaded because it has no formats which match the source requirements. #397

Closed goose-ws closed 10 months ago

goose-ws commented 1 year ago

So I've checked out issues #341, #336, and #287; however, I'm not able to easily find a solution to what I believe is a bug.

Version info up front: TubeSync: 0.12.1 yt-dlp: 2023.07.06 FFmpeg: N-111432-g374184a4dc-20230714

Running in Docker, happy to provide a copy of the compose if wanted, but I don't think it's relevant.

I have a channel set up in the following way:

Config Flag | Value
Type | YouTube channel by ID
Name | AcousticTrench
Media items | 55
Key | UCV-hwVczs0GQRApxZS4TUNA
Directory | /downloads/AcousticTrench
Media format | {yyyy_mm_dd}_{source}_{title}_{key}_{format}.{ext}
Example filename | 2023-08-07_acoustictrench_some-media-title-name_SoMeUnIqUiD_2160p-vp9-opus-60fps-hdr.mkv
Index schedule | Every 24 hours
Download media? | Yes
Created | 2023-08-07 14:31:20
Last crawl | 2023-08-07 14:31:24
Source resolution | 2160p (4K)
Source video codec | VP9
Source audio codec | OPUS
Prefer 60FPS | Yes
Prefer HDR | Yes
Output extension | mkv
Fallback | Get next best resolution or codec instead
Copy thumbnails | Yes
Write NFO | No
Write JSON | Yes
Delete old media | No, keep forever
UUID | 9a387cf2-4a31-41d3-b16f-ffb86bbbe4ed
Embed thumbnail | Yes
Embed metadata | Yes
SponsorBlock | Yes

When I add this task, 10 videos fail:

https://www.youtube.com/watch?v=ZEffdT_eRC8
https://www.youtube.com/watch?v=F4we73GHH9k
https://www.youtube.com/watch?v=1pKdwyW7exk
https://www.youtube.com/watch?v=Jck0nkixdiE
https://www.youtube.com/watch?v=0rryLHzfh1Y
https://www.youtube.com/watch?v=OyuL5biOQ94
https://www.youtube.com/watch?v=QAFfS5Ev15I
https://www.youtube.com/watch?v=K-QZtk1bhWI
https://www.youtube.com/watch?v=81GDlmDa-uQ
https://www.youtube.com/watch?v=fIoUpuUbFgg

All of them with the error message: Media cannot be downloaded because it has no formats which match the source requirements.

To narrow in on one example -- https://www.youtube.com/watch?v=fIoUpuUbFgg -- specifically:

I get the following available formats:

ID: sb2
ID: sb1
ID: sb0
ID: 233
ID: 234
ID: 599 , audio:mp4a.40.5 @30.874k / 22050Hz
ID: 600 , audio:opus @39.49k / 48000Hz
ID: 139 , audio:mp4a.40.5 @48.854k / 22050Hz
ID: 249 , audio:opus @57.617k / 48000Hz
ID: 250 , audio:opus @76.29k / 48000Hz
ID: 140 , audio:mp4a.40.2 @129.543k / 44100Hz
ID: 251 , audio:opus @147.928k / 48000Hz (matched)
ID: 17 , 144p (176x144), fps:12, video:mp4v.20.3 @73.459k , audio:mp4a.40.2 @Nonek / 22050Hz
ID: 597 , 144p (256x144), fps:12, video:avc1.4d400b @33.664k
ID: 602 , (256x144), fps:12.0, video:vp09.00.10.08 @80.032k
ID: 598 , 144p (256x144), fps:12, video:vp9 @24.613k
ID: 394 , 144p (256x144), fps:24, video:av01.0.00M.08 @62.751k
ID: 269 , (256x144), fps:24.0, video:avc1.4D400C @116.032k
ID: 160 , 144p (256x144), fps:24, video:avc1.4D400C @40.326k
ID: 603 , (256x144), fps:24.0, video:vp09.00.11.08 @153.913k
ID: 278 , 144p (256x144), fps:24, video:vp09.00.11.08 @87.293k
ID: 395 , 240p (426x240), fps:24, video:av01.0.00M.08 @80.216k
ID: 229 , (426x240), fps:24.0, video:avc1.4D4015 @153.524k
ID: 133 , 240p (426x240), fps:24, video:avc1.4D4015 @65.691k
ID: 604 , (426x240), fps:24.0, video:vp09.00.20.08 @191.371k
ID: 242 , 240p (426x240), fps:24, video:vp09.00.20.08 @80.806k
ID: 396 , 360p (640x360), fps:24, video:av01.0.01M.08 @145.214k
ID: 230 , (640x360), fps:24.0, video:avc1.4D401E @322.183k
ID: 134 , 360p (640x360), fps:24, video:avc1.4D401E @124.331k
ID: 18 , 360p (640x360), fps:24, video:avc1.42001E @253.442k , audio:mp4a.40.2 @Nonek / 44100Hz
ID: 605 , (640x360), fps:24.0, video:vp09.00.21.08 @389.332k
ID: 243 , 360p (640x360), fps:24, video:vp09.00.21.08 @145.011k
ID: 397 , 480p (854x480), fps:24, video:av01.0.04M.08 @250.471k
ID: 231 , (854x480), fps:24.0, video:avc1.4D401E @429.038k
ID: 135 , 480p (854x480), fps:24, video:avc1.4D401E @191.733k
ID: 606 , (854x480), fps:24.0, video:vp09.00.30.08 @545.729k
ID: 244 , 480p (854x480), fps:24, video:vp09.00.30.08 @216.944k
ID: 22 , 720p (1280x720), fps:24, video:avc1.64001F @490.683k , audio:mp4a.40.2 @Nonek / 44100Hz
ID: 398 , 720p (1280x720), fps:24, video:av01.0.05M.08 @467.62k
ID: 232 , (1280x720), fps:24.0, video:avc1.4D401F @687.546k
ID: 136 , 720p (1280x720), fps:24, video:avc1.4D401F @361.567k
ID: 609 , (1280x720), fps:24.0, video:vp09.00.31.08 @749.798k
ID: 247 , 720p (1280x720), fps:24, video:vp09.00.31.08 @358.9k
ID: 399 , 1080p (1920x1080), fps:24, video:av01.0.08M.08 @802.0k
ID: 270 , (1920x1080), fps:24.0, video:avc1.640028 @2321.183k
ID: 137 , 1080p (1920x1080), fps:24, video:avc1.640028 @1366.135k
ID: 614 , (1920x1080), fps:24.0, video:vp09.00.40.08 @2002.656k
ID: 248 , 1080p (1920x1080), fps:24, video:vp09.00.40.08 @1071.792k
ID: 616 , Premium (1920x1080), fps:24.0, video:vp09.00.40.08 @2959.841k (matched) 

And TubeSync gives me the following Matched formats:

Combined: no match
Audio: 251 (exact match)
Video: 616 (fallback) 

Manually marking this media to be skipped and then unmarking it, as described in #336, does trigger the media to download correctly. But I am already on the latest docker image, with the most recent version of yt-dlp.

Is it just a yt-dlp issue that I have to live with until they release a new version, or could there be a workaround?

Edit:

If there's a way to do this via docker exec, perhaps a workaround could be a command line execution that manually skips and then unskips all media which has failed due to no match? Not the ideal solution, but it could help for now. I'm not sure whether /app/manage.py has a command baked in that could do that.

meeb commented 1 year ago

Thanks for the issue. I can't replicate your problem with the channel you specified above; however, as it's been reported by multiple people, it's safe to assume it's a real issue. If I were to guess, the formats available for a media item when it is first indexed by yt-dlp (called via the tubesync worker) are limited, resulting in no match. These formats are then updated to show the full list at some unknown later time. The two situations in which this appears to occur are live streams that have only just been converted to VODs, where the media hasn't finished transcoding all the formats yet, and older items where (total wild guess here with no evidence, but it seems logical) I'm assuming all but the most common formats drop out of some cache and then get restored when viewed.

I'll look into a command to refresh the metadata for items marked as failed to find a match; that's probably a reasonably hacky fix for now. I did investigate pausing media and then marking them to refresh metadata after 24h, but the UI was overly confusing.

bawitdaba commented 1 year ago

ID: 616 , Premium (1920x1080), fps:24.0, video:vp09.00.40.08 @2959.841k (matched)

@meeb would the issue here be that it is matching a Premium quality format that isn't able to be downloaded? I think I am running into this issue.

meeb commented 1 year ago

Yes, I've no experience to date with some formats being marked as premium, but that could well be it. If some 1080p formats are now being paywalled at YouTube then this would need specific handling to avoid them. Thanks for the suggestion. Luckily, if that is the case, just an "ignore formats with 'premium' labels" rule should be quite easy to add.
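As a rough illustration of the "ignore formats with 'premium' labels" idea, here is a minimal Python sketch. It is not TubeSync's actual matcher: the function names `is_premium` and `pick_video_format` are hypothetical, the dictionary keys are modelled on yt-dlp format dictionaries, and it assumes the "Premium" label shows up in the `format_note` field. The values are copied from the format list posted earlier in this thread.

```python
from typing import Optional

# Values copied from the format list earlier in this thread; the dict layout
# is an assumption modelled on yt-dlp format dictionaries.
FORMATS = [
    {"format_id": "137", "format_note": "1080p", "vcodec": "avc1.640028", "height": 1080, "tbr": 1366.135},
    {"format_id": "248", "format_note": "1080p", "vcodec": "vp09.00.40.08", "height": 1080, "tbr": 1071.792},
    {"format_id": "616", "format_note": "Premium", "vcodec": "vp09.00.40.08", "height": 1080, "tbr": 2959.841},
    {"format_id": "247", "format_note": "720p", "vcodec": "vp09.00.31.08", "height": 720, "tbr": 358.9},
]


def is_premium(fmt: dict) -> bool:
    """True when the format's label mentions 'premium' (case-insensitive)."""
    return "premium" in (fmt.get("format_note") or "").lower()


def pick_video_format(formats: list, codec_prefix: str = "vp09") -> Optional[str]:
    """Return the ID of the best non-premium format for the given codec."""
    candidates = [
        f for f in formats
        if not is_premium(f) and (f.get("vcodec") or "").startswith(codec_prefix)
    ]
    if not candidates:
        return None
    # Prefer the tallest resolution, then the highest total bitrate.
    best = max(candidates, key=lambda f: (f.get("height") or 0, f.get("tbr") or 0.0))
    return best["format_id"]


print(pick_video_format(FORMATS))  # prints 248
```

With the premium entry filtered out, the sketch falls back to format 248 instead of 616 for this video.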

sischnei commented 10 months ago

I have the same issue. So far I've tried 3 random videos and none of them downloaded automatically; all failed with the described issue. One example would be this video: https://www.youtube.com/watch?v=FAzonLpYhRA&list=PLi_p9-Lox3pc4Krym0t1fI1LqtrgW32pd&index=3&t=1s

I doubt that it has anything to do with VOD - this video is 4 years old.

TubeSync version 0.13.2 yt-dlp version 2023.11.16 FFmpeg version N-112750-g6d60cc7baf-20231114.

What bugs me from the logs is this line:

2023-11-29 14:47:44,614 [tubesync/INFO] Media: Download / FAzonLpYhRA has no published date, marking to be skipped

What is this "has no published date"?

meeb commented 10 months ago

"has no published date" means the upload_date isn't set or has an invalid value in the media item's metadata returned from YouTube. There appears to be some variance in the metadata causing it to fail to parse for some people. I've not experienced this myself, so I'm relying on people testing things to find out what the problem is. See https://github.com/meeb/tubesync/issues/183 etc.
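To make the "not set or invalid" cases concrete, here is a small Python sketch of how such a check might look. It is not TubeSync's code: `parse_upload_date` is a hypothetical helper, and it assumes the metadata is a dictionary in which yt-dlp reports `upload_date` as a `YYYYMMDD` string.

```python
from datetime import date, datetime
from typing import Optional


def parse_upload_date(metadata: dict) -> Optional[date]:
    """Return the upload date, or None when it is missing or malformed."""
    raw = metadata.get("upload_date")
    if not isinstance(raw, str):
        # Missing key or a null value: there is no published date to use.
        return None
    try:
        return datetime.strptime(raw, "%Y%m%d").date()
    except ValueError:
        # Present but not in YYYYMMDD form.
        return None


print(parse_upload_date({"upload_date": "20231129"}))  # 2023-11-29
print(parse_upload_date({}))                           # None
```

Any media item for which a check like this returns None would be the kind that gets logged as having no published date.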

mitchross commented 10 months ago

I have this same exact issue on every youtube channel.

meeb commented 10 months ago

@mitchross updated to :latest? Check the version in your web UI footer, it should read 0.13.3. If not, update it.

mitchross commented 10 months ago

On the latest version 0.13.3

image image

meeb commented 10 months ago

Anything in your container logs? That looks like a complete failure to download any metadata at all.

mitchross commented 10 months ago

Anything in your container logs? That looks like a complete failure to download any metadata at all.

image

I have to force the codecs above.

The defaults do not work!

image

This never used to be an issue.

Then every time I add a source I have to force reset all tasks.

image

I've re-set up this app over 10 times now. There is a bug somewhere in this app.

meeb commented 10 months ago

@mitchross Those are not your container logs, just some random screenshots. Container logs are the output from docker logs tubesync, for example. Look for actual errors in the logs (don't share the whole massive log, just the errors). Your descriptions and screenshots are not helpful in diagnosing what is wrong with your setup. Specifically choosing a codec while the fallback is set to "get next best" does nothing: if your selected codec is unavailable, it'll choose the next best codec anyway, so that's not what is "fixing" your downloads. Most likely it's just the task reset that is fixing it. This can't be debugged without seeing the actual error in the logs.
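For anyone unsure how to cut a long log down to "just the errors", the Python sketch below shows one way to do it once the output of docker logs tubesync has been saved as text. The helper name `error_lines` is hypothetical, the `[tubesync/LEVEL]` tag pattern is taken from the log line quoted earlier in this thread, and the error and traceback lines in `SAMPLE` are made up purely for illustration.

```python
import re

# Matches error-level tags in the "[tubesync/LEVEL]" style used by the logs.
LEVEL_TAG = re.compile(r"\[tubesync/(ERROR|CRITICAL)\]")


def error_lines(log_text: str) -> list:
    """Keep error-level lines and the final exception line of each traceback."""
    kept = []
    in_traceback = False
    for line in log_text.splitlines():
        if LEVEL_TAG.search(line):
            kept.append(line)
        elif line.startswith("Traceback (most recent call last):"):
            in_traceback = True
        elif in_traceback and not line.startswith(" "):
            # The first unindented line after a traceback is the exception message.
            kept.append(line)
            in_traceback = False
    return kept


# The first line is quoted from this thread; the rest is illustrative only.
SAMPLE = """\
2023-11-29 14:47:44,614 [tubesync/INFO] Media: Download / FAzonLpYhRA has no published date, marking to be skipped
2023-01-01 00:00:00,000 [tubesync/ERROR] example error message
Traceback (most recent call last):
  File "example.py", line 1, in example
    raise ValueError("example")
ValueError: example
"""

for line in error_lines(SAMPLE):
    print(line)
```

On the sample, this keeps only the ERROR line and the closing `ValueError: example` line, which is usually the part worth pasting into an issue.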

mitchross commented 10 months ago

Yeah, I know how docker logs work. I am a developer myself. The reason I'm showing the screenshots is that they're the steps that work to fix the issue at times for me.

2023-12-08 13:59:08,111 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 13:59:08,292 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 13:59:08,465 [tubesync/WARNING] [youtube:tab] Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U. Giving up after 3 retries
2023-12-08 13:59:08,626 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,626 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 13:59:08,708 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,708 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 13:59:08,803 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,804 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 13:59:08,883 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 13:59:08,884 [tubesync/ERROR] ERROR: [youtube:tab] @mitchross2852: Unable to download API page: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
Rescheduling Index media from source "mitchross2852"
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/background_task/tasks.py", line 43, in bg_runner
    func(*args, **kwargs)
  File "/app/sync/tasks.py", line 168, in index_source_task
    videos = source.index_media()
             ^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/models.py", line 561, in index_media
    response = indexer(self.index_url)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/youtube.py", line 60, in get_media_info
    raise YouTubeError(f'Failed to extract_info for "{url}": No metadata was '
sync.youtube.YouTubeError: Failed to extract_info for "https://www.youtube.com/c/@mitchross2852/videos": No metadata was returned by youtube-dl, check for error messages in the logs above. This task will be retried later with an exponential backoff.
Rescheduling task Index media from source "mitchross2852" for 0:01:26 later at 2023-12-08 14:00:34.889700+00:00
Rescheduling Index media from source "whistlindiesel"
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/background_task/tasks.py", line 43, in bg_runner
    func(*args, **kwargs)
  File "/app/sync/tasks.py", line 168, in index_source_task
    videos = source.index_media()
             ^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/models.py", line 561, in index_media
    response = indexer(self.index_url)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/youtube.py", line 60, in get_media_info
    raise YouTubeError(f'Failed to extract_info for "{url}": No metadata was '
sync.youtube.YouTubeError: Failed to extract_info for "https://www.youtube.com/c/@whistlindiesel/videos": No metadata was returned by youtube-dl, check for error messages in the logs above. This task will be retried later with an exponential backoff.
Rescheduling task Index media from source "whistlindiesel" for 0:04:21 later at 2023-12-08 14:18:00.999037+00:00
2023-12-08 14:15:39,052 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,254 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,431 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 14:15:39,617 [tubesync/WARNING] [youtube:tab] Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U. Giving up after 3 retries
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 14:15:40,033 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:40,035 [tubesync/ERROR] ERROR: [youtube:tab] @mitchross2852: Unable to download API page: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
Rescheduling Index media from source "mitchross2852"
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/background_task/tasks.py", line 43, in bg_runner
    func(*args, **kwargs)
  File "/app/sync/tasks.py", line 168, in index_source_task
    videos = source.index_media()
             ^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/models.py", line 561, in index_media
    response = indexer(self.index_url)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/youtube.py", line 60, in get_media_info
    raise YouTubeError(f'Failed to extract_info for "{url}": No metadata was '
sync.youtube.YouTubeError: Failed to extract_info for "https://www.youtube.com/c/@mitchross2852/videos": No metadata was returned by youtube-dl, check for error messages in the logs above. This task will be retried later with an exponential backoff.
Rescheduling task Index media from source "mitchross2852" for 0:21:41 later at 2023-12-08 14:37:21.044601+00:00
Logs from 12/7/2023, 2:37:01 PM
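A side note on the "exponential backoff" mentioned in these logs: the three reschedule delays shown (0:01:26, 0:04:21 and 0:21:41) are consistent with a delay of attempts to the fourth power plus 5 seconds. That formula is, to my knowledge, what the background_task package in the traceback uses, but treat it as an assumption rather than something confirmed in this thread; the attempt counts 3, 4 and 6 are inferred from the delays, not logged.

```python
from datetime import timedelta


def backoff_delay(attempts: int) -> timedelta:
    """Delay before the next retry: attempts ** 4 + 5 seconds (assumed formula)."""
    return timedelta(seconds=attempts ** 4 + 5)


# Attempt counts inferred from the delays seen in the logs above.
for attempts in (3, 4, 6):
    print(attempts, backoff_delay(attempts))
# 3 0:01:26
# 4 0:04:21
# 6 0:21:41
```

If that reading is right, a source that keeps failing to index is retried less and less often, so a fix to the source itself (or a task reset) shows results sooner than waiting for the next retry.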

meeb commented 10 months ago

You could always try being less confrontational while getting free support for some open source software.

Anyway, thanks to the container logs your issue is nothing to do with codecs. The issue is you've added channels which aren't actually channels but handles or username aliases. Internally at YouTube some "channels" that are usernames aren't actually channels but are meta-groups of other channels and playlists (sometimes) or just aliases to internal channel IDs. This is evident because your usage of channel names 404s when TubeSync attempts to access them as channels (e.g. https://www.youtube.com/c/@whistlindiesel/videos returns a 404). While https://www.youtube.com/@whistlindiesel/videos (removing the /c/) works, this doesn't really seem to be a channel either but an alias to one.

The current suggested workaround is to use a tool like:

https://www.streamweasels.com/tools/youtube-channel-id-and-user-id-convertor/

to convert the @handles / @usernames into channel IDs, then "add channel by ID" in TubeSync using the ID. For example the ID for @whistlindiesel is UCdqp0KK_Io7TwK5cJMBvB0Q (https://www.youtube.com/channel/UCdqp0KK_Io7TwK5cJMBvB0Q).

Convert your @handle "channels" to IDs and add those; they should work fine regardless of your codec choice. While it is technically possible to locate the IDs automatically, and there is an experimental private branch with the feature, YouTube kept changing stuff while I was building it, so it's not stable or reliable enough to be useful at the moment.
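The difference between the URL shapes discussed above can be sketched in Python as follows. This is only an illustration: `source_url` is a hypothetical helper, not part of TubeSync, and the channel ID pattern ("UC" followed by 22 URL-safe characters) is an assumption based on the two IDs that appear in this thread. It does not look up the ID behind a handle; that still needs a conversion tool like the one linked above.

```python
import re

# Assumed shape of a channel ID, based on the IDs quoted in this thread.
CHANNEL_ID = re.compile(r"^UC[0-9A-Za-z_-]{22}$")


def source_url(key: str) -> str:
    """Build a /videos URL for a channel ID or an @handle."""
    if CHANNEL_ID.match(key):
        return f"https://www.youtube.com/channel/{key}/videos"
    if key.startswith("@"):
        # No /c/ prefix: the /c/@handle form is the one reported to 404.
        return f"https://www.youtube.com/{key}/videos"
    raise ValueError(f"Not a channel ID or @handle: {key!r}")


print(source_url("UCdqp0KK_Io7TwK5cJMBvB0Q"))
print(source_url("@whistlindiesel"))
```

The first call yields the /channel/ form that works with "add channel by ID"; the second yields the handle form without /c/, which loads in a browser but, as noted above, is an alias rather than a real channel.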

mitchross commented 10 months ago

You came at me hard with being confrontational. I gave it back to you.

I'll find another project that works out of the box without having to do all these workarounds.

meeb commented 10 months ago

I wasn't confrontational at all; I just asked for the information needed to give you assistance and was verbose when you supplied something other than what I asked for. I mean, I'm not entirely sure what else you are expecting here? You're welcome, I guess? The issue with channel IDs is YouTube itself changing, but sure, feel free to not use the free software and ignore the free assistance that will make it work for you.

goose-ws commented 10 months ago

While I'm the original author of this issue, I really don't have a stake in any of the comments below my own; I built my own tool to get around this issue for the time being. That said, as an uninvolved third party, @meeb does not appear confrontational in any of his responses. I mention this as a reality check to reconsider your stance of hostility; he comes across as just trying to be helpful.

meeb commented 10 months ago

Cheers, @goose-ws - I've created a stand-alone issue which tracks your original report better now it's confirmed what your original issue was caused by. Even if you don't use TubeSync further your report was useful, thanks. https://github.com/meeb/tubesync/issues/446 for reference.

goose-ws commented 9 months ago

Thanks, good to know @meeb. My tool is just a script with some hard coded variables and very few sanity checks, so I'll probably switch back to your more polished tool in the soon-ish future. Thanks for sharing your hard work with us all, and offering support too. Cheers.

fishnux commented 7 months ago

Coming from my experience on Tube-Archivist, I suspect this issue could be related to rate-limits, as changing the IP address would often allow the download to go through. Those affected by this issue should confirm this hypothesis.

As taken from the Docker logs of a user above, special attention to the 404 errors:

2023-12-08 14:15:39,052 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,254 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,431 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...
2023-12-08 14:15:39,617 [tubesync/WARNING] [youtube:tab] Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U. Giving up after 3 retries
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,812 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (1/3)...
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,884 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (2/3)...
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] YouTube said: ERROR - Requested entity was not found.
2023-12-08 14:15:39,956 [tubesync/WARNING] [youtube:tab] HTTP Error 404: Not Found. Retrying (3/3)...