home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Media Extractor fails on YouTube links #121679

Open lwsrbrts opened 1 month ago

lwsrbrts commented 1 month ago

The problem

In a nutshell, Media Extractor raises an error when it is fed a YouTube link. This is most likely related to a warning raised by yt-dlp that seems to be resolved in the latest version, 2024.7.9. HA currently pins 2024.07.01 as the required version.

What version of Home Assistant Core has the issue?

core-2024.7.2

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

media_extractor

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

From the Logs:

[youtube] oav5D6X0Ecs: nsig extraction failed: Some formats may be missing n = VFfWJyJN9QPW58hrb ; player = https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js
[youtube] oav5D6X0Ecs: nsig extraction failed: Some formats may be missing n = XDpRT24b54AU2WZ0E ; player = https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js
[youtube] pjVMnq-v4xs: nsig extraction failed: Some formats may be missing n = ZCR-3GwwmEpHNStD0 ; player = https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js
[youtube] pjVMnq-v4xs: nsig extraction failed: Some formats may be missing n = rQP2yYXlgGFgj_luy ; player = https://www.youtube.com/s/player/b22ef6e7/player_ias.vflset/en_US/base.js

Followed by:

Logger: homeassistant.helpers.script.websocket_api_script
Source: helpers/script.py:527
First occurred: 14:46:23 (2 occurrences)
Last logged: 14:47:58

websocket_api script: Error executing script. Unexpected error for call_service at pos 1: list index out of range
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 527, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 764, in _async_call_service_step
    response_data = await self._async_run_long_action(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 727, in _async_run_long_action
    return await long_task
           ^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2731, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2774, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/media_extractor/__init__.py", line 123, in extract_media_url
    url = get_best_stream_youtube(selected_media["formats"])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/media_extractor/__init__.py", line 315, in get_best_stream_youtube
    return get_best_stream(
           ^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/media_extractor/__init__.py", line 306, in get_best_stream
    return cast(str, formats[len(formats) - 1]["url"])
                     ~~~~~~~^^^^^^^^^^^^^^^^^^
IndexError: list index out of range


Additional information

Likely just needs a bump to the newest version.

I used an older version of yt-dlp (binary) and see the warning in its output but the download still worked.
I updated yt-dlp to the latest version 2024.7.9 and ran the same command again but didn't receive a warning so I _assume_ this issue was fixed in yt-dlp.

griogar commented 1 month ago

Also seeing this. HA debug log says it can't play the extracted URL, but nodered throws an explicit list index out of range error

lwsrbrts commented 1 month ago

Resolved in 2024.7.3.

lwsrbrts commented 1 month ago

Looks like this might need another yt-dlp version bump to 2024.7.25 as this issue has returned today(!) and the version released today resolves it also. 😔

home-assistant[bot] commented 1 month ago

Hey there @joostlek, mind taking a look at this issue as it has been labeled with an integration (media_extractor) you are listed as a code owner for? Thanks!

lwsrbrts commented 1 month ago

Given I'm stopped from playing anything from YouTube while the integration gets a version bump, I decided to modify my own manifest.json to get it working for me.

I know it won't survive an update and is frowned upon but I assume the next version released will see the bump to yt-dlp anyway.

For anyone interested, skilled enough, and running Home Assistant OS, here's what I did.

Be aware that the above will not survive a version upgrade to Home Assistant but, as I said, I expect this will be resolved in the next release anyway (didn't get it with 2024.7.4 from today so I needed to do this from my point of view).

I know this is frowned on etc. but for anyone stuck and needing a workaround now without waiting for the next version, this should get media_extractor.play_media working again with YouTube links.

griogar commented 1 month ago

The above doesn't seem to work with non HAOS docker installs. It downloads the version in /usr/local/lib/python3.12/site-packages/yt_dlp, but the behavior persists.

lwsrbrts commented 1 month ago

> The above doesn't seem to work with non HAOS docker installs. It downloads the version in /usr/local/lib/python3.12/site-packages/yt_dlp, but the behavior persists.

Yep, I only use Home Assistant OS hence my warning that you'd need to be using that for it to work.

That said... YouTube seems to be playing whack-a-mole, and now the 2024.07.25 version of yt-dlp has the same issue. Notably, yt-dlp itself still produces an output/download, so there is a fallback for that at least.

I do wish I knew enough about the code, since the problem seems to lie in how the format/file is retrieved from what yt-dlp produces and then handed off to the integration. formats[len(formats) - 1], or whatever the error says, just seems like it could do with something a bit more robust.

lwsrbrts commented 1 month ago

> The above doesn't seem to work with non HAOS docker installs. It downloads the version in /usr/local/lib/python3.12/site-packages/yt_dlp, but the behavior persists.

I'm not sure if custom_components works the same for your version of HA, but you could create your own custom component of media_extractor: take a copy of the media_extractor folder (or download the files in that folder from source), place the folder (called media_extractor) in your custom_components directory, modify the manifest.json and reload.

Of note, you must add a version number to the manifest.json when it's a custom component or it won't load. My manifest.json would look like this:

{
  "domain": "media_extractor",
  "name": "Media Extractor",
  "codeowners": ["@joostlek"],
  "config_flow": true,
  "dependencies": ["media_player"],
  "documentation": "https://www.home-assistant.io/integrations/media_extractor",
  "iot_class": "calculated",
  "loggers": ["yt_dlp"],
  "quality_scale": "internal",
  "requirements": ["yt-dlp==2024.07.25"],
  "single_config_entry": true,
  "version": "2024.7.25"
}

So you need a folder, custom_components\media_extractor, into which you copy (or download from source) the files making up the integration. Modify the manifest.json as above and restart HA.
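The manifest edit described above can also be scripted. This is a minimal sketch, assuming the manifest is plain JSON with a `requirements` list as in the example; the function names `pin_requirement` and `patch_manifest_file` are made up for illustration and are not part of Home Assistant.

```python
import json
from pathlib import Path


def pin_requirement(
    manifest: dict, package: str, package_version: str, component_version: str
) -> dict:
    """Return a copy of the manifest with `package` pinned to `package_version`.

    Also sets the top-level "version" key, which the thread notes is required
    for a custom component to load.
    """
    patched = dict(manifest)
    # Keep every requirement except an existing pin of the same package.
    others = [
        req
        for req in manifest.get("requirements", [])
        if not req.startswith(f"{package}==")
    ]
    patched["requirements"] = [*others, f"{package}=={package_version}"]
    patched["version"] = component_version
    return patched


def patch_manifest_file(
    path: Path, package: str, package_version: str, component_version: str
) -> None:
    """Rewrite a manifest.json on disk in place (hypothetical helper)."""
    manifest = json.loads(path.read_text(encoding="utf-8"))
    patched = pin_requirement(manifest, package, package_version, component_version)
    path.write_text(json.dumps(patched, indent=2) + "\n", encoding="utf-8")
```

For example, `patch_manifest_file(Path("custom_components/media_extractor/manifest.json"), "yt-dlp", "2024.07.25", "2024.7.25")` should produce a manifest like the one shown above, assuming that relative path matches your config directory. HA still needs a restart afterwards.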

griogar commented 1 month ago

I've been exploring the custom_component route, but without media extractor. This continuing whack-a-mole is exhausting.

joostlek commented 1 month ago

yt-dlp just released 2024.8.1; I'll bump that for the beta. I'll try to find some time during the beta to fix some issues. Anyone up to help me with this?

griogar commented 1 month ago

So 2024.08.01 works on my HA docker, 2024.07.25 does not. Is this a weekly whack-a-mole thing?

lwsrbrts commented 1 month ago

I certainly would, but my lack of experience with Python might be more of a hindrance. I can certainly help test.

Leaning heavily on ChatGPT today (don't judge), I was attempting to use the yt-dlp library directly (not using HA at all) to see whether I could get something despite the errors. Interestingly, although I was still getting the same nsig warnings, I could still download from YT successfully. If, however, I did list-formats, I was only shown results that were audio only or video only. I get the impression that this is the crux of the issue in HA/media extractor: there's no muxed audio/video stream available, which led me down another avenue...

I then wondered whether I could provide a query selector like bestaudio to media extractor (in HA), since it was only audio I was interested in. It didn't seem to matter and, truth be told, I'm not even sure the query selector had any effect; it simply generated the same errors.
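The audio-only/video-only observation can be checked programmatically. This is a rough sketch, assuming the format dictionaries follow yt-dlp's convention of setting `vcodec` or `acodec` to the string `"none"` when that track is absent; the helper names `classify_format` and `summarize_formats` are made up for illustration.

```python
from typing import Any


def classify_format(fmt: dict[str, Any]) -> str:
    """Classify one format dict as 'muxed', 'audio', 'video' or 'unknown'."""
    vcodec = fmt.get("vcodec")
    acodec = fmt.get("acodec")
    # Without both codec fields we cannot tell what the stream contains.
    if vcodec is None or acodec is None:
        return "unknown"
    has_video = vcodec != "none"
    has_audio = acodec != "none"
    if has_video and has_audio:
        return "muxed"
    if has_audio:
        return "audio"
    if has_video:
        return "video"
    return "unknown"


def summarize_formats(formats: list[dict[str, Any]]) -> dict[str, int]:
    """Count how many formats fall into each category."""
    counts = {"muxed": 0, "audio": 0, "video": 0, "unknown": 0}
    for fmt in formats:
        counts[classify_format(fmt)] += 1
    return counts
```

The list to feed in would come from something like `yt_dlp.YoutubeDL().extract_info(url, download=False)["formats"]`. A result with `"muxed": 0` would match what I was seeing from list-formats.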

I did bodge some Python together (err, thanks ChatGPT?) to see if there was anything in the formats list and, based on my testing, there wasn't. Again, this didn't matter whether I tried providing bestaudio/best or any other selector I tried.

def get_best_stream(formats: list[dict[str, Any]]) -> str:
    """Return the best quality stream.

    As per
    https://github.com/yt-dlp/yt-dlp/blob/master/yt_dlp/extractor/common.py#L128.
    """
    # Original line, kept for reference:
    # return cast(str, formats[len(formats) - 1]["url"])

    # Guard against an empty list before indexing into it.
    if not formats:
        _LOGGER.error("Formats list is empty. No streams available.")
        raise IndexError("Formats list is empty. No streams available.")

    try:
        # Same element as formats[len(formats) - 1]: the last entry in the list.
        return cast(str, formats[-1]["url"])
    except IndexError as err:
        _LOGGER.error("Error accessing formats list: %s", err)
        _LOGGER.error("Available formats: %s", formats)
        # MEQueryException comes from the media_extractor integration itself.
        raise MEQueryException from err

joostlek commented 1 month ago

Maybe we have to revisit the way selectors are used, because they do not do what we expect them to do.

Last week a really good DJ set got livestreamed, so I now want to download it to use with Music Assistant, so now I have motivation to fix this 😂

griogar commented 1 month ago

Same as @lwsrbrts. Would love to help, but I'm more a node/TS dev and wouldn't be much help on Python.

lwsrbrts commented 1 month ago

> So 2024.08.01 works on my HA docker, 2024.07.25 does not. Is this a weekly whack-a-mole thing?

Probably just related to YT releasing new player code that causes yt-dlp to not be able to interact with it or read what it's putting out.

https://github.com/yt-dlp/yt-dlp/issues/10608
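The player build shows up in the nsig warnings quoted in this thread (the `/s/player/<hash>/` part of the URL), so grouping failures by that hash is one way to see whether a new player release lines up with the breakage. This is a sketch only; the regex is written against the log lines posted here and the name `failures_by_player` is made up.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [youtube] oav5D6X0Ecs: nsig extraction failed: ... player = https://www.youtube.com/s/player/b22ef6e7/...
NSIG_RE = re.compile(
    r"\[youtube\] (?P<video>[\w-]+): nsig extraction failed:"
    r".*?/s/player/(?P<player>[0-9a-f]+)/"
)


def failures_by_player(lines: list[str]) -> dict[str, set[str]]:
    """Map each player build hash to the video IDs that failed against it."""
    result: dict[str, set[str]] = defaultdict(set)
    for line in lines:
        match = NSIG_RE.search(line)
        if match:
            result[match["player"]].add(match["video"])
    return dict(result)
```

Lines that are not nsig warnings are skipped, so a whole log file can be passed in line by line.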

lwsrbrts commented 1 month ago

The whack-a-mole continues. Seems like YouTube are updating their code every week and causing yt-dlp to have to do the same. Now at version 2024.8.6 🙄

joostlek commented 1 month ago

Bumped that version for the beta

lwsrbrts commented 1 month ago

All I can say is it's a good job you moved it over to yt-dlp from youtube-dl when you did! Thanks for doing that.

joostlek commented 1 month ago

Oh I did not, that happened before I worked on home assistant :)

lwsrbrts commented 1 month ago

I think we can close this now as it seems like it'll continue to raise its head...unless you want to keep it around as a reminder?

griogar commented 1 month ago

I've resorted to a shell script and cronjob that checks the repo, updates the manifest and restarts HA. 😀
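The "does the manifest need a bump" decision in a script like that can be sketched as follows. This is an illustration, not the actual script: it assumes yt-dlp versions are dot-separated integers, which covers both spellings seen in this thread (2024.07.25 and 2024.7.25), and the helper names are made up. Fetching the latest release tag and restarting HA are left out.

```python
from typing import Optional


def parse_calver(version: str) -> tuple[int, ...]:
    """Turn '2024.07.25' or '2024.7.25' into (2024, 7, 25)."""
    return tuple(int(part) for part in version.split("."))


def pinned_version(manifest: dict, package: str) -> Optional[str]:
    """Return the version pinned for `package` in the manifest, if any."""
    prefix = f"{package}=="
    for req in manifest.get("requirements", []):
        if req.startswith(prefix):
            return req[len(prefix):]
    return None


def needs_bump(manifest: dict, package: str, latest: str) -> bool:
    """True when `latest` is newer than the version pinned in the manifest."""
    current = pinned_version(manifest, package)
    if current is None:
        return False
    # Compare numerically so zero-padded and unpadded versions are equal.
    return parse_calver(latest) > parse_calver(current)
```

Comparing integer tuples rather than raw strings avoids treating 2024.7.25 and 2024.07.25 as different releases.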

skewll commented 3 weeks ago

Neither of the services work for me. Both throw an error at the same place. Missing formats.

2024-08-17 21:30:51.704 DEBUG (SyncWorker_46) [homeassistant.components.media_extractor] [youtube] dQw4w9WgXcQ: Downloading ios player API JSON
2024-08-17 21:30:51.887 DEBUG (SyncWorker_46) [homeassistant.components.media_extractor] [youtube] dQw4w9WgXcQ: Downloading player 53afa3ce
2024-08-17 21:30:52.834 WARNING (SyncWorker_46) [homeassistant.components.media_extractor] [youtube] dQw4w9WgXcQ: nsig extraction failed: Some formats may be missing n = d_UmP0B6r-dQ87SkL ; player = https://www.youtube.com/s/player/53afa3ce/player_ias.vflset/en_US/base.js
2024-08-17 21:30:52.844 WARNING (SyncWorker_46) [homeassistant.components.media_extractor] [youtube] dQw4w9WgXcQ: nsig extraction failed: Some formats may be missing n = lFOn2HRQnNQEDr9mO ; player = https://www.youtube.com/s/player/53afa3ce/player_ias.vflset/en_US/base.js
2024-08-17 21:30:53.569 DEBUG (SyncWorker_46) [homeassistant.components.media_extractor] [youtube] dQw4w9WgXcQ: Downloading m3u8 information

lwsrbrts commented 3 weeks ago

I've just tried it on Home Assistant 2024.8.2 and it is working as expected for me.

You could try to see if it works with the executable version of yt-dlp as well, ensuring you're using the latest version.

What does your service call YAML look like (URL included)?

skewll commented 2 weeks ago

service: media_extractor.extract_media_url
data:
  url: https://www.youtube.com/watch?v=dQw4w9WgXcQ
  format_query: best

skewll commented 2 weeks ago

Upgraded to the latest stable Docker version and can see the new 'ACTIONS' title in dev tools, and it's all working again for me. Thanks a ton!
