OpenVoiceOS / ovos-ocp-audio-plugin

Apache License 2.0
4 stars 6 forks source link

Refactor SEI to use pure URI #114

Closed NeonDaniel closed 5 months ago

NeonDaniel commented 6 months ago

The Stream Extractor ID prefix on media URIs looks like and has previously been confused as being a URI scheme.

I would propose updating this spec to simply use a URI scheme so we pass standard URIs around. This behavior is standard for linking content to a particular application or identifying the type of content (Windows examples).

To migrate to using URIs, we would simply need to parse the scheme:

scheme, path = uri.split(':', 1))
if len(scheme.split('//') > 1):
    log_deprecation("Update plugin to pass URI without SEI prefix", "0.1.0")
    scheme = scheme.split('//')[0]
    uri = f"{scheme}://{path}"
JarbasAl commented 6 months ago

this was a conscious decision to make it different from a URI, since a real uri comes after the SEI

today i would probably implement it differently, but when it started we needed compat with mycroft and could not easily introduce new fields etc

for now this needs to stay as many many skills in the wild depend on it, was one of the early features of OCP (the purpose being moving the extracting to playback time, not search time, greatly improving search time)

if it changes in the future, its most likely to become a new field in message.data (also allowing for a slow migration)

NeonDaniel commented 6 months ago

for now this needs to stay as many many skills in the wild depend on it, was one of the early features of OCP (the purpose being moving the extracting to playback time, not search time, greatly improving search time)

I suggested parsing to maintain backwards-compat. so an old skill or plugin would have time to update to pass a regular URI.

if it changes in the future, its most likely to become a new field in message.data (also allowing for a slow migration)

Why add a separate field when there's an existing spec to pass that information in the URI?

My reasoning is that if we're just passing a reference to some media, a URI is universal by definition and we should use established standards. Adding custom parsing to file references makes it less clear what is happening and harder to integrate with other modules on the bus. For example, if we have a youtube:// URI, that can be passed directly to a client device like a phone to open the link in the user's preferred app.

JarbasAl commented 6 months ago

those are different concepts and not really equivalent, these are internal OCP ids, that identify how to get a real stream, those are not meant to make sense outside of OCP, and not mean to be universal resource identifiers. a SEI is tied to a specific plugin running in OCP

they dont identify a resource, they tell OCP "you must send this to a plugin before playback to get the URI", they are not valid uris and not meant to be passed to client devices like phones...

I expect if i hit a URI i get always the same thing back, it is a UNIQUE identifier, if you get a SEI, each time you run it you might get a different thing to play (eg. the latest stream in a RSS feed)

these are not file references, those are instructions how to find the real file when using the specific piece of code running in OCP, which is not expected to be available in clients by definition, this is internal and not meant for external consumption. SEIs are meant to be handled before anything makes it to the client, the client gets a real regular URI for playback (a client usually being the audio subsystem, but can be any client app like hivemind)

in short, a SEI is only meant to exist between skill result and OCP receiving the result, once anything is asked to do the actual playback by OVOS then a real URI is used

NeonDaniel commented 6 months ago

they dont identify a resource, they tell OCP "you must send this to a plugin before playback to get the URI", they are not valid uris and not meant to be passed to client devices like phones...

These are used in a uri field.

I expect if i hit a URI i get always the same thing back, it is a UNIQUE identifier, if you get a SEI, each time you run it you might get a different thing to play (eg. the latest stream in a RSS feed)

A Uniform Resource Identifier is simply a standard scheme for identifying something. I would fully expect a URI for a feed to return the latest thing at the time I requested it.

these are not file references, those are instructions how to find the real file when using the specific piece of code running in OCP, which is not expected to be available in clients by definition, this is internal and not meant for external consumption. SEIs are meant to be handled before anything makes it to the client, the client gets a real regular URI for playback (a client usually being the audio subsystem, but can be any client app like hivemind)

A skill doesn't know if a stream extractor will be required, only the client knows if it can play a youtube link directly. One client may want a youtube:// link to handle, while another might want that extracted stream to play directly. I think a client specifying supported URIs makes more sense than SEIs here since URI is an established standard that the client already knows about, without any context of what OCP is.

in short, a SEI is only meant to exist between skill result and OCP receiving the result, once anything is asked to do the actual playback by OVOS then a real URI is used

I'm proposing a change here then so that whatever is performing playback can decide if a URI needs to be passed through an stream extractor, or if the result can be played back directly.

JarbasAl commented 6 months ago

A skill doesn't know if a stream extractor will be required, only the client knows if it can play a youtube link directly.

the skill explicitly knows a stream extractor is required, a skill only sends a SEI to request a specific plugin to be used because the skill knows it is not sending a valid uri. the whole point of SEI is for skills to require a OCP plugin. otherwise the skill would send a regular uri out of the box (like most do)

i do want to allow SEI to come from Session, so clients can indicate which OCP plugins they are running natively, clients are free to re-implement the SEI plugins (eg, if in a different programming language), but need to follow the OCP standard.

from ovos-core side, results are filtered if a SEI can not be handled, they never reach the client

NeonDaniel commented 6 months ago

the skill explicitly knows a stream extractor is required, a skill only sends a SEI to request a specific plugin to be used because the skill knows it is not sending a valid uri. the whole point of SEI is for skills to require a OCP plugin. otherwise the skill would send a regular uri out of the box (like most do)

When would a skill return something for an SEI that doesn't contain a valid URI or the data to build one? In the case of youtube, the skill doesn't know if the playback requires a plugin to play a video on youtube or if the client can handle it directly.

i do want to allow SEI to come from Session, so clients can indicate which OCP plugins they are running natively, clients are free to re-implement the SEI plugins (eg, if in a different programming language), but need to follow the OCP standard.

In chis case, I'm arguing that the client should be advertising a URI scheme it supports, rather than asking them to know about an OCP standard that ostensibly serves the same purpose.

JarbasAl commented 6 months ago

When would a skill return something for an SEI that doesn't contain a valid URI or the data to build one? In the case of youtube, the skill doesn't know if the playback requires a plugin to play a video on youtube or if the client can handle it directly.

a skill can return sdgshhd//knshodvbaluinçg for all we care, sdgshhd specifies a required plugin to make sense of this (how the plugin works isnt necessarily open or known in advance)

what comes after a SEI is not meant to be intrepertable by anything other than the targeted OCP plugin

as for the plugin example, the skill knows it is reporting a audio stream and not sending a audio url, the plugin needs to extract the audio url

if you make a youtube skill that returns raw urls and uses the PlaybackType.WEBVIEW then thats the use case you are talking about

In chis case, I'm arguing that the client should be advertising a URI scheme it supports, rather than asking them to know about an OCP standard that ostensibly serves the same purpose.

running the OCP extractors in a client is currently unsupported, but even once supported it is requiring the client to be loading the specific OCP extractor, or reimplement it, thats up to the client and not OVOS concern. If the client loads the OCP plugins the client doesnt need to know about what a SEI is at all

it isnt a uri, its just a encoded payload for arbitrary code, it is a OCP specific standard, a client may be compatible with this or not, but that doesnt make it a real uri, it is not universal but it is tied to a OCP plugin, and only the OCP plugin is supposed to make sense of it.

by moving SEI to Session as mentioned, the client just reports what ocp extractor plugins they support , a client is free to not load a OCP plugin and implement it's own parsing ofc, as long as it tells OVOS it is running the required plugin

JarbasAl commented 6 months ago

I'm not discounting the confusion to be clear, i'm just arguing my earlier suggestion of moving it to it's own section in the payload

ie, instead of arguing that it should be made into a proper uri, my opinion is it should stop lying about being one

{
"title": "XXX",
"uri": "scheme:a_real_uri",
"required_plugin": "sei"
(....)
}

something like this, but needs more thinking, as sei payloads are not necessarily uris....

one option is to do like playlists, which dont use the "uri" key at all, and introduce a new appropriately named key for the cases using a stream extractor, effectively allowing 3 kinds of results to be returned (Media, Playlist, PluginPayload)

NeonDaniel commented 6 months ago

if you make a youtube skill that returns raw urls and uses the PlaybackType.WEBVIEW then thats the use case you are talking about

Kind of; I'm saying that a single youtube skill should be able to return youtube:// links and let the client decide how to play that back. A client with a youtube app (or multiple apps supporting youtube) could pick based on the user's preference. Another example would be Plex where the user may choose to use either a Plex or Plexamp app on a particular client.

something like this, but needs more thinking, as sei payloads are not necessarily uris....

The sei payload could be defined as a URI; something like sei://plugin/payload or sei://plugin?payload? That would allow us to continue passing URIs as we currently are and make everything in that field a URI.

one option is to do like playlists, which dont use the "uri" key at all, and introduce a new appropriately named key for the cases using a stream extractor, effectively allowing 3 kinds of results to be returned (Media, Playlist, PluginPayload)

This would work too; my main concern is about things like Youtube, where some clients may want or need to use a stream extractor plugin, but others could handle the media natively. If we can use a URI for everything, then the skill can not care in many cases and we leave it to OCP/Media service to decide how a selected result is presented to the client for playback.

For example, a client could specify that it supports the youtube scheme, so youtube://<video_id> could be returned to the client to play back. A different client that doesn't support youtube URIs could get whatever the stream extractor plugin produces.

JarbasAl commented 6 months ago

the fact we cant agree here makes it clear we should move away from the uri IMHO, please see the open PRs linked, that contain only enough to update skills

ovos-media/old-OCP dont handle the new class yet so it is converted to the old style when emitting in the OCPInterface

if a client isnt using OVOS code, its not OVOS concern, up to the client to follow our bus message spec, namely the OCPInterface and ClassicAudioServiceInterface

(utils needs to go first, then bus client pinning that utils version, then core pinning that bus client...)

JarbasAl commented 5 months ago

closing for now, a few PRs started addressing this already, further improvements should be handled together with ovos-media after 0.0.8