WebThingsIO / schemas

A Web of Things schema repository
https://webthings.io/schemas/

MediaPlayer Capability Schema #34

Open benfrancis opened 5 years ago

rzr commented 5 years ago

IIRC ocf has a model for player, playlist and such:

https://www.oneiota.org/documents?q=media

jaller94 commented 5 years ago

The Kodi project has a very long list of values and actions which can be called via HTTP and WebSocket using JSON-RPC. https://kodi.wiki/view/JSON-RPC_API/v9#Player_2

I think it’s another good source of inspiration.

jaller94 commented 5 years ago

A MediaPlayer should feature the basic interactions of:

Possibly optional(?):

Furthermore, MediaPlayers are usually devices which you wouldn’t leave on while they are not in use. Will IoT MediaPlayers have to manage this themselves, as a spec could expect them to be available at all times?

mrstegeman commented 4 years ago

From #32 (now closed):

MediaPlayer

A device that displays media.

Properties

| Type | Read-Only | Required |
| --- | --- | --- |
| VolumeProperty | No | Yes |
| PlayingProperty | Possibly, if the thing does not support pausing/resuming | Yes |
| ShuffleProperty | No | No |
| RepeatProperty | No | No |
| ProgressProperty | Yes | No |
| TrackProperty | Yes | No |
| ArtistProperty | Yes | No |
| AlbumProperty | Yes | No |
| PreviewProperty | Yes | No |
| MutedProperty | No | No |

Actions

| Type | Required |
| --- | --- |
| StopAction | No |
| PreviousAction | No |
| NextAction | No |

VolumeProperty

type integer
unit percent

PlayingProperty

type boolean

ShuffleProperty

type boolean

RepeatProperty

type string
enum Off,One,All

ProgressProperty

type integer
unit seconds
maximum Seconds in the current Track

TrackProperty

type string

ArtistProperty

type string

AlbumProperty

type string

PreviewProperty

type null/undefined

Note: The primitive type of a PreviewProperty is null or undefined (omitted), but it must provide a mediaType in one or more link relations which link to binary representations of the preview property resource.

MutedProperty

type boolean

StopAction

Stops current playback on the thing.

input

None.

PreviousAction

Plays the previous item in the playback queue.

input

None.

NextAction

Plays the next item in the playback queue.

input

None.
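To make the proposal above easier to picture, here is a minimal sketch of how part of it might look as a Thing Description, written as a Python dictionary. The property keys (`volume`, `playing`, and so on), the 0 to 100 bounds on volume, and the helper `missing_required` are assumptions made for illustration; `MediaPlayer` and its property types are only proposed in this thread and are not part of a published schema.

```python
# Hypothetical Thing Description for the proposed MediaPlayer capability.
# Property keys and the 0-100 volume bounds are illustrative assumptions.
def media_player_description(title: str) -> dict:
    return {
        "@context": "https://webthings.io/schemas/",
        "@type": ["MediaPlayer"],
        "title": title,
        "properties": {
            "volume": {
                "@type": "VolumeProperty",
                "type": "integer",
                "unit": "percent",
                "minimum": 0,
                "maximum": 100,
            },
            "playing": {"@type": "PlayingProperty", "type": "boolean"},
            "repeat": {
                "@type": "RepeatProperty",
                "type": "string",
                "enum": ["Off", "One", "All"],
            },
            "progress": {
                "@type": "ProgressProperty",
                "type": "integer",
                "unit": "seconds",
                "readOnly": True,
            },
        },
        "actions": {
            "stop": {"@type": "StopAction"},
            "previous": {"@type": "PreviousAction"},
            "next": {"@type": "NextAction"},
        },
    }


# The proposal marks only these two property types as required.
REQUIRED_PROPERTY_TYPES = {"VolumeProperty", "PlayingProperty"}


def missing_required(description: dict) -> set:
    """Return the required property types that the description lacks."""
    present = {prop.get("@type") for prop in description["properties"].values()}
    return REQUIRED_PROPERTY_TYPES - present
```

A description that omits `volume` would be reported as missing `VolumeProperty`, which is one simple way a client could decide whether to treat a thing as a MediaPlayer at all.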

tim-hellhake commented 3 years ago

Can we merge this PR? It looks pretty complete. I would love to add it to my addons.

mrstegeman commented 3 years ago

I'm happy enough with the PR (given that I worked on it, #32), but this one really should have a proper UI design.

@benfrancis any thoughts?

kgiori commented 3 years ago

For reference, I'm pasting screen shots from my use of the Sonos add-on. I normally leave it configured to stream from an Internet source (Radio Swiss Jazz), so the only two properties I actuate are "Playing" and "Volume". The biggest disadvantage of the Sonos add-on is the lack of a media player capability that lets me set "Playing" (or not) from the main icon. (As a workaround I created a virtual dimmable switch and tied it to this one using rules.) When this MediaPlayer Capability is available, will it be applied to add-ons like the Sonos one, and if so, would I be able to click the main icon to switch between playing/not playing? That would make me happy. :)

Another useful configuration option that @tim-hellhake added was to by default not show some of the properties so that the more important ones would fit (otherwise on smaller screens they'd overlap and you simply couldn't use them, whether normal size or zooming in or out).

[Two screenshots of the Sonos add-on, taken 2021-01-07]

tim-hellhake commented 3 years ago

@benfrancis Any news on this? :grimacing:

benfrancis commented 3 years ago

The UI design for this has been on my queue for a while and I would like to get it done, perhaps we could even target the 1.1 release at the end of March? The reason it has taken me so long is that this is the most complex schema we've added.

Trying to design the UI has raised some questions for me about this schema:

  1. What happens if play and pause are not currently available because no piece of media is currently loaded up? We could have a read only state of the UI element representing the PlayingProperty, but readOnly is a static attribute of a property which can't currently be changed dynamically.
  2. How do you tell a media player device what to play? Should there be a play URL action as in the Sonos add-on? That probably wouldn't be possible with most devices though.
  3. The maximum attribute of a property is also a static value which can't be set dynamically based on the length of a piece of media. How do we set the maximum for ProgressProperty?
  4. What about videos? Is episode name or movie name equivalent to TrackProperty? Is series name equivalent to AlbumProperty? What about what season you're on? Is director equivalent to ArtistProperty? I can imagine many variations and extensions to this schema.
  5. What examples are there of devices this schema might represent? Chromecast is an interesting example where there's the concept of an "app" which is currently being used by the media player, but no obvious way to tell it what media to play programmatically (which would have to be specified in a separate app). What other examples are there? I can imagine many variations on this schema.
  6. I like the simplicity of RepeatProperty being a boolean, but in some applications there's a distinction between "repeat one" and "repeat all". Should RepeatProperty be a string with an enum instead, or should we just stick to boolean?
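Question 3 can be illustrated with a small sketch. It assumes a client that validates writes against the fixed `minimum` and `maximum` in a property description; the helper name `within_static_bounds` and the example numbers (a 3600 second ceiling, a 180 second track) are made up for illustration.

```python
def within_static_bounds(value: int, prop: dict) -> bool:
    """Check a value against the fixed minimum/maximum of a property description."""
    low = prop.get("minimum", float("-inf"))
    high = prop.get("maximum", float("inf"))
    return low <= value <= high


# A ProgressProperty in seconds, with a maximum chosen once, up front.
progress = {
    "@type": "ProgressProperty",
    "type": "integer",
    "unit": "seconds",
    "minimum": 0,
    "maximum": 3600,  # assumed static ceiling of one hour
}

track_length = 180  # the track that happens to be loaded right now
requested = 600     # a seek request from the user

# The static check accepts the request even though it is past the end of
# the current track, because the description knows nothing about the track.
accepted = within_static_bounds(requested, progress)
```

Whatever static ceiling is chosen, it is either too small for long media or too large to catch out-of-range requests for short media, which is the gap the question points at.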

I have posted a rough first UI design in https://github.com/WebThingsIO/gateway/issues/1628 which demonstrates the problem of having so many properties and actions in one schema, with a possible solution. There are some issues with this approach though, which should be discussed until we're sure which issues should be solved in the schema/API implementation and which issues should be solved in the UI.

Thoughts?

flatsiedatsie commented 3 years ago

You mentioned this capability might be used for bi-directional communication, such as through a video doorbell? I don't really see any audio recording abilities though?

More on topic: for playing radio it might be nice to have a station property?

benfrancis commented 3 years ago

@flatsiedatsie wrote:

You mentioned this capability might be used for bi-directional communication, such as through a video doorbell? I don't really see any audio recording abilities though?

This is kind of related to my comment about "How do you tell a media player device what to play?". I wonder if it could be possible to tell the media player to play a video/audio stream from a URL, which might be one way of approaching that, but is a bit clunky.

Or maybe there should be a separate tannoy/speaker capability specifically for streaming audio to a device like a bluetooth speaker (or doorbell/camera talkback feature). Note that this would be streaming, not recording (i.e. saving to disk).

More on topic: for playing radio it might be nice to have a station property?

This has always been the problem with this capability, there are just so many variations on what could be considered a media player. Where do you stop? FM/AM radio? DAB radio? MP3 player? Streaming stick? DVD/BluRay player? Cable/satellite/terrestrial/IPTV set-top box? DVR? Smart TV? and so on.

Maybe what we need to do is pick the lowest common denominator features among all of these different types of media player and make that the MediaPlayer capability, then create various additional (composable) capabilities which are specialisations of MediaPlayer, in the same way we do for specialisations of BinarySensor.

For example, maybe TrackProperty, ArtistProperty and AlbumProperty should be part of an AudioPlayer capability which is a specialisation of MediaPlayer. ShuffleProperty and RepeatProperty are also arguably fairly specific to audio players. Some of the media players above might have channel up/channel down rather than next and previous.
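As a rough sketch of the composition idea, a thing could list several capabilities in `@type`, and a client could look for the specialised properties only when the specialised capability is present. `AudioPlayer` is only a suggestion in this thread, and the helper names and property keys below are hypothetical.

```python
# Property types that would move to the suggested AudioPlayer specialisation.
AUDIO_PLAYER_PROPERTY_TYPES = {"TrackProperty", "ArtistProperty", "AlbumProperty"}


def has_capability(description: dict, capability: str) -> bool:
    """True if the thing lists the capability in its @type array."""
    return capability in description.get("@type", [])


def audio_metadata_properties(description: dict) -> dict:
    """Return the audio-specific properties, or {} for a plain MediaPlayer."""
    if not has_capability(description, "AudioPlayer"):
        return {}
    return {
        name: prop
        for name, prop in description.get("properties", {}).items()
        if prop.get("@type") in AUDIO_PLAYER_PROPERTY_TYPES
    }


# A set-top box exposes only the common denominator.
set_top_box = {
    "@type": ["MediaPlayer"],
    "properties": {"playing": {"@type": "PlayingProperty", "type": "boolean"}},
}

# An MP3 player composes the base capability with the specialisation.
mp3_player = {
    "@type": ["MediaPlayer", "AudioPlayer"],
    "properties": {
        "playing": {"@type": "PlayingProperty", "type": "boolean"},
        "track": {"@type": "TrackProperty", "type": "string"},
        "artist": {"@type": "ArtistProperty", "type": "string"},
    },
}
```

Both things can still be driven by a client that only understands `MediaPlayer`; the extra capability just tells richer clients what else to look for.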

There could be a MediaRecorder capability which adds a RecordAction. Fast-forward and rewind are also missing, though they could just be UI abstractions on top of ProgressProperty.
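The remark that fast-forward and rewind could be UI abstractions on top of ProgressProperty might look like the sketch below. It assumes progress is expressed in seconds, as in the earlier proposal, and that the client knows the track length from somewhere; the function name `seek` is hypothetical.

```python
def seek(progress_seconds: int, delta_seconds: int, track_length_seconds: int) -> int:
    """Return the new progress value after skipping forwards or backwards.

    The result is clamped so that it never falls before the start
    or beyond the end of the track.
    """
    target = progress_seconds + delta_seconds
    return max(0, min(target, track_length_seconds))
```

A fast-forward button would then write `seek(current, 30, length)` to the property and a rewind button `seek(current, -30, length)`, with no dedicated actions needed in the schema.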

What is a reasonable lowest common denominator for a media player (bearing in mind that some affordances can be optional)?

flatsiedatsie commented 3 years ago

Where do you stop? FM/AM radio? DAB radio? MP3 player? Streaming stick? DVD/BluRay player? Cable/satellite/terrestrial/IPTV set-top box? DVR? Smart TV? and so on.

It seems to me all those things are covered pretty well by the existing spec. Save for the radio needing to say something about which stream/station is being played. So I'd stop after adding a property that denotes the string name of the stream/station/file that is playing.

Alternatively, why have artist and album as properties at all, and not just a generic 'info string'? That can show all kinds of information about what is being played (e.g. artist & album, film title, CCTV footage camera name), by all kinds of hardware?

I agree that using a mediaplayer capability for bi-directional communication of baby monitors, intercoms or smart doorbells is very clunky. It might make more sense to add that to videoCamera, or add it as a separate capability altogether.

flatsiedatsie commented 3 years ago

Looking at that Sonos screenshot, and this thermostat screenshot, I'd also say it's high time to move away from the octopus design for the UI. But that's another story.

benfrancis commented 3 years ago

It seems to me all those things are covered pretty well by the existing spec. Save for the radio needing to say something about which stream/station is being played.

I agree the basics are there, though there are obviously missing features for some devices (like how to change the radio station/TV channel and all the other kinds of features you typically find on a TV/Hi-Fi remote control), and features which don't apply to all devices (like shuffle and repeat).

Regarding media metadata, Chromecast's MediaMetadata object provides some prior art here which attempts to cover various metadata for music, TV, movies and eBooks. Also schema.org has a whole collection of MediaObject schemas which address this area including AudioObject, VideoObject, Audiobook, Episode, Movie, MovieSeries, RadioSeries, TVSeries, MusicVideoObject...

Alternatively, why have artist and album as properties at all, and not just a generic 'info string'?

That is certainly one solution to the problem, but could be quite limiting. What do other people think? Is it better to have a single generic string, or keep media metadata out of the schema altogether?

My instinct here is to remove TrackProperty, ArtistProperty, AlbumProperty and PreviewProperty from the schema because they are really metadata about the media being played, rather than the media player device itself.

This would answer my question 4, but probably at least questions 1-3 still need an answer before we can land this schema.

If there's no generic answer to question 2 (How do you tell a media player device what to play?) that's fine, but the schema will only be useful for controlling a media player which is already playing back some media, having initiated playback from some other physical control interface or software application. This is similar to how you can use a TV remote to control playback on a TV streaming stick like a Chromecast via CEC over HDMI, but have to use a separate app to select what to play.

kgiori commented 3 years ago

My favorite way to select what to stream is how the Internet Radio add-on does it. It has example free media streams and you can configure more of your own, delete what is there, etc. The biggest hurdle for users will be figuring out the exact URLs to add. But I suppose a wiki page could be maintained, or an add-on could hyperlink to a list of known and popular audio streams. Or those aspects could be configured via a "Faceplate" UI as described in some other thread by madbilly.

The Sonos play URI action is better than nothing, but I don't have the URLs handy that I need to insert if I want to change the stream, so I use the Sonos app for changing the input stream instead. I wonder if a rule could be created to feed input streams into that play URI action... and if so, what that input thing would be.

flatsiedatsie commented 3 years ago

A "search input" property might make sense?

But then you get people like me who would like to select a file to play. Some people may want to play a URL.

So perhaps the bigger question is: is this stuff intended to be a UI, or an API for a real UI? If it's supposed to describe all possible UI components, where does it stop (a sentiment Ben expressed earlier). Add a playlist property?

If it should just describe the current state of a separate device that is playing music, then you get closer to the current spec.

So in my view a UI extension addon would be the media player, and this addon may have a thing representing it which can be used to hook things into. E.g. create rules, do voice queries, or allow a remote control device to control the player.

For example, I've been meaning to add a real UI to the internet radio addon, where users can search for radio stations, and keep their favourites.

Should RepeatProperty be a string with an enum instead, or should we just stick to boolean?

Make it an enum and add random as an option?

kgiori commented 3 years ago

So in my view a UI extension addon would be the media player, and this addon may have a thing representing it which can be used to hook things into. E.g. create rules, do voice queries, or allow a remote control device to control the player.

I like the idea that I would be able to do more with my entertainment things. Right now I play or not, and change the volume. But if I could set the source easily, via rules, voice, buttons, etc., that would be cool.

benfrancis commented 3 years ago

I know it's hard not to conflate the two, but this issue is just about the schema design at the API level which could theoretically be used by any Web of Things implementation. It should not be specific to WebThings or WebThings Gateway (like the schemas at schema.org and iotschema.org).

https://github.com/WebThingsIO/gateway/issues/1628 is about an implementation of this schema in WebThings Gateway and what the default user interface should be for a web thing which uses the MediaPlayer schema and doesn't provide its own UI (as most web things currently don't).

I've drafted a tweaked proposal below which I've tried to cut down to the lowest common denominator for all media players, and answer the questions I asked above. This continues to be tricky and I'm still not completely happy with this.

MediaPlayer

A device that plays audio or video.

Properties

| Type | Read-Only | Required |
| --- | --- | --- |
| VolumeProperty | No | Yes |
| PlayingProperty | No | Yes |
| ProgressProperty | No | No |
| MutedProperty | No | No |
| PlayingMediaTitleProperty | No | No |
| PlayingMediaLengthProperty | No | No |

Note: PlayingProperty may be read-only if a device does not support pausing/resuming playback. ProgressProperty may be read-only if a device does not support skipping to a different point in media playback.

Actions

| Type | Required |
| --- | --- |
| StopAction | No |
| PreviousAction | No |
| NextAction | No |

VolumeProperty

The current volume of audio being played.

type number
unit percent

PlayingProperty

Whether or not audio or video media is currently being played.

type boolean

Note: If no media title is currently loaded, PlayingProperty may be set to null to indicate that playback can currently neither be paused nor resumed.

ProgressProperty

The progress towards completion of a long-running task.

type number
unit percent

PlayingMediaTitleProperty

The title of a piece of media currently being played by a media player. (e.g. the title and artist of a song or the title of a movie).

type string

PlayingMediaLengthProperty

The total length of a piece of media currently being played by a media player.

type integer
unit seconds

MutedProperty

type boolean

StopAction

Stops playback of media.

input

None.

PreviousAction

Plays the previous item in the playback queue.

input

None.

NextAction

Plays the next item in the playback queue.

input

None.
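In this revised proposal progress is a percentage and the media length is a separate property in seconds, so a client that wants to show an elapsed time has to combine the two. A minimal sketch, with hypothetical function names and an assumed `m:ss` display format:

```python
def elapsed_seconds(progress_percent: float, length_seconds: int) -> float:
    """Convert a ProgressProperty percentage into seconds of playback."""
    if not 0 <= progress_percent <= 100:
        raise ValueError("progress must be between 0 and 100 percent")
    return length_seconds * progress_percent / 100


def format_position(progress_percent: float, length_seconds: int) -> str:
    """Render elapsed and total time as 'm:ss / m:ss' for display."""
    elapsed = int(elapsed_seconds(progress_percent, length_seconds))
    return (
        f"{elapsed // 60}:{elapsed % 60:02d} / "
        f"{length_seconds // 60}:{length_seconds % 60:02d}"
    )
```

Because the percentage always runs from 0 to 100, the static `minimum` and `maximum` of ProgressProperty no longer depend on the length of the media, which appears to be how this draft sidesteps question 3.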


I've made the following changes:

There is currently no solution to question 2 (how to tell a device what to play), which is currently left to an external physical or virtual user interface as there's no obvious common solution for all types of media player.

Note that none of this prevents the authors of web things from adding properties for any of those that I've removed above, they're just not part of the generic schema.

Footnotes:

  1. If we start defining properties like this then it's easy to end up describing the internal logical state of software applications (web apps) rather than external physical state of devices (web things). Attempting to describe the entire physical world is already a big enough challenge! This is a tricky distinction which is already compromised by the ColorModeProperty, which I didn't review and I'm not sure should be in the schema, but does actually describe a physical state for some bulbs which use different LED arrays for different colour modes.
  2. The original proposal actually did make repeat an enum, not a boolean. I misread.
  3. This arguably also strays into internal logical state of software rather than external physical state of a device.

flatsiedatsie commented 3 years ago

Looks good.

Some small thoughts:

flatsiedatsie commented 3 years ago

Perhaps a practical case: right now I'd love to be able to let the user select which device in their home network should play a "ding dong" sound when someone is at their door.

Similarly, it could be nice to let users select which device in their home should play a radio stream. Or which devices should play an alarm sound when there's a fire.

I do also realise that sending a file to a device to play may not always work. It assumes that endpoints can handle these requests. E.g. to play an MP3 file or radio stream, a device must be able to play MP3 files.

benfrancis commented 3 years ago

Note to self:

benfrancis commented 3 years ago

@flatsiedatsie wrote:

You mention a paused state, but there is no pause action?

This is intentional, the PlayingProperty is writable and can be mapped onto a play/pause button.
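A sketch of that mapping, assuming the draft's convention that a null PlayingProperty means no media is loaded. The function names and the label strings are hypothetical, and how the value is actually written back to the thing is left out.

```python
from typing import Optional


def button_label(playing: Optional[bool]) -> str:
    """Pick what the single play/pause button should show."""
    if playing is None:
        return "disabled"  # nothing loaded: neither play nor pause applies
    return "pause" if playing else "play"


def on_button_press(playing: Optional[bool]) -> Optional[bool]:
    """Return the value to write to PlayingProperty, or None to write nothing."""
    if playing is None:
        return None
    return not playing
```

With this approach the schema needs no separate play or pause actions: pressing the button simply writes the opposite boolean to the property.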

Having some kind of input string/URL ability might be cool

I agree it would be cool.

I do also realise that sending a file to a device to play may not always work.

This is the problem - most existing devices probably wouldn't support this feature.

I'm going to take a look at existing APIs in the HTML and Remote Playback specifications to see whether there are any useful patterns that could be followed.