radiorabe / crid-spec

Define how we use RFC 4078 (CRID)
https://radiorabe.github.io/crid-spec/
Creative Commons Attribution Share Alike 4.0 International
Define RFC 4078 (CRID) spec #1

Closed hairmare closed 1 year ago

hairmare commented 2 years ago

RFC 4078 defines the TV-Anytime Content Reference Identifier (CRID) Uniform Resource Locator (URL) scheme (crid:). CRID URLs are references to current or future scheduled publications of broadcast media content over television and radio distribution platforms and the Internet.

A CRID URL takes the form

crid://<DNS name>/<data>

The aim of this spec is to document our use of rabe.ch for the "DNS name" part. It SHALL also define the "data" part in a normative way and a registry for the parts of "data" that benefit from further specification. The spec will be managed in a fashion similar to RaBe CloudEvents.
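As a rough illustration of the crid://<DNS name>/<data> form, here is a minimal Python sketch that splits a CRID URL into its two parts using only the standard library. The function name split_crid is made up for this example, and the sample CRID is borrowed from later in this thread; nothing here is part of the spec itself.

```python
from urllib.parse import urlsplit


def split_crid(crid: str) -> tuple[str, str]:
    """Split a CRID URL into its (DNS name, data) parts.

    Hypothetical helper for illustration, not part of the spec.
    """
    parts = urlsplit(crid)
    if parts.scheme != "crid":
        raise ValueError(f"not a crid URL: {crid!r}")
    if not parts.netloc:
        raise ValueError(f"missing DNS name in {crid!r}")
    # The path starts with the "/" that separates DNS name and data.
    return parts.netloc.lower(), parts.path[1:]


authority, data = split_crid("crid://rabe.ch/v1/info")
print(authority)  # rabe.ch
print(data)       # v1/info
```

Any fragment (the part after #) is dropped by this helper, since urlsplit reports it separately from the path.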

This spec SHALL allow us to build a CRID authority as defined in ETSI TS 102 822-4 and according to the XSD published in TS 102 822-3-1.

Given that we don't implement low-level DAB/DVB transports or PDRs on our own, our usage of CRIDs and their resolution will focus on bi-directional (TCP/IP-based) CRID resolution.

The authority will be discoverable using DNS SRV records: _lres._tcp.rabe.ch. Summarised from ETSI TS 102 822-3-1, the authority does the following:

The definition of data is expected to match the elements our CRID authority supports with the aims of making resolving as seamless as possible.

At the time of writing i have not been able to find an open source implementation of a CRID authority (nor a proprietary one for that matter). Any existing implementations should likely govern the definition of this spec.

Further information is potentially in ETSI TS 102 822-6-2 which defines UDDI and WSDL. While these infos are specific to using SOAP, chances are high that we would prefer not to do so (rather we'd stick with embedding metadata in the authority response described above).

ETSI TS 102 822-8 describes how classic smart tv/dvb CRID/metadata can be interchanged to be accessible from the web. In our web-first case we will expose the underlying data directly w/o catering to any kind of smart tv/radio devices (this would be the responsibility of any low-level service provider from our pov).

Tasks

hairmare commented 2 years ago

find a sensible way to implement a CRID authority/resolver

When we start using CRID, it is very unlikely that we will be implementing a proper CRID authority as per ETSI TS 102 822-3-1.

For one, no opensource implementations of such an authority seem to be readily available.

The spec also does not align well with us wanting to stop using XML and switching to more modern standards like some form of JSON (see https://github.com/radiorabe/nowplaying/issues/128 for replacing the current XML output, more issues for other legacy xml formats exist as well).

Further searching on github for anything related has turned up https://github.com/tvheadend/tvheadend/pull/315 which indicates that there is some merit in using a CRID based system without providing a "proper" CRID authority.

All in all, the decision not to implement a CRID authority that is in spec for now means that the first iteration of our crid-spec has to focus on the concept of originator defined content as its baseline. In essence this means that we will be namespacing the data contents in a way that lets a decentralized authority/algorithm generate PI for CRIDs specific to a show or track.

An alternative worth exploring might be to consider implementing a RESTful alternative to the authority standard based on JSON (and possibly with lookup support). This isn't very attractive given that we don't want to have such code in scope for now (and possibly never at all).

For now i'll try to look into the possibilities and risks of going down the originator defined route and circle back to this if i'm either bored enough to write an oss crid authority and/or the originator defined approach doesn't end up being feasible.

hairmare commented 2 years ago

decide if data part of crid starts with versioning info (ie. crid://rabe.ch/<version>/<data_content> where version = v1, v1alpha, v1beta etc)

this one is easy, we will be using k8s api version style versions in the data parts of the crid and stick to the examples given:

I don't expect us to define all the versions, we'll want to start off with v1alpha and introduce v1 as soon as we have something working that we can rely on.
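To make the versioning idea concrete, here is a small Python sketch that parses such version strings and orders them by maturity. The regular expression, the names parse_version and sort_key, and the ranking (alpha before beta before rc before stable, compared after the major number) are assumptions made for this example; the thread only fixes the general k8s-like shape of the strings.

```python
import re
from typing import Optional

# "v" + major number + optional stage + optional stage number.
# "rc" is included because the ABNF in this thread lists it next to alpha/beta.
VERSION_RE = re.compile(
    r"v(?P<major>[0-9]+)(?:(?P<stage>alpha|beta|rc)(?P<num>[0-9]+)?)?"
)

# Assumed maturity ranking, used only for sorting in this sketch.
STAGE_RANK = {"alpha": 0, "beta": 1, "rc": 2, None: 3}


def parse_version(version: str) -> tuple[int, Optional[str], int]:
    """Return (major, stage, stage number); stage is None for stable versions."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a valid version: {version!r}")
    return int(match["major"]), match["stage"], int(match["num"] or 0)


def sort_key(version: str) -> tuple[int, int, int]:
    """Sort key: major number first, then maturity, then stage number."""
    major, stage, num = parse_version(version)
    return major, STAGE_RANK[stage], num


print(sorted(["v1", "v1alpha", "v1beta", "v2alpha1"], key=sort_key))
# ['v1alpha', 'v1beta', 'v1', 'v2alpha1']
```

A consumer could use parse_version(...)[1] is None to check whether a CRID uses a stable version before relying on it.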

hairmare commented 2 years ago

I've been looking at other implementations, specifically PEACH and BBC PIPs, to see if i can find any concrete, radio oriented implementations of CRID, to no avail. I still need to investigate odr-radioepg-bridge and its dependencies to see if i can find additional info.

edit: even finding a reference to crids in python-hybridspi feels like a win at this point. it doesn't help wrt how to originate crids though.

hairmare commented 2 years ago

Example CRIDs

These example CRIDs are based on some real world examples. Currently parts of our automation already use human-readable show names to match things. I'm proposing we stick with that and use human-readable strings going forward.

| Show | Description | Web URL | CRID |
| --- | --- | --- | --- |
| Der Morgen (Freitag) | Morning show; a different show for each day of the week, no repeats and no individual episodes on the website | https://rabe.ch/der-morgen-freitag/ | crid://rabe.ch/v1/der-morgen-freitag |
| RaBe Info | News; different on each day of the week, gets repeated once on air and published as a podcast, has a page per episode on the website | https://rabe.ch/info/ | crid://rabe.ch/v1/info |
| Klangbecken | Always-on show; gets used as a filler if nothing is scheduled and has its own schedule, no repeats, no episodes | https://rabe.ch/klangbecken/ | crid://rabe.ch/v1/klangbecken |

Most of our shows are somewhere between how Info and how Der Morgen (Freitag) work. So in some cases there is a concept of episodes, in other cases there is not (the morning shows are really just always happening and don't really have eps). This is reflected on the website, where Info has a page per ep as well as its show page while Der Morgen only has a show page. From the website pov shows can switch freely between both of these modes (hence the linked morgen having some old eps from 2018).

From a crid standpoint it makes sense to simplify things. For a start we won't be defining CRIDs for episodes but will just use the show's crid instead (with optional time info in a local part using #). Should we decide to encode more information into crids at a later stage we can either define v2 or stick ep info onto the path (ie. /v1/info/2022-01-21).
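The mapping from a show's web URL to its show-level CRID, as in the examples above, could be sketched in Python like this. The function name show_crid and the rule "the show name is the last non-empty path segment of the show page URL" are assumptions for illustration; the thread only says that the string is derived from the website.

```python
from urllib.parse import urlsplit

CRID_PREFIX = "crid://rabe.ch"


def show_crid(show_url: str, version: str = "v1") -> str:
    """Build a show-level CRID from the URL of the show page.

    Hypothetical helper: assumes the show name is the last non-empty
    path segment of the show page URL.
    """
    segments = [s for s in urlsplit(show_url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no show name found in {show_url!r}")
    return f"{CRID_PREFIX}/{version}/{segments[-1]}"


print(show_crid("https://rabe.ch/der-morgen-freitag/"))
# crid://rabe.ch/v1/der-morgen-freitag
```

Since episodes do not get their own CRIDs for now, an episode page would need to be mapped back to its show page before calling such a helper.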

hairmare commented 2 years ago

abnf:

crid          =   "crid://rabe.ch/" version "/" data-content
version       =   "v" 1*DIGIT [ pre-release ]          ; ie. v1, v2,
pre-release   =   ( "alpha" / "beta" / "rc" ) *DIGIT   ; v1alpha, v1alpha1, v1beta, ...

data-content  =   show-name [ "#" media-frags ]
show-name     =   1*( ALPHA / "-" )   ; show name string derived from website, ie. der-morgen-freitag
media-frags   =   utc-range   ; based on https://www.w3.org/TR/media-frags/

utc-range     =   "t=clock:" utc-date-time "-" [ utc-date-time ]
utc-date-time =   utc-date "T" utc-time "Z"
utc-date      =   8DIGIT                    ; < YYYYMMDD >
utc-time      =   6DIGIT [ "." fraction ]   ; < HHMMSS.fraction >
fraction      =   1*2DIGIT                  ; 0-99

hairmare commented 2 years ago

The media-frags bit needs some more elaboration. We want to use real-world clock timestamps to ensure that the URLs point to a specific date and time on the Gregorian calendar. The current W3C recommendation tells us that temporal clipping is denoted by the name t:

Temporal clipping is specified as Normal Play Time (npt) RFC 2326. It can also be specified as SMPTE timecodes SMPTE or as real-world clock time (clock) RFC 2326 in the advanced version described in the Media Fragments 1.0 URI (advanced) document. Begin and end times are always specified in the same format. The format is specified by name, followed by a colon (:), with npt: being the default. In this version of the media fragments specification there is no extensibility mechanism to add time format specifiers.

We want to use RFC 2326. Its section "10.5 PLAY" (page 34) has this bit:

For playing back a recording of a live presentation, it may be desirable to use clock units:

C->S: PLAY rtsp://audio.example.com/meeting.en RTSP/1.0
CSeq: 835
Session: 12345678
Range: clock=19961108T142300Z-19961108T143520Z

In the example a user agent is querying a media server and using a media fragment as part of the range request. This isn't our use-case but we can piggy back on the definition given for using clock as the time format specifier. RFC 2326 specifies this as utc-range in ebnf (page 17, 3.7 Absolute Time):

utc-range    =   "clock" "=" utc-time "-" [ utc-time ]
utc-time     =   utc-date "T" utc-time "Z"
utc-date     =   8DIGIT                    ; < YYYYMMDD >
utc-time     =   6DIGIT [ "." fraction ]   ; < HHMMSS.fraction >

Example for November 8, 1996 at 14h37 and 20 and a quarter seconds UTC:

19961108T143720.25Z

Given this, our CRID for a specific point in Klangbecken's history (like the 8. Nov '96) would look like this (we use an adapted utc-range with : instead of =, since we are using the format in the fragment part):

crid://rabe.ch/v1/klangbecken#t=clock:19961108T143720.25Z

This allows referencing a single temporal point of rabe (we reference the actual "physical" broadcast here, not some http based representation). To point to a time range (say to encode duration info for today's Info) we can add an end time:

crid://rabe.ch/v1/info#t=clock:20220209T120000Z-20220209T123000Z
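Producing these fragments from timestamps could look roughly like the following Python sketch. The names utc_date_time and clock_fragment are made up for this example. It assumes timezone-aware datetimes, truncates anything finer than the two fraction digits the grammar allows, and omits the "-" when there is no end time, as in the Klangbecken example above.

```python
from datetime import datetime, timezone
from typing import Optional


def utc_date_time(moment: datetime) -> str:
    """Format an aware datetime as <YYYYMMDD>T<HHMMSS[.fraction]>Z in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime, attach a timezone first")
    utc = moment.astimezone(timezone.utc)
    text = utc.strftime("%Y%m%dT%H%M%S")
    # At most two fraction digits are allowed, so finer precision is truncated.
    centiseconds = utc.microsecond // 10_000
    if centiseconds:
        text += f".{centiseconds:02d}"
    return text + "Z"


def clock_fragment(start: datetime, end: Optional[datetime] = None) -> str:
    """Build the t=clock: fragment for a point in time or a time range."""
    fragment = "t=clock:" + utc_date_time(start)
    if end is not None:
        fragment += "-" + utc_date_time(end)
    return fragment


start = datetime(2022, 2, 9, 12, 0, tzinfo=timezone.utc)
end = datetime(2022, 2, 9, 12, 30, tzinfo=timezone.utc)
print(f"crid://rabe.ch/v1/info#{clock_fragment(start, end)}")
# crid://rabe.ch/v1/info#t=clock:20220209T120000Z-20220209T123000Z
```

Converting to UTC inside the helper means callers can pass local studio time as long as the datetime carries its timezone.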

The end time is optional because it's not important for a lot of use-cases (ie. if we use crids as id for RaBe CloudEvents then we really just want them to distinctly reference the moment the event is about for uniqueness' sake).
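Going the other way, a consumer might want to take a CRID apart again. The following Python sketch is one possible reading of the ABNF from this thread, not a reference implementation; the names ParsedCrid, parse_stamp and parse_crid are hypothetical. Two assumptions are baked in: show names may contain letters and hyphens (as in der-morgen-freitag), and the "-" after the start time may be left out when there is no end time (as in the Klangbecken example), even though the ABNF as written always includes it.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

STAMP = r"\d{8}T\d{6}(?:\.\d{1,2})?Z"
CRID_RE = re.compile(
    r"crid://rabe\.ch/(?P<version>v\d+(?:(?:alpha|beta|rc)\d*)?)"
    r"/(?P<show>[A-Za-z-]+)"
    # Assumption: accept "start", "start-" and "start-end".
    rf"(?:#t=clock:(?P<start>{STAMP})(?:-(?P<end>{STAMP})?)?)?"
)


class ParsedCrid(NamedTuple):
    version: str
    show: str
    start: Optional[datetime]
    end: Optional[datetime]


def parse_stamp(stamp: str) -> datetime:
    """Turn <YYYYMMDD>T<HHMMSS[.fraction]>Z into an aware UTC datetime."""
    whole, _, fraction = stamp.rstrip("Z").partition(".")
    moment = datetime.strptime(whole, "%Y%m%dT%H%M%S")
    microseconds = int(fraction.ljust(6, "0")) if fraction else 0
    return moment.replace(microsecond=microseconds, tzinfo=timezone.utc)


def parse_crid(crid: str) -> ParsedCrid:
    match = CRID_RE.fullmatch(crid)
    if match is None:
        raise ValueError(f"not a valid rabe.ch CRID: {crid!r}")
    start = parse_stamp(match["start"]) if match["start"] else None
    end = parse_stamp(match["end"]) if match["end"] else None
    return ParsedCrid(match["version"], match["show"], start, end)


info = parse_crid("crid://rabe.ch/v1/info#t=clock:20220209T120000Z-20220209T123000Z")
print(info.show, info.end - info.start)  # info 0:30:00
```

Returning plain datetime objects keeps duration and comparison logic in the standard library instead of in string handling.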

There is also some potential in finding/creating a helper lib to help reason about the media fragments. Such a lib could help with converting from world clock to other formats like duration in seconds or language native representations. It could also implement comparison operators to help reason about crids, answering questions like "is this crid nested in another crid?" or "is this show still on air right now?".
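To give an idea of what such a helper could offer, here is a minimal Python sketch of the two questions above. The class name Broadcast and its methods are invented for this example; it assumes the CRID has already been split into its show part and timezone-aware start/end datetimes, and it treats a CRID without an end time as a zero-length range.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Broadcast:
    """A show CRID plus the time range from its t=clock: fragment."""

    show: str                        # ie. "crid://rabe.ch/v1/info"
    start: datetime                  # timezone-aware
    end: Optional[datetime] = None   # None means a single point in time

    def _end(self) -> datetime:
        # Assumption: a point in time behaves like a zero-length range.
        return self.end or self.start

    def contains(self, other: "Broadcast") -> bool:
        """Is `other` nested in this one (same show, range inside ours)?"""
        return (
            self.show == other.show
            and self.start <= other.start
            and other._end() <= self._end()
        )

    def on_air(self, now: Optional[datetime] = None) -> bool:
        """Is this broadcast running at `now` (defaults to the current time)?"""
        now = now or datetime.now(timezone.utc)
        return self.start <= now < self._end()


utc = timezone.utc
info = Broadcast(
    "crid://rabe.ch/v1/info",
    datetime(2022, 2, 9, 12, 0, tzinfo=utc),
    datetime(2022, 2, 9, 12, 30, tzinfo=utc),
)
moment = Broadcast("crid://rabe.ch/v1/info", datetime(2022, 2, 9, 12, 15, tzinfo=utc))
print(info.contains(moment))                                 # True
print(info.on_air(datetime(2022, 2, 9, 12, 10, tzinfo=utc)))  # True
```

Duration in seconds falls out of the same representation, as (end - start).total_seconds() whenever an end time is present.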

hairmare commented 1 year ago

Fixed in 38fc72cf2c30ea6031759583c662f55c52331f66