Podcastindex-org / podcast-namespace

A wholistic rss namespace for podcasting
Creative Commons Zero v1.0 Universal

Proposal: Live System Architecture #212

Open Muppet1856 opened 3 years ago

Muppet1856 commented 3 years ago

Architecture Proposal

In order to settle on a tag, it's important to settle on the usage of the system as a whole.

Conventions

To help with the example I use the following terms and conventions: bold = subsystem

Terms

Subsystem | Description | Examples
creator | Podcaster | No Agenda, Daily Tech News
publisher | Podcasting Host | Podbean, Self Hosted, Blubrry, BuzzSprout, PodBean, Youtube, Facebook Live
advertiser | Index | iTunes, Podcast Index
provider | Application | Hypercatcher, PlayaPod, Podfriend
endpoint | Consumer Device | iOS app, Google app, GrapheneOS app, Website

Considerations

Proposed Use Case - A publisher model with centralized aggregation and decentralized propagation of notification

image

Note: The use of RTMP was chosen to provide a specific mechanism, but could be replaced with others interchangeably

  1. Publishing the podcast as supported
    1. The creator indicates as a show
      • This is a publisher interface, maybe a checkbox? It's basically there to prompt the publisher to make sure there is a tag and it gets populated.
    2. The publisher adds tag to XML with stream source, stream notifier subscription address
      • This is the area of open question. It may be simple enough to have the stream address be fully qualified with the protocol and that would allow the standard to evolve with protocols.
      • My narrow view is RTMP, but there may be a better choice. I suppose the address could be an http(s) which could point a user to the big-tech options or a publisher based WebRTC.
    3. The advertiser updates the Index with the stream address, and the publisher subscription address
      • Would be an address that accepts POST whose payload includes the provider webhook for start/stop notification.
  2. Subscribing to a show
    • provider enrolls in stream notification from publisher by posting the provider webhooks address to the publisher stream subscription address
  3. Going Live with podcast
    1. The creator pushes stream to publisher
      • I envision this utilizing RTMP or RTMPS - one advantage here is the built in functionality of RTMP to fire events on stream start, such as a routine to notify the downstream consumers
    2. The creator announces stream to publisher
      • With RTMP(S), this can be as simple as pushing the stream to the publisher, or could require a user to push a "go live" button on their panel.
    3. The publisher announces stream to provider
      • webhooks...
    4. The provider announces stream to endpoint
    5. The endpoint accesses the stream utilizing the publisher stream address
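
The subscribe and go-live steps above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the `Publisher` class, its method names, the event names `start`/`stop`, and the example webhook address are all hypothetical, and a real host would send HTTP POSTs instead of returning payloads.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Hypothetical podcast host that ingests a stream and notifies providers."""
    stream_url: str
    webhooks: list = field(default_factory=list)  # provider callback addresses
    live: bool = False

    def subscribe(self, webhook_url: str) -> None:
        # Step 2: a provider posts its webhook address to the subscription address.
        if webhook_url not in self.webhooks:
            self.webhooks.append(webhook_url)

    def _notify(self, event: str) -> list:
        # Step 3.3: build one notification per subscribed provider.
        # A real host would POST these; the sketch just returns them.
        return [
            {"to": hook, "event": event, "stream": self.stream_url}
            for hook in self.webhooks
        ]

    def stream_started(self) -> list:
        # Steps 3.1/3.2: called when the creator's stream arrives,
        # for example from an RTMP publish event.
        self.live = True
        return self._notify("start")

    def stream_stopped(self) -> list:
        self.live = False
        return self._notify("stop")


publisher = Publisher(stream_url="rtmps://live.example.com/live.mp3")
publisher.subscribe("https://app.example.org/hooks/live")
sent = publisher.stream_started()
```

Here `sent` holds the single start notification addressed to the one subscribed provider; the provider would then relay it to its endpoints (step 3.4).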

Proposed Enclosures

<podcast:live>
   <item>
      <title>Podcasting 2.0 Live Show</title>
      <description>&lt;p&gt;A look into the future of podcasting and how we get to Podcasting 2.0!&lt;/p&gt;</description>
      <link>https://example.com/podcast/live</link>
      <guid isPermaLink="true">https://example.com/live</guid>
      <author>John Doe (john@example.com)</author>
      <podcast:scheduledstart>Fri, 09 Oct 2020 04:30:00 GMT</podcast:scheduledstart>
      <podcast:scheduledend>Fri, 09 Oct 2020 07:30:00 GMT</podcast:scheduledend>
      <itunes:image>https://example.com/ep0003/artMd.jpg</itunes:image>
      <podcast:images srcset="https://example.com/images/ep3/pci_avatar-massive.jpg 1500w,
         https://example.com/images/ep3/pci_avatar-middle.jpg 600w,
         https://example.com/images/ep3/pci_avatar-small.jpg 300w,
         https://example.com/images/ep3/pci_avatar-tiny.jpg 150w" />
      <itunes:explicit>no</itunes:explicit>
      <podcast:season name="Podcasting 2.0">1</podcast:season>
      <podcast:episode>3</podcast:episode>
      <podcast:person href="https://www.podchaser.com/creators/adam-curry-107ZzmWE5f" img="https://example.com/images/adamcurry.jpg">Adam Curry</podcast:person>
      <podcast:person role="guest" href="https://github.com/daveajones/" img="https://example.com/images/davejones.jpg">Dave Jones</podcast:person>
      <podcast:person group="visuals" role="cover art designer" href="https://example.com/artist/beckysmith">Becky Smith</podcast:person>
      <enclosure url="rtmps://live.example.com/live.mp3" type="audio/mpeg"/>
      <podcast:subscribe url="https://live.example.com/subscribe"/>
   </item>
</podcast:live>
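
As a rough illustration of how a provider might read this proposal, the sketch below pulls the stream and subscription addresses out of a trimmed copy of the block. It assumes the `podcast:` prefix is bound to the Podcast Index namespace URI and that the block sits inside `<channel>`; the function name `read_live_items` and the returned field names are made up for the example, and none of this is an agreed format.

```python
import xml.etree.ElementTree as ET

PODCAST_NS = "https://podcastindex.org/namespace/1.0"

# Trimmed copy of the proposed block, wrapped in <channel> so the
# namespace prefix can be declared.
FEED = f"""
<channel xmlns:podcast="{PODCAST_NS}">
  <podcast:live>
    <item>
      <title>Podcasting 2.0 Live Show</title>
      <podcast:scheduledstart>Fri, 09 Oct 2020 04:30:00 GMT</podcast:scheduledstart>
      <enclosure url="rtmps://live.example.com/live.mp3" type="audio/mpeg"/>
      <podcast:subscribe url="https://live.example.com/subscribe"/>
    </item>
  </podcast:live>
</channel>
""".strip()


def read_live_items(xml_text):
    """Return one dict per <item> found under <podcast:live>."""
    ns = {"podcast": PODCAST_NS}
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("podcast:live/item", ns):
        enclosure = item.find("enclosure")
        subscribe = item.find("podcast:subscribe", ns)
        items.append({
            "title": item.findtext("title"),
            "scheduled_start": item.findtext("podcast:scheduledstart", namespaces=ns),
            "stream_url": enclosure.get("url") if enclosure is not None else None,
            "subscribe_url": subscribe.get("url") if subscribe is not None else None,
        })
    return items


live_items = read_live_items(FEED)
```

A parser that does not know the namespace would simply never match `podcast:live`, which is relevant to the later question of how existing aggregators treat unknown tags.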

Concerns

  • <podcast:scheduledstart> could be a discrete time, or a pattern? Having someone update their XML and hope that it propagates to their subscribers in time is worrisome.
  • <podcast:scheduledend> could be a disaster. I don't even know if anyone would care...
  • <title> and <podcast:episode> are either redundant or useless. I'm not sure they are helpful in this context. The <channel> text is probably the most helpful here, and <podcast:episode> suffers from the same issue as <podcast:scheduledstart>.

daveajones commented 3 years ago

Thank you for putting in the work for such a thorough writeup. Very helpful.

agates commented 3 years ago

Hmm, I've seen various live tags proposed in several different places now... this aligns most closely with what I've had in mind, and I'm going to give a shot at an opinionated implementation for Peertube.

A few thoughts.

I understand some people just want one stream link and a recurring schedule, but we all know the schedules never match up with reality except for perhaps stricter business use cases. I think we can optionally allow for a "planned" schedule such as an ICS file, but I think this would be most useful for 1. feed refresh hints 2. UX display in a sort of calendar setting (when is this stream happening next)

With ISO 8601 durations we can include the planned start and end time into one tag. The benefit of this is it doesn't strictly require an end time, like you alluded to, but it can be included. I can see a use case both with and without and I think websub should be a strict requirement around this feature for it to function correctly regardless, where the RSS feed host either removes the live item or adds an end after the stream ends. There may be a need here for Podcast Index to offer a basic live notification API for applications that don't host their own servers.
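
To make the "start plus optional duration in one value" idea concrete, here is a minimal sketch that reads a start timestamp with an optional ISO 8601 duration after a slash. The combined `start/duration` string, the helper names, and the restriction to day/hour/minute/second durations are assumptions for illustration; the thread does not settle on an attribute layout.

```python
import re
from datetime import datetime, timedelta

# Only the day/time subset of ISO 8601 durations is handled here.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Parse durations such as 'PT3H' or 'P1DT30M' into a timedelta."""
    match = _DURATION.match(text)
    if not match or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)


def parse_live_window(text):
    """Return (start, end); end is None when no duration is given."""
    start_text, _, duration_text = text.partition("/")
    start = datetime.fromisoformat(start_text.replace("Z", "+00:00"))
    end = start + parse_duration(duration_text) if duration_text else None
    return start, end


start, end = parse_live_window("2020-10-09T04:30:00Z/PT3H")
open_start, open_end = parse_live_window("2020-10-09T04:30:00Z")
```

The second call shows the point made above: leaving the duration off yields a start with no end, so an end time is never strictly required.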

RTMP is fine but it's also the most difficult to support especially on the web. I think RTMP, HLS, and DASH should all be options and I foresee HLS initially being the most common due to existing web technologies. All three have their pros/cons that I can see various different providers being interested in supporting. I think peertube does a weird RTMP->HLS thing both to support existing streaming software (OBS for example) but to get the benefits of HLS+peer-to-peer.

I know even less about the audio-only streaming side so I can't really speak to that. I'm going to go ahead and plug alternateEnclosure here for those various benefits. Maybe there's also opportunity to provide embedded players instead of having every client bring their own? That could be disastrous for UX, so I'm not sure.

Lastly, why not <podcast:liveItem>? I can see the point about just including <item> under a live tag, however I think it's important for this to semantically serve a different purpose and it will also contain tags that <item> would not. Yes there is some duplication, but it's also a completely new thing (and let's be honest, this isn't a normalized schema).

Anyway, with the right implementation here I can see this so close to becoming both youtube live/twitch alternatives with all of our own features built on top. Kind of curious to see if the Floatplane guys would have any input also and be willing to collaborate, since they represent a pretty strong community as a realistic youtube-alternative.

Muppet1856 commented 3 years ago

> I understand some people just want one stream link and a recurring schedule, but we all know the schedules never match up with reality except for perhaps stricter business use cases. I think we can optionally allow for a "planned" schedule such as an ICS file, but I think this would be most useful for 1. feed refresh hints 2. UX display in a sort of calendar setting (when is this stream happening next)

I had thought of this as an optional tag, for people who intend on keeping a schedule, or at least attempting to keep one. One of the things I had always heard about the success of a podcast is to have a show that maintains some sort of schedule. The granularity may be the most suspect thing. Many shows offer a "daily, weekdays", "weekly", "twice-a-week", with no guarantee on time of day. This is probably the most difficult part of setting a standard for schedule.

> With ISO 8601 durations we can include the planned start and end time into one tag. The benefit of this is it doesn't strictly require an end time, like you alluded to, but it can be included.

Existing standards are good, and for explicit dates I think ISO8601 is a possible solution, but it does not lend itself to posting a schedule... Thursdays and Sundays at 12PM Eastern or Weekdays at 3, at least not from what I can garner from reading the wiki.

> I can see a use case both with and without and I think websub should be a strict requirement around this feature for it to function correctly regardless, where the RSS feed host either removes the live item or adds an end after the stream ends. There may be a need here for Podcast Index to offer a basic live notification API for applications that don't host their own servers.

I was unaware of websub when I wrote it down. After reading up on it, it looks like I am endorsing the same. I would like to hear hosting providers' opinions on this point, as they would bear the burden of running it. That said, there are a lot of implementations that allow for this to be leveraged... (https://github.com/w3c/websub/tree/master/implementation-reports)

> RTMP is fine but it's also the most difficult to support especially on the web. I think RTMP, HLS, and DASH should all be options and I foresee HLS initially being the most common due to existing web technologies. All three have their pros/cons that I can see various different providers being interested in supporting. I think peertube does a weird RTMP->HLS thing both to support existing streaming software (OBS for example) but to get the benefits of HLS+peer-to-peer.

I'm not clear on how difficult this is to stand up. I did it in just a few minutes using nginx and the RTMP extension on a Raspberry Pi 3, and putting one up as a publisher should be trivial as well. I don't know how it scales in terms of compute resources... This doesn't feel like a limiting aspect of a <live> tag, and should probably be left to the folks who might have to provide these resources to optimize. FYI - as of today, this is what each of the "major" streaming services supports for ingest.

Provider RTMP RTMPS HLS DASH
YouTube x x x x
Twitch x - - -
Facebook Live - x - Beta
PeerTube x - - -

> I know even less about the audio-only streaming side so I can't really speak to that. I'm going to go ahead and plug alternateEnclosure here for those various benefits. Maybe there's also opportunity to provide embedded players instead of having every client bring their own? That could be disastrous for UX, so I'm not sure.

I'd like to see what @daveajones has to say here as NA does a live audio stream, but I think they are using a basic html5 <video> element with only an audio source... Not sure what the ingest looks like.

> Lastly, why not <podcast:liveItem>? I can see the point about just including <item> under a live tag, however I think it's important for this to semantically serve a different purpose and it will also contain tags that <item> would not. Yes there is some duplication, but it's also a completely new thing (and let's be honest, this isn't a normalized schema).

I don't know how the various feed aggregators would handle any of this. Does Apple just ignore any tags and their children that it doesn't support? Frankly, I wrote something here to start this conversation...

> Anyway, with the right implementation here I can see this so close to becoming both youtube live/twitch alternatives with all of our own features built on top. Kind of curious to see if the Floatplane guys would have any input also and be willing to collaborate, since they represent a pretty strong community as a realistic youtube-alternative.

My goal with this description and of the <live> tag is not to replace the existing live-streaming infrastructure, but to augment it. Hopefully, with adoption, people will see the light and take control of their content, leveraging a <value> friendly distribution method.

agates commented 3 years ago

> Existing standards are good, and for explicit dates I think ISO8601 is a possible solution, but it does not lend itself to posting a schedule... Thursdays and Sundays at 12PM Eastern or Weekdays at 3, at least not from what I can garner from reading the wiki.

A schedule of when a stream plans to start and the actual start are two very different things. The schedule is ultimately meaningless in the face of when something goes live, and trying to predict it will fail; you can only provide a suggestion.

To represent the schedule itself, I think ICS is perfect here. It's widespread and a very well known standard. This could just be a URL with a last modified timestamp.

I am all for a schedule.

However, I'm separately suggesting the "start"/"end" (duration) be the time when the stream has started/ended, so applications can display it as an active live stream or know when it's over. The live item could even disappear completely if the stream doesn't persist as something to watch after it's over.

This gets us the distance between the client application merely guessing when something might be live to having a reasonable amount of confidence that it's live so it can display it and notify the user. It really depends on websub + application notifications however.

daveajones commented 3 years ago

This tag deserves a phase all to itself I think. It’s a big one and we need to get it right.

keunes commented 3 years ago

(Regarding schedule, #154 might be related/relevant.)

Muppet1856 commented 3 years ago

> A schedule of when a stream plans to start and the actual start are two very different things. The schedule is ultimately meaningless in the face of when something goes live and trying to predict it will fail, you can only provide a suggestion.

This was my intent - that a suggestion could be part of the feed so an app could surface it to the end user. Like a reminder in the morning.

> To represent the schedule itself, I think ICS is perfect here. It's widespread and a very well known standard. This could just be a URL with a last modified timestamp.

No disagreement there.

> However, I'm separately suggesting the "start"/"end" (duration) be the time when the stream has started/ended and applications can display it as an active live stream or know when it's over. The live item could even disappear completely if the stream doesn't persist as something to watch after it's over.

I'm assuming that this might just be "solved" because a live episode that comes back as a "produced" episode would automatically update the feed with the permanent version and a temporal live episode with no watch later wouldn't exist.

I'm also guessing this particular comment has to do with nuances of peertube and I don't yet understand the challenges there.

> This gets us the distance between the client application merely guessing when something might be live to having a reasonable amount of confidence that it's live so it can display it and notify the user. It really depends on websub + application notifications however.

I see this as a "morning reminder" that my show is scheduled to be live at 12:00 (local). PodPing would do the actual announcement.

saerdnaer commented 3 years ago

What's the reason to wrap the item in <podcast:live>? Every item that does not have an enclosure tag is normally ignored by regular podcast apps, so we could simply add a new item which only contains the livestream URLs, e.g. via the <alternateEnclosure> tags...

daveajones commented 3 years ago

I feel like this was discussed but I don’t remember why the decision was made.

@agates @Muppet1856 ?

agates commented 3 years ago

I don't think live stream URLs have to live within alternateEnclosure. The two tags are pretty independent now.

Regardless, I need to get a usable test implementation out before saying I have a preference either way. My gut feeling is it should just be podcast:liveItem.

agates commented 3 years ago

>> A schedule of when a stream plans to start and the actual start are two very different things. The schedule is ultimately meaningless in the face of when something goes live and trying to predict it will fail, you can only provide a suggestion.

> This was my intent - that a suggestion could be part of the feed so an app could surface it to the end user. Like a reminder in the morning.

>> To represent the schedule itself, I think ICS is perfect here. It's widespread and a very well known standard. This could just be a URL with a last modified timestamp.

> No disagreement there.

>> However, I'm separately suggesting the "start"/"end" (duration) be the time when the stream has started/ended and applications can display it as an active live stream or know when it's over. The live item could even disappear completely if the stream doesn't persist as something to watch after it's over.

> I'm assuming that this might just be "solved" because a live episode that comes back as a "produced" episode would automatically update the feed with the permanent version and a temporal live episode with no watch later wouldn't exist.

> I'm also guessing this particular comment has to do with nuances of peertube and I don't yet understand the challenges there.

Well my intention is to provide enough information to keep the application from having to assume anything. If we provide a way to explicitly tell the application what happened, no assumptions have to be made (ignoring the issue with bad data but we will never prevent that).

>> This gets us the distance between the client application merely guessing when something might be live to having a reasonable amount of confidence that it's live so it can display it and notify the user. It really depends on websub + application notifications however.

> I see this as a "morning reminder" that my show is scheduled to be live at 12:00 (local). PodPing would do the actual announcement.

Yeah, PodPing wasn't a thing when I made this comment :). However schedules do not indicate liveness. They can and will be wrong.

Muppet1856 commented 3 years ago

@agates, I feel like we've both come to the conclusion that podping is the obvious solution to notification of live.

Maybe I should rename it from to or ?

daveajones commented 3 years ago

Reminder here to use ISO8601 date format for live start/end.

Muppet1856 commented 3 years ago

> Reminder here to use ISO8601 date format for live start/end.

I had hoped we could have a standard for the start/end that would allow for a "set once" type of scheduling, such as "every Monday and Wednesday at 11".

It could be accomplished using an iCalendar entry (RFC6321 provides a schema for this in XML). Though it may make the file larger, it would help the podcaster update their audience more easily. It adds all of the repeating functionality without making the author update their plan each week in advance of their show.

Caveat: I don't think RFC6321 aka xCal, is an easy format to understand, but it is - in my opinion - the "right" way.

@agates @daveajones

jamescridland commented 3 years ago

As a comment from experience - an automatic "every Monday and Wednesday at 11am" setting leads to sadness. Podcasts (and events) take vacations and sometimes disappear altogether. We don't want to make it too easy to set something like this, and pollute the space with ghost live events for podcasts that have long-since stopped recording.

The bottom of https://pod.events says:

Repeating events: you’re welcome to list these: but we’d like you to enter each event separately. This helps us ensure that the event is still valid, and encourages event organisers to put specific details of each event (like speaker names), so it’ll work better for everyone.

While there's nothing to stop someone from producing an RSS feed with repeating events in this use-case, I would strongly recommend not codifying a method to repeat events.

agates commented 3 years ago

@Muppet1856 @daveajones

I'm inclined to agree and I think the start/end attributes are best described this way:

You can set the start time ahead of the start of the stream to indicate a scheduled start with status="pending". However it's not required and should be updated with the actual start time of the stream according to the host when status="live".
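
One way to read the rule above, as a hedged sketch: a client treats `start` as a suggestion while the item is pending and as the actual start once it is live. The function name, the display strings, and the fallback "ended" branch for any other status value are assumptions for the example; only `pending` and `live` come from the description above.

```python
from datetime import datetime, timezone


def describe_live_item(status, start, now):
    """Decide what a client might show for a live item.

    status: "pending" or "live"; anything else is treated as ended (assumption).
    start:  timezone-aware datetime from the feed, or None if not provided.
    now:    timezone-aware current time.
    """
    if status == "live":
        if start is None:
            return "Live now"
        # When live, start is expected to hold the actual start time.
        minutes = int((now - start).total_seconds() // 60)
        return f"Live now (started {minutes} min ago)"
    if status == "pending":
        # start is only a scheduled suggestion; never infer liveness from it.
        if start is None:
            return "Upcoming (no time announced)"
        return f"Scheduled for {start.isoformat()}"
    return "Ended"


now = datetime(2020, 10, 9, 5, 0, tzinfo=timezone.utc)
start = datetime(2020, 10, 9, 4, 30, tzinfo=timezone.utc)
label = describe_live_item("live", start, now)
```

Note that a pending item whose scheduled start has already passed is still shown as scheduled; only the status change marks it live.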

Having a separate schedule, be it RFC6321, another set of "schedule" timestamps, or otherwise, helps mitigate this confusion. iCal helps with single item events/recurrences and could be set one per liveItem, and we could potentially offer something like webcal at the channel level. As clunky as these standards are, they fit into the existing calendar ecosystem for a good reason (they work).

This is really important because it gives flexibility. If you don't want a repeating event, you don't have to use one. iCal can support both one-time and recurring events trivially, set per-liveItem. Webcal exists to handle multiple events, recurring or not, and things like cancellations. You can think of webcal like a protocol supporting multiple iCal links.
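
For a feel of how one iCal entry covers both cases, the sketch below expands a single `VEVENT` that may or may not carry an `RRULE`. It is deliberately tiny and makes several assumptions: it only understands UTC `DTSTART` values and `FREQ=WEEKLY` with `BYDAY`, it ignores property parameters and line folding, and the event text and function names are invented for the example. A real client would use a full iCalendar library.

```python
from datetime import datetime, timedelta

# Hypothetical event: Thursdays and Sundays at 16:00 UTC.
ICS_EVENT = """BEGIN:VEVENT
DTSTART:20201008T160000Z
RRULE:FREQ=WEEKLY;BYDAY=TH,SU
SUMMARY:Live show
END:VEVENT"""

WEEKDAYS = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}


def read_event(ics_text):
    """Return (start, rule) where rule is a dict of RRULE parts, possibly empty."""
    props = dict(line.split(":", 1) for line in ics_text.splitlines() if ":" in line)
    start = datetime.strptime(props["DTSTART"], "%Y%m%dT%H%M%SZ")
    rule = dict(part.split("=", 1) for part in props.get("RRULE", "").split(";") if part)
    return start, rule


def next_occurrences(ics_text, count):
    """Expand the event into its first `count` start times (naive UTC datetimes)."""
    start, rule = read_event(ics_text)
    if not rule:
        return [start]  # one-time event: nothing to expand
    if rule.get("FREQ") != "WEEKLY" or "BYDAY" not in rule:
        raise ValueError("this sketch only handles FREQ=WEEKLY with BYDAY")
    days = {WEEKDAYS[d] for d in rule["BYDAY"].split(",")}
    found, current = [], start
    while len(found) < count:
        if current.weekday() in days:
            found.append(current)
        current += timedelta(days=1)
    return found


upcoming = next_occurrences(ICS_EVENT, 3)
```

Dropping the `RRULE` line from the same entry turns it into a one-time event, which is the flexibility described above.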

We need to consider things outside of the podcast ecosystem. For example, there is absolutely nothing preventing someone from starting up a regularly-scheduled indie television show/network that handles live premieres. Whether or not that seems feasible to anyone is another matter, but the point is to be flexible.

Lastly, my biggest reason for making the distinction between the scheduled start and actual start is because the client having an approximate (albeit relative) starting timestamp is useful for a couple things: 1. metrics and 2. allowing the client to know when it's supposed to be able to play the stream without a failure, in the case of video streams for example that aren't on 24/7.

daveajones commented 3 years ago

I can see both sides of this. Having a schedule available makes sense. Also @jamescridland has a good point about schedule pollution.

Is the solution to just break that out into a separate <podcast:schedule> tag that can be attached to the channel and references an external file? The schedule can then be associated with the <podcast:guid> of the feed within directories and can contain all sorts of metadata about when this feed does certain things, like publishing new episodes and upcoming live streams.

A schedule tag could allow for a TVGuide for podcasts service to exist. This would also greatly improve polling performance without resorting to algorithmic guesswork.

agates commented 3 years ago

@daveajones I was thinking about a separate tag also, yes. We could get away with it in another phase if needed since scheduling is complicated enough.

Muppet1856 commented 3 years ago

The only concern I have with an external file for the schedule is that it requires separate polling to see if it updates. If it's in the RSS, it would be pulled with each update to the XML. While there is nothing stopping us from polling both, it is antithetical to many of the other things we've done. Maybe we can extend podping to allow a schedule update notice? This would especially be true for a live-only show that wouldn't have a need to update its XML in a repeating schedule format.

@jamescridland, a dead/ghost feed doesn't get solved by limiting the functionality. A user stops subscribing to a feed if it's dead (and only if it's clogging up their lists or they're OCD like me), and this would be especially true if it were hitting them with reminders for a show that isn't happening.

For that matter, I'm not totally clear on what problem posting every planned live event schedule individually solves. In my view, it adds yet another detail that someone will screw up in their rss while updating it weekly/daily.

The one thing about using an ICS that I do like is that there are very easy ways to create and manage it. Every available scheduling software supports creating and editing them: everything from creating a series, to an occurrence, to a single occurrence edit. Not that I encourage it, but one could create a calendar in Google Calendar for just the podcast schedule and share the ICS quite easily. Then any change would immediately propagate.

agates commented 3 years ago

I was indeed thinking about adding a "schedule" reason to podping to indicate the schedule has been updated. I think we were contemplating doing so for other external resources as well, such as chapters.
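
A rough sketch of what such a notice could look like to a consumer, so the schedule file is only fetched when told to. The JSON field names `reason` and `iris` are loosely modeled on podping-style notices and should be treated as assumptions, as should the reason list itself; `schedule` is the hypothetical addition discussed here, and the helper names are invented.

```python
import json

# Illustrative reason values; "schedule" is the hypothetical new one.
REASONS = {"update", "live", "schedule"}


def build_notice(feed_url, reason):
    """Serialize a notice saying why the given feed should be re-checked."""
    if reason not in REASONS:
        raise ValueError(f"unknown reason: {reason!r}")
    return json.dumps({"reason": reason, "iris": [feed_url]}, sort_keys=True)


def handle_notice(raw, refetch_feed, refetch_schedule):
    """Dispatch a notice: schedule notices refetch only the external schedule."""
    notice = json.loads(raw)
    action = refetch_schedule if notice["reason"] == "schedule" else refetch_feed
    return [action(url) for url in notice["iris"]]


raw = build_notice("https://example.com/feed.xml", "schedule")
result = handle_notice(
    raw,
    refetch_feed=lambda url: ("feed", url),
    refetch_schedule=lambda url: ("schedule", url),
)
```

With a notice like this, the external schedule file never needs to be polled on its own, which addresses the separate-polling concern raised above.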