Podcastindex-org / podcast-namespace

A wholistic rss namespace for podcasting
Creative Commons Zero v1.0 Universal

<podcast:archive> Allow for linking to a big RSS with all/more episodes #117

Open AbelLykens opened 3 years ago

AbelLykens commented 3 years ago

Thoughts:

brianoflondon commented 3 years ago

I like this idea very much. Not sure who'd implement it but it would be fantastic.

daveajones commented 3 years ago

I like this a lot too. I know some work has been done with paged feeds (Podlove?) so maybe we can synthesize things.

dellagustin commented 3 years ago

There is already an RFC for paged feeds: https://tools.ietf.org/html/rfc5005. Are you familiar with it?

Known supporters:

dellagustin commented 3 years ago

Tracing back to the mastodon thread, there is some interesting info there: https://podcastindex.social/@dave/105232584034401761

daveajones commented 3 years ago

Thanks for that @dellagustin .

We need more digging to see how entrenched RFC5005 is, which I will do.

In situations like this, if there is an existing standard in wide use, and with no obvious defects, let's not touch it. If there is an existing standard with little use, let's see if we can merge with it to hopefully get more attention/adoption for it without breaking anything too bad.

If there is an existing standard with wide use, but it's not meeting people's needs and has no hope for change from its source, I think it's ok to offer an "optional" alternative tag, but that needs much discussion and deliberation.

Let me dig on this and see if I can determine what's going on with this in the wild.

Inrumpo commented 3 years ago

I like the idea of adopting the already existing paged feeds standard.

Podlove.org has already implemented paged feeds some time ago. Pocket Casts does support paged feeds too.

eteubert commented 3 years ago

Podlove Publisher developer here 👋

Yes, we've been implementing RFC5005 for a while now. Adoption from clients is sporadic. A new/different standard wouldn't help though, because I'd say RFC5005 does all that's required. We need to be louder about the existence of the standard and ask for its implementation from all sides.

Here's an example feed: https://feeds.metaebene.me/freakshow/mp3

This is the relevant excerpt from feed page 2:

<atom:link rel="next" href="https://freakshow.fm/feed/mp3?paged=3" />
<atom:link rel="prev" href="https://freakshow.fm/feed/mp3" />
<atom:link rel="first" href="https://freakshow.fm/feed/mp3" />
<atom:link rel="last" href="https://freakshow.fm/feed/mp3?paged=9" />

theDanielJLewis commented 3 years ago

But do we want podcast archive feeds to be paginated, or a single big archive?

AbelLykens commented 3 years ago

Suppose pagination in the archive is something we would end up at anyway (if archives become large). I like @eteubert 's suggestion.

Inrumpo commented 3 years ago

But do we want podcast archive feeds to be paginated, or a single big archive?

I'm not sure if there is a need for a dedicated archive feed if your regular feed is paginated and could contain all your episodes.

theDanielJLewis commented 3 years ago

Oh my bad. I was thinking the idea was to link to a separate archive, and then that archive would be paginated. But this is really about paginating the main RSS feed itself.

I like that, but that creates issues with apps that don't support that pagination but yet you want all your episodes to be searchable in the catalog. For example, Apple Podcasts will display and search up to 300 episodes in the public catalog.

Inrumpo commented 3 years ago

I like that, but that creates issues with apps that don't support that pagination but yet you want all your episodes to be searchable in the catalog. For example, Apple Podcasts will display and search up to 300 episodes in the public catalog.

I wouldn't like to see us getting stopped by "XYZ doesn't support this (yet)" but I absolutely get your point. In this case you'd have to set your page limit to 300 if you want Apple Podcasts to show the maximum amount of episodes. (What was your total feed episode limit before will now be your page limit.) For most podcasters this would result in only having one page though and kind of defeat the purpose – but it's possible.

theDanielJLewis commented 3 years ago

I wouldn't like to see us getting stopped by "XYZ doesn't support this (yet)" but I absolutely get your point.

Actually, I think the better way I should have brought this up is emphasizing the need for backwards compatibility.

I think most (if not all) our other changes add things to the feeds and thus don't break current features. But implementing this in the most optimized way would break that backwards compatibility with Apple Podcasts or any other apps that search based on what's in the feed.

So it would work great for those who already have smaller feed-item limits, because they would be able to make their entire catalogs available through paginated feeds. But it won't work for those who already have high item limits and were hoping to optimize without removing items.

Inrumpo commented 3 years ago

I get that. To not break anything, the (default) page limit should be set to whatever the biggest searchable library allows. Let's say 300. The way I understand it, there is no mandatory number of items per page and no limit. Paged feeds leave it to the user to choose how "long" their pages should be.
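To make the page-limit idea concrete, here is a minimal publisher-side sketch in Python. It assumes a default page size of 300 (the figure mentioned above for Apple Podcasts) and borrows the `?paged=N` URL pattern from the Freak Show excerpt. The function names and the `example.org` URL are hypothetical, and nothing here is taken from Podlove Publisher's actual code.

```python
from math import ceil
from xml.sax.saxutils import quoteattr


def page_links(base_url: str, page: int, total_items: int, per_page: int = 300) -> dict:
    """Return the paging link relations for one feed page.

    Page 1 lives at base_url; later pages use ?paged=N (an assumed URL scheme).
    """
    last = max(1, ceil(total_items / per_page))
    if not 1 <= page <= last:
        raise ValueError(f"page {page} is outside 1..{last}")

    def url(n: int) -> str:
        return base_url if n == 1 else f"{base_url}?paged={n}"

    links = {"first": url(1), "last": url(last)}
    if page > 1:
        links["previous"] = url(page - 1)
    if page < last:
        links["next"] = url(page + 1)
    return links


def page_slice(episodes: list, page: int, per_page: int = 300) -> list:
    """Return the episodes that belong on the given 1-based page."""
    start = (page - 1) * per_page
    return episodes[start:start + per_page]


def render_links(links: dict) -> str:
    """Render the relations as atom:link elements, one per line."""
    return "\n".join(
        f'<atom:link rel="{rel}" href={quoteattr(href)} />'
        for rel, href in links.items()
    )


if __name__ == "__main__":
    # 700 episodes at 300 per page gives three pages; this prints page 2's links.
    print(render_links(page_links("https://example.org/feed", 2, 700)))
```

With 300 or fewer episodes the sketch yields a single page whose `first` and `last` both point at the main feed URL, which matches the observation above that most podcasters would end up with only one page.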

daveajones commented 3 years ago

Podlove Publisher developer here 👋

Yes, we've been implementing RFC5005 for a while now. Adoption from clients is sporadic. A new/different standard wouldn't help though, because I'd say RFC5005 does all that's required. We need to be louder about the existence of the standard and ask for its implementation from all sides.

Here's an example feed: https://feeds.metaebene.me/freakshow/mp3

This is the relevant excerpt from feed page 2:

<atom:link rel="next" href="https://freakshow.fm/feed/mp3?paged=3" />
<atom:link rel="prev" href="https://freakshow.fm/feed/mp3" />
<atom:link rel="first" href="https://freakshow.fm/feed/mp3" />
<atom:link rel="last" href="https://freakshow.fm/feed/mp3?paged=9" />

Thanks for jumping in with this information @eteubert . I agree. Don't mess up an existing standard. Is it appropriate in any way for us to bake RFC5005 info into this documentation? Perhaps just a section on proper feed archiving that lays things out in a more natural way than an RFC, but basically just says "do it the 5005 way."

I'm just thinking that this namespace is now being widely adopted. So, that gives the chance to emphasize proper use of other standards at the same time, even if they are not part of the namespace proper.

eteubert commented 3 years ago

Actually, I think the better way I should have brought this up is emphasizing the need for backwards compatibility.

It is backwards compatible: clients that don't read the RFC5005 tags can ignore them. As @Inrumpo says, in the transition period, podcasters must include a higher number of episodes on the first podcast page -- just as they do now. What we should think about though is a recommended number of episodes on the first feed page for the hypothetical state of good-enough adoption of the standard. My gut says 10 but I can't really put solid arguments behind it. Maybe even 5 would be enough eventually.

Furthermore, creating a new standard wouldn't be more or less backwards compatible. It's only really useful once adoption is high, in either case. Until then, the main podcast feed (/ first page) needs to contain all episodes.

Is it appropriate in any way for us to bake RFC5005 info into this documentation? Perhaps just a section on proper feed archiving that lays things out in a more natural way than an RFC, but basically just says "do it the 5005 way."

I implement specifically section 3: "Paged Feeds" of the spec. Nothing else. So only a subset, the rest can (should?) be ignored for our purposes.

Section 2 is for marking feeds as complete, which we don't need because there's already a tag for that. And section 4 is for having multiple archives, which isn't useful for us either.

I suggest specifically referencing section 3 of RFC5005 and providing a succinct example. And/or copy the section verbatim as well, as it's literally just four bullet points and two sentences:

The feed documents in a paged feed are tied together with the following link relations:

o "first" - A URI that refers to the furthest preceding document in a series of documents.

o "last" - A URI that refers to the furthest following document in a series of documents.

o "previous" - A URI that refers to the immediately preceding document in a series of documents.

o "next" - A URI that refers to the immediately following document in a series of documents.

Paged feed documents MUST have at least one of these link relations present, and should contain as many as practical and applicable.
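On the client side, following these relations could look roughly like the Python sketch below. It is only an illustration under a few assumptions: the paging links are direct children of the RSS `<channel>` (or of the Atom root), the caller supplies a `fetch` function that returns the XML text for a URL, and `max_pages` is an arbitrary safety cap. The function names are hypothetical. Because the Freak Show excerpt uses `rel="prev"` while the RFC text names `"previous"`, the sketch accepts both spellings.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator, Tuple

ATOM_NS = "http://www.w3.org/2005/Atom"
PAGING_RELS = {"first", "last", "previous", "prev", "next"}


def paging_links(feed_xml: str) -> Dict[str, str]:
    """Return the paging relations found at feed level, keyed by relation name."""
    root = ET.fromstring(feed_xml)
    # RSS keeps feed-level elements inside <channel>; Atom keeps them on the root.
    channel = root.find("channel")
    container = channel if channel is not None else root
    links = {}
    for link in container.findall(f"{{{ATOM_NS}}}link"):
        rel, href = link.get("rel"), link.get("href")
        if rel in PAGING_RELS and href:
            # Treat "prev" as a synonym for "previous".
            links["previous" if rel == "prev" else rel] = href
    return links


def walk_pages(
    start_url: str,
    fetch: Callable[[str], str],
    max_pages: int = 100,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, xml) for each page, following rel="next" until it runs out.

    Stops on a repeated URL or after max_pages, so a misconfigured feed
    cannot send the client into an endless loop.
    """
    seen = set()
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        xml = fetch(url)
        yield url, xml
        url = paging_links(xml).get("next")
```

A client that does not understand paging simply never calls anything like `walk_pages` and reads the first page as an ordinary feed, which is the backwards-compatibility point made earlier in this comment.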

Inrumpo commented 3 years ago

I did some more research on the adoption of the already existing paged feeds standard. Podcast Addict told me that

the app has been supporting the standard Atom/RSS paged feeds for the past 8 years.

Podcast Addict does claim that

Podcast Addict is the #1 Podcast App on Android with over 10M downloads, 500K reviews, 2 Billion episodes downloaded and an average rating of 4.7/5.

This might allow for some widespread use from day one.

daveajones commented 3 years ago

Thanks for this. If we aren't going to do anything new, but instead just encourage adoption, how should that happen? Do we want to look at writing a recommendation document that shows the issues around why each tag exists and the problem each solves - then bring in other issues like this one where we can give non-namespace guidance?

jamescridland commented 3 years ago

I've been trying to get a "best practices" group together for some time.

For RSS, it could publish a proper, human-readable specification for best-practice RSS generation, including paged feeds, correct use of the <language> and <ttl> elements, and helpful guides such as "here's how best to use <description> and <content:encoded> together".

And, yes, the new tags, too.

It would be intended for this to be readable for both podcast host developers (the people who make the RSS), and for podcast RSS clients (the app developers).

It strikes me that this might be a Github-hosted website and all the documents be written and collaborated upon online as Markdown docs. I rather enjoy writing or editing technical stuff to keep it clear for regular human beings.

Personally, this sounds like its own project. PodcastIndex has its own political aims of non-censorship for any reason (which some don't fully agree with); OPAWG is seen by some as being against IAB guidelines. Perhaps it helps having a more neutral group as a different approach?

Inrumpo commented 3 years ago

Thanks for this. If we aren't going to do anything new, but instead just encourage adoption, how should that happen? Do we want to look at writing a recommendation document that shows the issues around why each tag exists and the problem each solves - then bring in other issues like this one where we can give non-namespace guidance?

I like your idea. I would point out that we consider paged feeds part of our – generally speaking – specification (~ "tag level importance") and not just another incidental consideration, even though paged feeds are not our invention. An exemplary structure to illustrate my thought:

Documentation

I like @jamescridland's approach to make things readable for people making the RSS and client developers.

Paged feeds are not something the individual podcaster needs to care about other than maybe setting an episode limit in his/her RSS creation tool (e.g. podcast host UI, WordPress plugin UI).


Personally, this sounds like its own project.

I agree – a good one. A modern, clean "best practices example RSS feed" (maybe with some explanatory side notes) can be a helpful tool. I just don't know where to place it. I can imagine this in a somewhat "authoritative"/prominent place rather than on someone's personal blog. Why shouldn't the Podcastindex provide an example of a full-featured RSS feed?

daveajones commented 3 years ago

One thing I've been considering as part of the PodcastIndex work is validating feeds and surfacing that in an automated way. Still thinking that out. I'd gladly use a best practices document such as this as the guiding document for that service.

PofMagicfingers commented 3 years ago

From a directory's point of view, what would be a simple way to detect whether an episode on distant pages has been modified or deleted? Should we recommend the use of the If-Modified-Since header?
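For illustration, a conditional request for one feed page could be built along the lines of the Python sketch below. It is an assumption about how a directory might do this, not something the thread settled on. The function names are hypothetical, and the optional ETag / If-None-Match handling is an addition that was not part of the question.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Optional
import urllib.error
import urllib.request


def conditional_request(
    url: str,
    last_fetched: Optional[datetime] = None,
    etag: Optional[str] = None,
) -> urllib.request.Request:
    """Build a GET request that asks the server to skip unchanged pages."""
    req = urllib.request.Request(url)
    if last_fetched is not None:
        # HTTP dates are expressed in GMT; naive datetimes are taken as local time.
        stamp = format_datetime(last_fetched.astimezone(timezone.utc), usegmt=True)
        req.add_header("If-Modified-Since", stamp)
    if etag:
        req.add_header("If-None-Match", etag)
    return req


def fetch_if_changed(req: urllib.request.Request) -> Optional[bytes]:
    """Return the page body, or None when the server answers 304 Not Modified."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None
        raise
```

A 304 response only says that the page as a whole is unchanged. Working out which episode was modified or deleted would still require comparing the items (for example by guid) against what the directory stored on the previous crawl, and the approach depends on the host sending meaningful Last-Modified or ETag values for every page.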

jamescridland commented 3 years ago

@Inrumpo The Podcast Academy seems to be waking up to the benefit of being a best-practices organisation; so perhaps they might be long-term interested in it.

I own podinfra.net and would be very happy to make that a best-practices GitHub website detailing how to write stuff for podcast infrastructure. Absolutely it can be associated with Podcast Index, but I think it needs to look further than just the podcast namespace.