PRX / feeder.prx.org

Dovetail podcast content management system
https://podcasts.dovetail.prx.org
GNU Affero General Public License v3.0

Problems with `<itunes:summary>`, `<content:encoded>`, `<description>` and Apple Podcasts preview pages #1030

Open kookster opened 2 months ago

kookster commented 2 months ago

Apple renders the podcast preview pages using a different set of rules for the description text/HTML than the native apps use.

The native apps now seem to look only at <description/>, and allow some HTML, including <p> tags for new lines. https://help.apple.com/itc/podcasts_connect/#/itcb54353390

The documentation lists only these tags, but Apple also seems to respect <br> tags:

> description is text containing one or more sentences describing your episode to potential listeners. You can specify up to 4000 bytes. You can use rich text formatting and some HTML (<p>, <ol>, <ul>, <li>, <a>) if wrapped in the <CDATA> tag.
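As a rough illustration of the limits quoted above, here is a minimal Python sketch that checks a description against the 4000-byte limit and the listed tags. The function name `check_description`, the UTF-8 encoding, and the simple regex-based tag detection are assumptions made for illustration; this is not feeder code and not an Apple-provided validator.

```python
import re

APPLE_MAX_BYTES = 4000                          # limit quoted from Apple's documentation
DOCUMENTED_TAGS = {"p", "ol", "ul", "li", "a"}  # tags listed in Apple's documentation
OBSERVED_TAGS = {"br"}                          # undocumented, but seems to be respected (per this issue)

# Assumption: a crude tag-name matcher is good enough for a sanity check.
TAG_NAME = re.compile(r"</?([a-zA-Z][a-zA-Z0-9:]*)")


def check_description(html):
    """Report byte size and any tags outside the documented/observed set."""
    size = len(html.encode("utf-8"))  # assumption: bytes are counted as UTF-8
    used = {name.lower() for name in TAG_NAME.findall(html)}
    return {
        "bytes": size,
        "too_long": size > APPLE_MAX_BYTES,
        "unsupported_tags": sorted(used - DOCUMENTED_TAGS - OBSERVED_TAGS),
    }
```

Note that the limit is in bytes, not characters, so descriptions with non-ASCII text reach it sooner than a character count would suggest.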

We also have fields in the episodes db table for episodes.content (<content:encoded>) and episodes.summary (<itunes:summary>), but I don't think the new feeder UI sets those. However, old values remain in there from when publish would set them, and they still come in on RSS import.

I also see that the UI has hidden fields that set episode content and summary to nil whenever an episode is saved. This seems to indicate we want to move away from those fields. Once content is set to nil, <content:encoded> will stop being included, but the RSS builder will still set an <itunes:summary> based on a sanitized description.

This is a further issue because the sanitization procedure removes <br> and <p> tags without converting them to spaces or newlines, so words that were previously separated only by the whitespace from those tags get concatenated.
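To make the concatenation problem concrete, here is a minimal Python sketch contrasting a naive tag stripper with one that turns <p> and <br> into whitespace first. Both functions and their regexes are hypothetical illustrations, not the sanitizer feeder actually uses.

```python
import re

# Assumption: only <p> and <br> need to become whitespace; other tags are just removed.
BREAK_TAGS = re.compile(r"</?(?:p|br)\b[^>]*>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")


def strip_tags_naive(html):
    """Drop every tag, which glues together words separated only by tags."""
    return ANY_TAG.sub("", html)


def strip_tags_keep_spacing(html):
    """Replace <p>/<br> with a space before stripping, then collapse whitespace."""
    text = BREAK_TAGS.sub(" ", html)
    text = ANY_TAG.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


html = "<p>First line.</p><p>Second<br>line</p>"
print(strip_tags_naive(html))         # First line.Secondline
print(strip_tags_keep_spacing(html))  # First line. Second line
```

Inline tags such as <a> are still removed without adding a space, so link text stays attached to the surrounding words.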

Further complicating matters, some episodes have a summary set that is the same as the subtitle, which then gets used instead of the sanitized description (e.g. https://podcasts.dovetail.prx.org/episodes/b62bf326-cc87-4a83-a4ed-a209c5d84e62/edit).
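The field selection described in this comment can be sketched roughly as follows. This Python sketch is based only on the behavior described in this issue, not on the actual RSS builder; the `Episode` class, `feed_fields`, and `summary_duplicates_subtitle` are hypothetical names, and the sanitizer is passed in as a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Episode:
    description: str
    subtitle: Optional[str] = None
    summary: Optional[str] = None   # would map to <itunes:summary>
    content: Optional[str] = None   # would map to <content:encoded>


def feed_fields(ep: Episode, sanitize: Callable[[str], str]) -> dict:
    """Pick feed values the way the issue describes the current behavior."""
    fields = {"description": ep.description}
    if ep.content:
        # <content:encoded> is only emitted while content is non-nil
        fields["content:encoded"] = ep.content
    # A stored summary wins; otherwise fall back to the sanitized description
    fields["itunes:summary"] = ep.summary or sanitize(ep.description)
    return fields


def summary_duplicates_subtitle(ep: Episode) -> bool:
    """Flag episodes whose stored summary just repeats the subtitle."""
    return bool(ep.summary) and ep.summary.strip() == (ep.subtitle or "").strip()
```

A check like `summary_duplicates_subtitle` could help find the affected episodes before deciding which of the options to take.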

So there are a few things we could do:

  1. Stop emitting <content:encoded> and <itunes:summary> entirely, and rely on <description> alone.

  2. Add back <content:encoded>, but always identical to the description; it will then take precedence over the summary if that value remains.

  3. If we keep the summary, add <content:encoded> as well, and fix the whitespace issue in the sanitizer.

kookster commented 2 months ago

I set the episode.content field on the episodes where the badly formed <itunes:summary> was displaying. So far, the Apple preview page has not used this <content:encoded> instead of the summary, even though it seemed to prefer it on other episodes, like this episode of Today, Explained: https://podcasts.apple.com/us/podcast/chasing-the-storm/id1346207297?i=1000657206346