Podcastindex-org / podcast-namespace

A wholistic rss namespace for podcasting
Creative Commons Zero v1.0 Universal

Proposal - <podcast:txt> tag for freeform content #395

Closed by daveajones 1 year ago

daveajones commented 2 years ago

After discussions with @tedhosmann about owner verification within feeds, the need for a tag that can hold free form content was brought up. The initial use case for this tag would be so hosting platforms can provide a way for their feed owners to input a random string into the feed that other platforms will then use to confirm they are the feed owner. The concept is based on the TXT record in DNS. To make it truly flexible, it can exist in the channel or in items and there can be more than one.

Txt

<podcast:txt>

Holds free form text up to 4000 characters in length. Any valid XML character can be used here.

Parent

  <channel> or <item>

Count

  Multiple

Node value

  This is a free form string from the podcast creator. Please do not exceed 4000 characters for the node value or it may be truncated by aggregators.

Attributes

  purpose (optional): A free form string that denotes what purpose this tag serves.

Examples

<podcast:txt>naj3eEZaWVVY9a38uhX8FekACyhtqP4JN</podcast:txt>
<podcast:txt purpose="Apple Podcasts">S6lpp-7ZCn8-dZfGc-OoyaG</podcast:txt>
<podcast:txt purpose="Drop Date">2022-10-26T04:45:30.742Z</podcast:txt>
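
As an illustration of how a consumer might read these tags, here is a minimal Python sketch using the standard library. It assumes the podcast namespace URI is `https://podcastindex.org/namespace/1.0`; the sample feed and the helper name `read_txt_tags` are made up for this example and are not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the podcast: prefix.
PODCAST_NS = "https://podcastindex.org/namespace/1.0"
MAX_LEN = 4000  # the proposal says longer values may be truncated

FEED = """<rss version="2.0" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <title>Example Show</title>
    <podcast:txt>naj3eEZaWVVY9a38uhX8FekACyhtqP4JN</podcast:txt>
    <podcast:txt purpose="Apple Podcasts">S6lpp-7ZCn8-dZfGc-OoyaG</podcast:txt>
    <item>
      <title>Episode 1</title>
      <podcast:txt purpose="Drop Date">2022-10-26T04:45:30.742Z</podcast:txt>
    </item>
  </channel>
</rss>"""


def read_txt_tags(parent):
    """Return (purpose, value) pairs for podcast:txt tags directly under parent."""
    tags = []
    for node in parent.findall(f"{{{PODCAST_NS}}}txt"):
        value = (node.text or "").strip()[:MAX_LEN]  # truncate, as an aggregator might
        tags.append((node.get("purpose"), value))    # purpose is None when absent
    return tags


root = ET.fromstring(FEED)
channel = root.find("channel")
channel_tags = read_txt_tags(channel)
item_tags = [read_txt_tags(item) for item in channel.findall("item")]
```

Because `findall` only looks at direct children, channel-level and item-level tags stay separate, which matches the "channel or item" parent rule above.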

daveajones commented 2 years ago

This relates to #394 and #356.

daveajones commented 2 years ago

Also #342

tedhosmann commented 2 years ago

In #356 there was a really good suggestion that the "service" attribute might be used so you could differentiate when you have multiple tags who should be looking for the text.

theDanielJLewis commented 2 years ago

A text tag in order to verify ownership … isn't that what the verify tag is supposed to be for?

tedhosmann commented 2 years ago

A tag specific to verify is good for verification but an open text tag can be used for verification and anything else you want to use it for in the future. Just like DNS.

daveajones commented 2 years ago

Yeah, the world never envisioned use cases like SPF records. But having a generic TXT record allowed those things to come to pass. Very valuable imo.

daveajones commented 2 years ago

In #356 there was a really good suggestion that the "service" attribute might be used so you could differentiate when you have multiple tags who should be looking for the text.

10-4. Makes sense.

daveajones commented 2 years ago

Added service attribute. Also removed the "alphanumeric" requirement so the node can hold any valid XML string.

mijustin commented 2 years ago

I dig this variation. I'm for it! 🙌

Let's ship this so we can start using it!

DaveHamilton commented 2 years ago

Brilliant. As soon as I saw the initial description I was immediately reminded of TXT in DNS, which you obviously then referenced...and that has proved immensely valuable due to its flexibility. I know feature-creep is always something to be wary of when proposing additions, but I think this one will actually mitigate quite a bit of that because of its fundamentally flexible nature.

theDanielJLewis commented 2 years ago

Since we're dealing with a podcast feed here and not domains, I suggest we consider a name other than "text." Maybe "metadata" or "attribute" or something like that.

The reason I think "text" might not be good is that it could conflict and cause confusion with other text fields we'll want in the future for actual text.

jamescridland commented 2 years ago

I agree with @theDanielJLewis

daveajones commented 2 years ago

What about <podcast:meta> ?

theDanielJLewis commented 2 years ago

I like “meta.”

daveajones commented 2 years ago

We have a winner then! 😊

tedhosmann commented 2 years ago

Maybe explore other options and hear more feedback? Not gonna say I have good alts but I have a few to propose for feedback. <podcast:open> <podcast:free> <podcast:freedom> <podcast:mod> <podcast:challenge> <podcast:claim> <podcast:abstract>

tedhosmann commented 2 years ago

It’s hard to quickly dismiss the inspiration, and the immediate reaction others have had to the “like DNS” similarity that text implies. Maybe even try on txt to make it more DNS-like.

jamescridland commented 2 years ago

@tedhosmann I know, but I do always try to think about this from the user point of view. I see @theDanielJLewis ’s point.

I’d be good with podcast:txt, which I think is just about far enough removed from “text” (i.e. “description”). If that adequately communicates that it’s just like the DNS TXT field, then that’s good. (Though the DNS TXT field is normally a permanent thing: you add a code in here and it remains there forever. This doesn’t appear to be.)

I’m also good with podcast:meta if that’s also agreed on.

daveajones commented 2 years ago

Remember your scriptures - Rules For Standards Makers book of Winer, chapter 3 states:

“I've witnessed long debates over which name is better than another.

It totally doesn't matter what we call it. We can learn to use anything. There are more important things to spend time on.

Think of people whose first language isn't English. To them the names we choose are symbols, they don't connote anything.”

The reason I never get hung up on names is because no user will ever see this (on purpose). XML is for developers - and we already know what these things mean. And, as developers, we aren’t supposed to assume we know based on names. We are supposed to go read documentation. PHP is used on more than half of the internet and its naming of things is awful. Ditto HTML.

That being said, my preference is ‘txt’ or ‘text’ because it most closely resembles the DNS inspiration. But, I’d like to just let @tedhosmann name it since he gave birth to it. I’m cool with whatever is decided.

brianoflondon commented 2 years ago

Can we do anything other than podcast:meta even podcast:metadata? Not only does "meta" mean "She's Dead" in Hebrew, it is also the current name of a very big Web 2.0 company that I'm helping to sue right now.

gagglepod commented 2 years ago

Going back to the top of this thread, the goal is a "free form" text field that can hold free-form content. What about using the tag... wait for it... wait for it... podcast:freeform or podcast:freetext?

tedhosmann commented 2 years ago

I appreciate the opportunity to hear feedback and reactions to the proposals. I'd like to stick with simplicity here and recommend we use <podcast:txt> as this free input tag. As a reminder for documentation and hosting providers that implement the tag - this is an open and free text field to be used by you...your customers should not be seeing this as the Podcast Text field. Give them direction on what you want to use it for and make it clear in the interface that you are performing a task that might end up using this tag as the implementation.

Do you need a way to verify ownership with a challenge/response? Use the podcast:txt tag but your UI should say something about verification code. Do you need to pass along an email address to another system? Use the podcast:txt tag but your UI should say something about linking your email with the other service.

Thanks again to @daveajones for posting this and sharing a vision of a simple and open tag that can solve so many simple short-term issues.

mijustin commented 2 years ago

I'd like to stick with simplicity here and recommend we use <podcast:txt> as this free input tag

I agree. Let's keep this simple. This is primarily for hosting providers (customers really won't see it).

Transistor will support this as soon as it's available! 👍

jamescridland commented 2 years ago

Very happy with podcast:txt - it's less confusing than text and has all the goodness! Thanks, Ted.

daveajones commented 2 years ago

Changed the tag to <podcast:txt>. Also changed the attribute name from service to purpose and expanded the description of it. How does that feel, compared to service? I'm trying to clearly express that this tag is intended to be completely generic and multi-purpose - so not tied to any pre-conceived bias as to what "should" be in it.

DaveHamilton commented 2 years ago

@daveajones I think purpose makes a lot of sense, and de-compartmentalizes it nicely. This reads quite well to me!

mijustin commented 2 years ago

@daveajones that's perfect 👍

theDanielJLewis commented 2 years ago

Can we do anything other than podcast:meta even podcast:metadata? Not only does "meta" mean "She's Dead" in Hebrew, it is also the current name of a very big Web 2.0 company that I'm helping to sue right now.

"Meta" is an extremely popular term that already applies for things like this. For example from my own site:

<meta property="og:description" content="Giving you the guts and teaching you the tools to podcast! Award-winning podcast about podcasting and podcasting resources created by Daniel J. Lewis.">
  <meta property="og:site_name" content="The Audacity to Podcast">
  <meta property="article:published_time" content="2019-05-25T15:16:43-04:00">
  <meta property="article:modified_time" content="2020-10-19T14:37:40-04:00">
  <meta property="og:updated_time" content="2020-10-19T14:37:40-04:00">
  <meta name="twitter:title" content="The Audacity to Podcast">
  <meta name="twitter:description" content="Giving you the guts and teaching you the tools to podcast! Award-winning podcast about podcasting and podcasting resources created by Daniel J. Lewis.">
  <meta name="twitter:image" content="https://theaudacitytopodcast.com/wp-content/uploads/2019/06/tap-social-wide.png">
  <meta name="twitter:site" content="@theDanielJLewis">
  <meta name="twitter:creator" content="@theDanielJLewis">
  <meta name="twitter:card" content="summary_large_image">

This convention is already prominent for this exact purpose on coded content. I think "txt" will lead to confusion and possible errors.

Tool: "Put this in your podcast TXT tag." Podcaster: "I put it in my podcast:text tag, but it's still not working."

RSS feeds are XML code, not DNS records.

tedhosmann commented 2 years ago

Tool: "Put this in your podcast TXT tag." Podcaster: "I put it in my podcast:text tag, but it's still not working."

If you ever run into something like this that is user facing then you've already done it wrong. This tag is for the collective industry to agree on how they want to add any text to a field that can be consumed by another machine, not a human.

I already have a process in place to give a podcaster a challenge token to put in the RSS feed. Our process is to ask the podcaster to put it anywhere...the copyright field, the next episode description, the show title (it doesn't matter) and we will look for it. After this tag is in place I can tell a podcaster on Transistor to add this verification tag in their Transistor account and @mijustin will have implemented this in a way that there is a field for show verification. The podcaster never has to know what the field was called or why.
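
A rough sketch of the challenge/response flow described above, from the verifying platform's side. Everything here is illustrative: the function names are hypothetical, the namespace URI is assumed to be `https://podcastindex.org/namespace/1.0`, and `purpose="verify"` is only one possible value, not something the proposal mandates.

```python
import hmac
import secrets
import xml.etree.ElementTree as ET

# Assumed namespace URI for the podcast: prefix.
PODCAST_NS = "https://podcastindex.org/namespace/1.0"


def issue_challenge(nbytes=24):
    """Create a random, URL-safe token to hand to the podcaster."""
    return secrets.token_urlsafe(nbytes)


def feed_has_token(feed_xml, token):
    """Return True if any podcast:txt tag (channel or item) holds the token."""
    root = ET.fromstring(feed_xml)
    for node in root.iter(f"{{{PODCAST_NS}}}txt"):
        candidate = (node.text or "").strip()
        # Constant-time comparison, so timing does not leak the token.
        if hmac.compare_digest(candidate.encode(), token.encode()):
            return True
    return False


token = issue_challenge()
# Hypothetical feed as the host would publish it after the podcaster
# pasted the token into a "verification code" field.
feed = (
    f'<rss version="2.0" xmlns:podcast="{PODCAST_NS}"><channel>'
    f"<title>Demo</title>"
    f'<podcast:txt purpose="verify">{token}</podcast:txt>'
    f"</channel></rss>"
)
verified = feed_has_token(feed, token)
rejected = feed_has_token(feed, "some-other-token")
```

The check looks only inside podcast:txt tags, so the platform no longer needs to scan the copyright field, descriptions, or the show title.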

mijustin commented 2 years ago

This tag is for the collective industry to agree on how they want to add any text to a field that can be consumed by another machine, not a human.

Yup! Exactly.

The use case is Apple Podcasts telling a podcaster they need to put a verification code somewhere.

This gives us a way of mapping a user-friendly interface to a tag on the backend.

[Image: screenshot of a form field labeled "Verification code"]

It's largely for podcast hosting companies to implement.

daveajones commented 2 years ago

The name is fine. It will never be seen by users, and developers read documentation.

If there are no show-stopper comments after a bit on this, I'd like to push it forward and formalize it by the end of October. We can let it bake until 10/31 and make a decision.

tedhosmann commented 1 year ago

In the case of adding a verification code for a specific service, should we agree on maybe a common 'purpose' attribute to limit the complexity in hosting provider UI? @mijustin has shared a form element just labeled verification code but it doesn't have any specific platform identified.

Do the hosting providers just drop in verification codes without a purpose per the example in the proposal? Or do we try to define something like purpose="verify" for this usage?

In other words, should there be a reserved list that can be created for purposes?

jamescridland commented 1 year ago

When used in DNS, this is a freeform piece of text, and I wonder about the usefulness of the purpose attribute. I like keeping things simple and wonder if it's required here - the use case for a provider appears to be as simple as "look for this bit of random text in the TXT element", so I don't feel that it's required at all, personally.

Perhaps it might be gently suggested that TXT entries are identifiable (so don't just ask for "D3F5D2" but ask for "PODCHASER:D3F5D2") - which would mirror what happens in TXT records in DNS?

daveajones commented 1 year ago

My thinking with the purpose attribute was to straight away avoid what I consider the only real annoyance with TXT DNS records - namely that the lack of an identifier for the purpose leads to complex parsing of the value string. I’m, of course, thinking of this from the standpoint of how it’s going to be parsed for ingestion (by us and others). Adding the purpose attribute as an optional parameter, to me, has these benefits:

With those concepts in mind, @tedhosmann’s idea is what I was hoping would happen. Codifying the verify purpose as a known use case in the documentation here wouldn’t preclude anyone from doing anything differently, but would just be a recognition from the industry that this is how we’ve all chosen to handle this particular issue.

I’ve always wished that the TXT record had an attribute on it so when parsing DNS replies it could be easily filtered for things like “spf” instead of a loose, grep style thing.
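
To make the two approaches in this exchange concrete, here is a small Python sketch that filters already-parsed tags either by the purpose attribute or by a DNS-style prefix inside the value. The `(purpose, value)` tuple shape, the helper names, and the case-insensitive match are assumptions for illustration only.

```python
def by_purpose(tags, purpose):
    """Select values whose purpose attribute matches (case-insensitive, an assumption)."""
    wanted = purpose.casefold()
    return [value for p, value in tags if p is not None and p.casefold() == wanted]


def by_prefix(tags, prefix):
    """Select values carrying a DNS-style 'PREFIX:' marker, with the marker stripped."""
    marker = prefix + ":"
    return [value[len(marker):] for _, value in tags if value.startswith(marker)]


# (purpose, value) pairs as a parser might have extracted them from a feed.
tags = [
    (None, "naj3eEZaWVVY9a38uhX8FekACyhtqP4JN"),
    ("Apple Podcasts", "S6lpp-7ZCn8-dZfGc-OoyaG"),
    (None, "PODCHASER:D3F5D2"),
]

apple_codes = by_purpose(tags, "apple podcasts")
podchaser_codes = by_prefix(tags, "PODCHASER")
```

With the attribute, the consumer filters on a separate field and never has to inspect the value; with the prefix, every consumer has to agree on a marker format and parse the value string.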

daveajones commented 1 year ago

And, just to reiterate, the purpose attribute is optional in the spec. So, no barrier there.

mijustin commented 1 year ago

@daveajones if it's optional I think it's fine.

@tedhosmann my guess is most hosting companies will:

just drop in verification codes without a purpose per the example in the proposal

daveajones commented 1 year ago

If no more comments are made on this tag before November 1st, I'd like to recommend this for formalization into the namespace. This tag is simplistic, has lots of prior art and has been discussed off-line for quite a while now. I don't think there are any lingering surprises here. There is also a timing issue to go ahead and get this on the books before end of year (2022) since early 2023 is when email addresses will most certainly start disappearing from feeds.

daveajones commented 1 year ago

Buzzsprout already removed emails from their feeds today, so I am going to fast track this. It's written up now. Some eyeballs on this would be good, to make sure there are no typos or mistakes, and to make sure it reads well: https://github.com/Podcastindex-org/podcast-namespace/blob/main/docs/1.0.md#txt

tedhosmann commented 1 year ago

Looks solid. Thanks so much for hearing me out on this one at PM Dallas - and now it’s already formalized. Cheers.

theDanielJLewis commented 1 year ago

I'm trying to foresee more possible uses of this tag. Should we maybe adopt something either more specific than purpose, like name; or something more generic, like label or field?

Should we give guidance on how long the txt tag should remain in the feed? Indefinitely, like most HTML tags and DNS records; or temporarily, like Buzzsprout now does with emails?

daveajones commented 1 year ago

I don't want to change naming at this point. What's there works fine.

For expiration time, are you speaking specifically about the "verify" purpose? Otherwise, it would be impossible to make a blanket recommendation on that given the open nature of the tag. For domain verification using DNS TXT, I think most people never remove them. I see that debris hang around a lot.

jamescridland commented 1 year ago

This looks great and the name is super.

For domain verification using DNS TXT - not only do most people not remove them, but removing them can stop the domain being verified. This allows, for example, someone to sell a domain to someone else, and not remain verified in things like Google Analytics, etc. But that's probably up to the consumer of the TXT field, and does not need to be part of the specification as far as I can see.

jamescridland commented 1 year ago

PS: @mijustin if your screenshot above is your planned UX - worthwhile considering that podcast:txt can be a multiple value. It's possible for a number of verification codes to be in the RSS feed (say I want to verify with DirectoryA, DirectoryB and ServiceC). It's also possible for those to be required in the feed in perpetuity to allow the podcast to remain verified. Not sure your UX reflects that: it should probably be "verification codes", and allow users to add multiple codes...
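
On the hosting side, supporting several codes could look something like the sketch below: a list of entries rendered into one podcast:txt tag each. The helper name, the sample codes, and the namespace URI `https://podcastindex.org/namespace/1.0` are assumptions, not anything a host has confirmed.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; registering it makes the serializer emit "podcast:".
PODCAST_NS = "https://podcastindex.org/namespace/1.0"
ET.register_namespace("podcast", PODCAST_NS)


def add_txt_tags(channel, entries):
    """Append one podcast:txt element per (purpose, value) entry."""
    for purpose, value in entries:
        node = ET.SubElement(channel, f"{{{PODCAST_NS}}}txt")
        if purpose:  # purpose is optional, so omit the attribute when empty
            node.set("purpose", purpose)
        node.text = value


channel = ET.Element("channel")
add_txt_tags(
    channel,
    [
        ("DirectoryA", "D3F5D2"),  # made-up codes for illustration
        ("DirectoryB", "9A41C7"),
        (None, "064012"),
    ],
)
xml_out = ET.tostring(channel, encoding="unicode")
```

Storing the codes as a list rather than a single field is what lets the UI say "verification codes" and keep older codes in place while new ones are added.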

theDanielJLewis commented 1 year ago

DNS verification is often reconfirmed and thus the need to maintain the DNS information indefinitely. But something like a podcast-verification code is short-term.

Maybe we can encourage developers to offer a timeout option when a podcaster populates the field, with a default of something like 48 hours before removal. Otherwise, we'll end up with RSS feeds having 20 TXT tags and none of them needed anymore.
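
The timeout idea could be sketched roughly as follows: each stored entry carries a time-to-live, and the host drops expired entries when it renders the feed. The `TxtEntry` class, its field names, and the `ttl=None` convention for "keep indefinitely" are hypothetical; only the 48-hour default comes from the suggestion above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_TTL = timedelta(hours=48)  # default suggested in the comment above


@dataclass
class TxtEntry:
    value: str
    purpose: Optional[str]
    created_at: datetime
    ttl: Optional[timedelta] = DEFAULT_TTL  # None means keep indefinitely

    def is_live(self, now):
        """An entry is live if it has no TTL or the TTL has not elapsed yet."""
        return self.ttl is None or now < self.created_at + self.ttl


def live_entries(entries, now):
    """Entries that should still be written into the feed at time 'now'."""
    return [e for e in entries if e.is_live(now)]


now = datetime(2022, 11, 3, 12, 0, tzinfo=timezone.utc)
entries = [
    TxtEntry("064012", "verify", now - timedelta(hours=72)),         # expired
    TxtEntry("D3F5D2", "verify", now - timedelta(hours=1)),          # still fresh
    TxtEntry("keep-me", None, now - timedelta(days=400), ttl=None),  # permanent
]
kept = [e.value for e in live_entries(entries, now)]
```

Making the TTL optional covers both positions in this thread: short-lived verification codes expire on their own, while codes a directory re-checks can stay in the feed.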

mijustin commented 1 year ago

100% on this:

I don't want to change naming at this point. What's there works fine.

Let's get this shipped, and worry about future iterations later.

Again, hosting providers are already having to use other fields (Copyright, Keywords) for these, so the sooner we can have a dedicated field, the better.

For domain verification using DNS TXT, I think most people never remove them. I see that debris hang around a lot.

Even this might not be a big deal. Verification doesn't happen super often. If they need to verify something again, they can replace this value in the field.

In these cases, our customer support is usually guiding the podcaster anyway, so there will likely be some hand-holding no matter what.

@daveajones we'll be adding this to Transistor feeds shortly!

daveajones commented 1 year ago

@daveajones we'll be adding this to Transistor feeds shortly!

🎉🚀

tomrossi7 commented 1 year ago

This is how it's done!

theDanielJLewis commented 1 year ago

It'd be great to have some examples of why this might be used in an item instead of only in channel.

mikeneumann commented 1 year ago

I see this tag is a pressure release valve for standardization.

It could be useful as a place to put item-specific verification data as we do for alternateEnclosure's integrity field; across the value block for example. i.e. a hash may be calculated across certain areas of an item, and a service may exist to verify that hash using keys and signatures that may not exist in the feed.
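
A very loose sketch of that item-level idea: hash a few fields of an item and store the digest in an item-level podcast:txt tag. The purpose value `item-sha256`, the choice of fields, and the helper names are all invented here; the keys and signatures mentioned above are out of scope, so this shows only the hashing step.

```python
import hashlib
import xml.etree.ElementTree as ET

# Assumed namespace URI for the podcast: prefix.
PODCAST_NS = "https://podcastindex.org/namespace/1.0"
ET.register_namespace("podcast", PODCAST_NS)

HASH_PURPOSE = "item-sha256"  # hypothetical purpose value, not defined anywhere


def item_digest(item, fields=("guid", "title")):
    """SHA-256 over the chosen child elements, with NUL separators between parts."""
    h = hashlib.sha256()
    for name in fields:
        node = item.find(name)
        text = (node.text or "") if node is not None else ""
        h.update(name.encode("utf-8") + b"\x00" + text.encode("utf-8") + b"\x00")
    return h.hexdigest()


def stamp_item(item):
    """Write the current digest into an item-level podcast:txt tag."""
    node = ET.SubElement(item, f"{{{PODCAST_NS}}}txt", {"purpose": HASH_PURPOSE})
    node.text = item_digest(item)
    return node.text


def item_matches(item):
    """Check whether a stamped digest still matches the item's current fields."""
    stamped = [
        n.text
        for n in item.findall(f"{{{PODCAST_NS}}}txt")
        if n.get("purpose") == HASH_PURPOSE
    ]
    return item_digest(item) in stamped


item = ET.fromstring("<item><guid>ep-1</guid><title>Episode 1</title></item>")
digest = stamp_item(item)
ok_before = item_matches(item)
item.find("title").text = "Episode 1 (edited)"  # tamper with a hashed field
ok_after = item_matches(item)
```

A plain hash like this only detects accidental changes; anyone who can edit the feed can also recompute it, which is why a real scheme would need the external keys and signatures mentioned above.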

mijustin commented 1 year ago

This is now live on Transistor! 🙌

https://transistor.fm/changelog/verification-code/

[Image: Apple Podcasts verification code field]

@tedhosmann you should adjust the wording on your support emails from:

To transfer ownership, please enter the six-digit authorization code “064012” in the RSS feed’s keywords or copyright fields.

to

To transfer ownership, please enter the six-digit authorization code “064012” in the RSS feed’s verification, keywords or copyright fields.