Podcastindex-org / podcast-namespace

A wholistic rss namespace for podcasting
Creative Commons Zero v1.0 Universal

Proposal: <podcast:block> #179

Open PofMagicfingers opened 3 years ago

PofMagicfingers commented 3 years ago

<podcast:block [id="[platform_id]"]>[yes|no]</podcast:block>

This tag lets you specify, at channel level, whether a platform is allowed to index this podcast.

This aims to clear up the mess we're in with multiple clients using multiple tags: right now, if you use googleplay:block you block Google, but if you use itunes:block you block Apple and Google as well.

The platform id is the same as in the podcast:id tag.

If you do not specify a platform, the tag is treated as the default value, applying to every platform not listed explicitly.

ie: this podcast is only allowed on Spotify

    <podcast:block>yes</podcast:block>
    <podcast:block id="spotify">no</podcast:block>

If there is no default tag, it is inferred as not blocked. Here everyone is allowed :

    <podcast:block id="spotify">no</podcast:block>
    <podcast:block id="google_podcasts">no</podcast:block>

Here only Google Podcasts is blocked :

    <podcast:block id="google_podcasts">yes</podcast:block>
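
A minimal sketch of how a directory might evaluate these rules, assuming the tags have already been extracted from the channel as (id, value) pairs. The function name `is_blocked` and the pair representation are illustrative only and are not part of the proposal.

```python
from typing import Iterable, Optional, Tuple

# Each tag is modelled as (platform_id, value); platform_id is None for a
# tag without an id attribute. This representation is an assumption made
# for illustration only.
BlockTag = Tuple[Optional[str], str]


def is_blocked(tags: Iterable[BlockTag], platform_id: str) -> bool:
    """Return True if `platform_id` should not index the feed."""
    default = False  # no default tag means "not blocked"
    for tag_id, value in tags:
        blocked = value.strip().lower() == "yes"
        if tag_id == platform_id:
            return blocked  # a platform-specific tag always wins
        if tag_id is None:
            default = blocked  # remember the default for unlisted platforms
    return default


# The three examples from the proposal:
spotify_only = [(None, "yes"), ("spotify", "no")]
everyone_allowed = [("spotify", "no"), ("google_podcasts", "no")]
google_blocked = [("google_podcasts", "yes")]

print(is_blocked(spotify_only, "spotify"))          # False
print(is_blocked(spotify_only, "google_podcasts"))  # True
print(is_blocked(google_blocked, "spotify"))        # False
```

A platform that finds no tag carrying its own id simply falls through to the default, which matches the "inferred as not blocked" rule when no default tag exists.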

We can discuss if this should be available at item level. If I recall correctly, itunes:block is available at item level.

This could also serve the same purpose as <podcast:private> proposed on #167

This is a proposal transferred from our former project podCloud/podcast-ext. See #173 for details.

swschilke commented 3 years ago

Is that like a robots.txt for the RSS feed? Why would someone want to limit the reach of a podcast?

Okay, for Spotify or other paid services I could understand it.

PofMagicfingers commented 3 years ago

I don't know. I personally don't care which platforms index my own podcasts.

However, as a hosting provider, with podCloud, we had many podcasters asking us for a way to block some platforms in order to have more control.

We had this discussion in our original rss namespace project, when Majelan, a French equivalent of Luminary, indexed many podcasts without asking or warning them.

Majelan was a podcast player that had a free directory of independent podcasts crawled from iTunes and sold exclusive content in the same app. (They now sell personal development content and closed the original app because podcasting was not profitable enough.)

Many French independent podcasters were pretty angry; it even made the news.

It's a pretty common request from our users to get more control over where they "broadcast" their content.

(Although, in my own opinion, that should not matter and you should be glad to have your content available everywhere. Except if an app is stealing your content as free stuff to acquire users and sell them stuff, but if they do that, I doubt they'll honor a tag in the RSS feed.)

The purpose is also to replace itunes:block, googleplay:block etc with a generic one.

jamescridland commented 3 years ago

This is something I proposed here a while ago, so it's nice to see it back.

I can see the benefit of this for people like Majelan or Luminary. But it's for them - well, for Luminary these days, given Majelan exited the podcast space - to work out how they might wish to implement this. I think we can help them with the spec: but more important to understand whether this would be enough for them, or even something they'd like to do. We'll get nowhere if we just invent something for these folks without their input.

In particular, we need to understand who sets the "id" values. Is that up to the implementing directory owners? I suspect it might be. My suggestion would be reverse domain notation, such as "com.spotify" or "com.google.podcasts".

PofMagicfingers commented 3 years ago

I was thinking we could use the same ID as in podcast:id, those defined in serviceslugs.txt

I perfectly agree we can't release a tag like this and hope they'll follow without working with them. I don't know Luminary, and how open they are to discuss something that could lower the quantity of content they can offer in their app.

When French public radio asked Majelan to remove its shows from the app, Majelan refused (thus denying the broadcaster control over its own copyrighted content).

(Now most public radio content is not available anymore in RSS, and only on private apps, partly because of this.)

benjaminbellamy commented 3 years ago

Are we sure this tag would have a legal value? If not, what consequence would it have in a courtroom?

I suggest we focus on the podcast:license tag and define clearly what one is legally (and thus technically) allowed to do.

The main issue with Majelan was that they were (allegedly) making money out of others' podcasts. With a clearly defined license, for instance CC BY-NC-ND 4.0, defining what is permitted, things would probably have gone more smoothly.

This has yet to be defined, I suggest we use sub-tags for podcast:license.

daveajones commented 3 years ago

I suggest we focus on the podcast:license tag and define clearly what one is legally (and thus technically) allowed to do.

I think you may be right about this Ben. I do like the idea of the block tag. But, after being subject to so many "please take my podcast off your service" emails, I question whether people would actually use it properly. Idk. It's a hard issue. I'd rather just say "properly license your content and then go after who violates that."

This has yet to be defined, I suggest we use sub-tags for podcast:license.

Can you explain what your idea is with this?

benjaminbellamy commented 3 years ago

I answered in #177 : https://github.com/Podcastindex-org/podcast-namespace/issues/177#issuecomment-796021609

daveajones commented 3 years ago

Revisiting this.

Do we want to work on finishing this tag? I’m not sure how respected it would be. But I’m willing to do it if enough of us feel it’s a worthy addition.

PofMagicfingers commented 3 years ago

Well, I guess it will be more respected if it exists than if it doesn't :grin:

I'm sure some apps and directories, built by people here, will respect it. It helps to understand which feed is private and should not be publicly indexed.

I doubt "the big four" (iTunes, Google, Spotify, Deezer (maybe mostly big in France)) will respect it right away, but it doesn't hurt to have it in the standard. And Spotify and Deezer might follow the lead if many use it. iTunes and Google will probably go their own way as always

benjaminbellamy commented 3 years ago

I agree with @PofMagicfingers. I do feel it’s a worthy addition. Ben.

daveajones commented 3 years ago

@PofMagicfingers Are you good with the structure as it stands? Any tweaks that you think need to be made? If not I’ll write it up into the main Readme for phase 4.

PofMagicfingers commented 3 years ago

It seems good to me !

daveajones commented 3 years ago

Ok will write it up this weekend.

benjaminbellamy commented 3 years ago

Hey @daveajones, I listened to Podcast 2.0 #38 and now I must say I agree 100% with you and Adam. <podcast:block> will probably do more harm than good.

PofMagicfingers commented 3 years ago

Hey @daveajones, I listened to Podcast 2.0 #38 and now I must say I agree 100% with you and Adam. <podcast:block> will probably do more harm than good.

Could you summarize their opinion or tell us the timecode where they discuss it? The episode is about 2 hours long.

benjaminbellamy commented 3 years ago

Just a second, I'll find that for you…

benjaminbellamy commented 3 years ago

From: 1121 01:13:49,680 --> 01:13:52,650 About that is the block tag.

benjaminbellamy commented 3 years ago

Until: 1355 01:29:13,200 --> 01:29:14,040 That's right, Johnny.

PofMagicfingers commented 3 years ago

Here are my thoughts on it :

I didn't realize the block tag had been revived because of podcastclips (never heard of it). I thought we were discussing it again to prevent directories and apps from indexing the "wrong" feed when using guid (#251) and for podcast channels (#240).

Although I understand the concern about letting companies decide where podcasts can be listened to, this is obviously not the point of this tag. I hope users of platforms doing such a thing will move their podcast elsewhere if they lose control to their hosting provider.

Let me summarize what we had in mind when we imagined this tag at podCloud:

  1. replace itunes:block, googleplay:block etc. If we are building a common standard, it should also take into account the existing tags in the wild. If every platform is using its own tag, one day we will have 10 mycompany:block tags to add in a feed if we want this feed to be private
  2. Private feeds : as far as I know, most of the time itunes:block is used to prevent indexing of a private feed : premium subscription, user-tailored feed, etc. This is why this tag permits a "catch all" version, blocking the entire feed for every supporting app.
  3. Opinion : a minority of podcasters have strong opinions about where their feeds should or should not be indexed. Some hate Apple, some hate Spotify, some only want their own website to list their podcast. It's their opinion, and if this tag can help them enforce it, as long as it is respected: why not!

About the tag being respected or not, as you said in the podcast, it's like the robots.txt, there might be issues about which settings are defaults in some companies*, and there might be issues about companies not respecting it.

*I think we should advise in the spec that the reasonable default is no block tag for a public feed, for maximum compatibility

IMHO, we cannot foresee everything, and only hope it will be respected and used for the original intent.

As you said, if it is misused, as indexers, we would still be able to not enforce it for hosters that abuse this system. Like robots.txt, expire or user agents headers, etc, it's a convention, and it only works if everybody respects it.

No solution is perfect, if tomorrow, Anchor or Libsyn decide to block every request not using iTMS or Spotify user agent, you will probably spoof this user agent and call them out on it.

IMHO, the main purpose of this tag should not be to "block" bad platforms or indexers, as they will obviously not respect it. I think the main purpose is to declare inside the feed whether it's a private or public feed, or a platform-specific feed (blocking every app except Spotify for a Spotify-only feed, as that seems to be a thing now).

benjaminbellamy commented 3 years ago

I am no expert, but here are my two cents: as I said, I now have very mixed feelings about the block tag, to say the least. What I still think, though, is that we need a license tag that tells what one is and is not allowed to do. As Adam said, PodClips is more of a copyright infringement issue. There may be new ones. The block tag won't protect against the next PodClips; the license tag could. On the other hand, some podcasters may be totally fine with what PodClips is doing. Creative Commons allows you to specify all that. So the license tag would solve it all.

PofMagicfingers commented 3 years ago

I do agree with you, but I don't see why these two tags would exclude each other.

A license tag tells us how we may use the content.

A block tag is a way to prevent a private feed, or a custom-purpose feed, from being indexed in directories.

Podcastclips has nothing to do with the block tag, but maybe should respect a license tag if present.

Tldr : The block tag is about keeping control over indexing, not over content usage. The license tag is about content usage.

Let's not mix up licensing, indexing and copyright.

benjaminbellamy commented 3 years ago

Yes of course. My point was just that the main reason I can think of why one would want to block someone is for license and copyright infringement issues. “I don't like you” does not deserve a tag if you ask me. (But maybe you're not 😉)

PofMagicfingers commented 3 years ago

“I don't like you” does not deserve a tag if you ask me

That's also my opinion. :wink: My opinion is that your podcast should be indexed by as many platforms as possible, but it's not everyone's opinion.

Also I sure hope the main purpose of this tag will be blocking indexing on catalogs when it's not relevant (private feeds, old feeds, etc), and not because you like the catalog or not. That is after all, the original purpose of itunes:block and googleplay:block.

daveajones commented 3 years ago

The real pain point here is the ability to block everything and then specify only which ones you allow. That’s where the abuse (or naive config) would most likely occur. Removing that hurts the tag quite a bit from its current elegance though.

I want to move forward with this tag though. Let’s brainstorm a bit more to see if we can work around that single issue without making it unusable. I’ll give it some thought.

daveajones commented 3 years ago

To be clear, if we improve that aspect I think it’s fine. I don’t want to ditch the tag. But that part really does warrant some thought. It will happen.

thebells1111 commented 3 years ago

I may be changing my mind on the podcast:block tag. It’s not just for the podcaster, it’s for the app developer.

I tried subscribing to ‘Pod Save America’ using CurioCaster, and it works, but the enclosureURLs for each episode are blocked. I tried on PodFriend, same thing, then Google Podcasts, same thing. If I go to the address directly from the browser, a redirect occurs to a different URL. Looks like the podcast host is doing something behind the scenes to make sure other players can’t play the content.

The block tag would improve my UX by allowing me as a developer to remove that feed from search results so the user isn’t able to subscribe to a podcast that my player won’t be able to play.

It seems like the podcast host is already able to block players they don't want playing their content. It seems pretty trivial to look at the referrer in the request header and block anyone you want. Unless I hide that info by using a server-side proxy (which I don’t want to do), the podcast host can easily block anything referred from curiocaster.com. The block tag would just let me, as an app developer, know that they don’t want my participation, so I can keep their feed from being displayed.

jamescridland commented 3 years ago

the main reason I can think of why one would want to block someone is for license and copyright infringement issues

This isn't what podcast:block is for, though. If you want to technically block, say, Podclips from your service, then use an .htaccess to block access to the RSS feed directly. But thieves are going to steal.

@thebells1111 ...

I tried subscribing to ‘Pod Save America’ using CurioCaster, and it works, but the enclosureURLs for each episode are blocked.

This is because Megaphone is blocked by most ad-blocking software (and has been added to the default for eero over the last few days, I understand). https://podnews.net/podcast/imvy says that Pod Save America goes through at least two redirects - Podtrac and Chartable - before hitting Megaphone's DAI server.

This isn't the podcast host doing the blocking - rather differently, it's the podcast host being blocked.

thebells1111 commented 3 years ago

@jamescridland

This isn't the podcast host doing the blocking - rather differently, it's the podcast host being blocked.

This has been confirmed. Everything is blocked when using Brave, but on Edge or Chrome, it works as expected.

jamescridland commented 2 years ago

Hello, all - would like to reheat this proposal.

Currently, the intention when podcasters use <itunes:block>yes</itunes:block> is to, I think, block all podcast directories from indexing this podcast. Apple's spec, like any Apple spec, only talks about Apple - but <itunes:block> is also respected by Google Podcasts and Pocket Casts that I'm aware of.

Respecting all creators, even those unable to use the podcast namespace, I would like to propose that the tag is as written, with two treatments of existing tags as standard:

The tag

<podcast:block [id="[platform_id]"]>[yes|no]</podcast:block>

Backwards compatibility

Where a podcaster is not using the podcast namespace, we should treat existing tags as synonyms, as follows:

<itunes:block>yes</itunes:block> MUST be treated by all podcast directories as <podcast:block>yes</podcast:block> - i.e. it should remove this podcast from all directories.

If a podcaster wishes to remove their podcast from Apple Podcasts only, they should use <podcast:block id="apple_podcasts">yes</podcast:block>.

If <itunes:block> and <podcast:block> are both present in the same feed, the <podcast:block> tag should always take precedence, and the <itunes:block> directive should be ignored.
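
One way the precedence rule could look in a parser, as a rough sketch using Python's standard `xml.etree.ElementTree`. The function name `effective_block_tags` and the two sample feeds are made up for illustration; the namespace URIs are the ones commonly declared for the itunes and podcast prefixes in feeds.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

NS = {
    "itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd",
    "podcast": "https://podcastindex.org/namespace/1.0",
}


def effective_block_tags(feed_xml: str) -> List[Tuple[Optional[str], str]]:
    """Return channel-level block directives as (platform_id, value) pairs."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return []
    modern = channel.findall("podcast:block", NS)
    if modern:
        # Any <podcast:block> takes precedence; <itunes:block> is ignored.
        return [(t.get("id"), (t.text or "").strip().lower()) for t in modern]
    legacy = channel.find("itunes:block", NS)
    if legacy is not None:
        # <itunes:block> is read as an id-less <podcast:block>.
        return [(None, (legacy.text or "").strip().lower())]
    return []


# Hypothetical sample feeds, trimmed to the relevant tags.
LEGACY_FEED = """<rss version="2.0"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel><itunes:block>yes</itunes:block></channel>
</rss>"""

MIXED_FEED = """<rss version="2.0"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <itunes:block>yes</itunes:block>
    <podcast:block id="apple_podcasts">yes</podcast:block>
  </channel>
</rss>"""

print(effective_block_tags(LEGACY_FEED))  # [(None, 'yes')]
print(effective_block_tags(MIXED_FEED))   # [('apple_podcasts', 'yes')]
```

In the mixed feed only Apple Podcasts ends up blocked, because the presence of any `<podcast:block>` tag makes the sketch drop the legacy directive entirely.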

Update Feb 16: I've removed the googleplay namespace here, since Google have themselves removed it.

daveajones commented 2 years ago

I support this. Right now it’s impossible to block Apple but not Google using the itunes:block tag. I know that slug lists are annoying. But I don’t see any better solution here. A list is really necessary to achieve granularity.

theDanielJLewis commented 2 years ago

Google will respect the "iTunes" block. So there's really no way to block only Apple Podcasts while allowing Google, under the current "iTunes" namespace.

But instead of a separate yes/no tag for each desired platform, I suggest a more concise way to do this.

daveajones commented 2 years ago

I like the whitelist only approach. It simplifies things. I would say that as soon as the block tag is present at all it should be interpreted as “block everything”. Then the exceptions list is who to allow.

My big question here though is how to do this without some kind of slug list. Nobody likes having to reference and maintain a list over time. Can we just use a domain name minus the tld? So, to allow google.com it’s just “google” and to allow apple.com it’s just “apple”, etc.

The robots.txt spec uses the bot user-agent identifier. But in the podcast world that’s very messy. Apple alone has 3 different user agents their aggregators use.

tomrossi7 commented 2 years ago

I like the whitelist only approach.

We just need to make sure we consider the impact this would have on new podcasting apps. I would hate to accidentally squash any innovation!

Can we just use a domain name minus the tld?

I definitely think something related to domain names is a better solution than slugs.

One thing I keep wondering is whether we are using XML correctly. We all hate it, but it's what we have! Rather than a tag with no content and complicated attributes, maybe something like this would be more appropriate? I dunno.

<podcast:permissions>
    <podcast:block>
      google
    </podcast:block>
    <podcast:allow>
      apple
    </podcast:allow>
</podcast:permissions>

PofMagicfingers commented 2 years ago

Hi everyone,

Here are my thoughts on all of this. I'm basing them on the spec written by @jamescridland above and, of course, on my initial intention when we proposed this tag, which you can also read above.

@theDanielJLewis

@daveajones I do understand the concern of maintaining a slug list, but actually we'll never have to maintain it 😃

At first we can use the already existing slug list of podcast:id : serviceslugs.txt. We're building this tag but it's only working on the goodwill of the players/platforms.

If a platform doesn't support it, it doesn't matter to have a slug for them, as they will not enforce the rule. If a platform does support it, they will most likely do a PR/issue here to add their name to the slug list.

@tomrossi7 I feel this kind of syntax is very verbose, and I prefer a shorter, more direct style with one tag and an attribute, but that's only my opinion, and maybe we are indeed using XML "the wrong way". Also, I don't see a way to specify a default permission with your approach. Maybe by using a default slug.

theDanielJLewis commented 2 years ago

<!-- Only Spotify is **whitelisted** -->
<podcast:block>yes</podcast:block>
<podcast:block id="spotify">no</podcast:block>

That seems like extremely confusing syntax, like a double negative and some redundancy.

I think there should be only one instance of the block tag on the channel level (items can have their own block tag).

As for maintaining a list, I think we might have to leave it to the developers. Blocking Spotify is going to be pointless unless Spotify ever respects it. It's not quite like the lock tag.

So maybe developers would pull-request on a repo list to claim a "slug" (whether that's their app name or a sometimes seemingly irrelevant domain name) and indicate their support for the tag.

Otherwise, this seems like an almost pointless tag with not even the value of indicating an unregistered trademark with ™.

(By calling it "pointless," I'm not saying it's actually pointless or we shouldn't pursue it, only that if we don't get the podcast-app developers involved, there will be no point to this tag.)

jamescridland commented 2 years ago

(I've amended my suggestion to remove the googleplay namespace, since Google themselves have removed it).

I agree with Daniel, to a point: yes, it's useless if podcast directory devs don't implement it; but inclusion in the RSS means that it's an expressed intention from a podcaster, which they are asking directories to honour. As such, I think it's a little less pointless than it at first seems.

As one example: I'm currently in receipt of a three-page legal letter from a lawyer, threatening me with all kinds of legal action, because a podcaster wants to be listed only in Apple Podcasts and not in any other directory. That's a stupid thing to want to do, but we should also respect the wishes of creators, even stupid wishes, in my opinion. At present, there's no way of indicating that a podcast should only be in Apple Podcasts.

As to the complicated syntax above - I like Tom's idea of...

<podcast:permissions>
    <podcast:block>
      google
    </podcast:block>
    <podcast:allow>
      apple
    </podcast:allow>
</podcast:permissions>

Question for @tomrossi7: what does the podnews directory do if it sees the above? It's neither explicitly allowed nor explicitly blocked.

Does it help if there is a reserved directory slug of "all" for clarity?

<!-- Only let Apple and Spotify through -->
<podcast:permissions>
    <podcast:block>
      all
    </podcast:block>
    <podcast:allow>
      apple, spotify
    </podcast:allow>
</podcast:permissions>

Finally, I'd recommend not using domains for all this - there are some strange old domains going round, and you actually want to talk about directories/services and not the domains they may hang off. Indeed, some podcast apps don't even have a website.

daveajones commented 2 years ago

Moving this into phase 5

daveajones commented 2 years ago

I agree with Daniel, to a point: yes, it's useless if podcast directory devs don't implement it; but inclusion in the RSS means that it's an expressed intention from a podcaster, which they are asking directories to honour. As such, I think it's a little less pointless than it at first seems.

This is my feeling on it as well. Expressing the intent of the feed owner is a very legitimate thing to do.

daveajones commented 2 years ago

At its core, for backwards compatibility, this tag syntax must perfectly match <itunes:block>. That seems like a non-negotiable, so that feeds can just change the namespace prefix and be done.

That means we're only left with modifying purpose via attributes. The syntax might not be perfectly human readable, but that's ok since XML is a markup language, not WYSIWYG.

Thinking this over...

daveajones commented 2 years ago

This feels good to me:

<!-- This means "block everything" -->
<podcast:block>yes</podcast:block>

<!-- This means "block nothing" -->
<podcast:block>no</podcast:block>

<!-- This means "block everything except spotify and google" -->
<podcast:block exclude="spotify,google">yes</podcast:block>

<!-- This means "block nothing other than spotify and google" -->
<podcast:block exclude="spotify,google">no</podcast:block>

In its basic form it operates just like the itunes block tag. When the exclude attribute is present, it is taken as an inversion of the intent expressed in the node value, applying to those specific platforms.
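
The inversion rule can be sketched in a few lines. The helper name `blocked_with_exclude` is hypothetical, and the sketch assumes the exclude attribute is a comma-separated list of platform slugs as in the examples above.

```python
def blocked_with_exclude(value: str, exclude: str, platform: str) -> bool:
    """Apply the node value, inverted for platforms named in `exclude`."""
    blocked = value.strip().lower() == "yes"
    # Split the comma-separated attribute, ignoring blanks and whitespace.
    excluded = {slug.strip() for slug in exclude.split(",") if slug.strip()}
    if platform in excluded:
        return not blocked  # listed platforms get the opposite treatment
    return blocked


# <podcast:block exclude="spotify,google">yes</podcast:block>
print(blocked_with_exclude("yes", "spotify,google", "spotify"))  # False
print(blocked_with_exclude("yes", "spotify,google", "apple"))    # True

# <podcast:block exclude="spotify,google">no</podcast:block>
print(blocked_with_exclude("no", "spotify,google", "google"))    # True
print(blocked_with_exclude("no", "spotify,google", "apple"))     # False
```

With an empty exclude list the function reduces to the plain yes/no behaviour of the itunes block tag.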

This also has the benefit of being easy to parse from a UI down to XML.

PofMagicfingers commented 2 years ago

Not really a fan of that inversion, which changes the way you read the tag depending on its content. I'm not sure why or how this would be a more useful syntax than the previous one we discussed, but I could be wrong.

I don't see the need to be identical to itunes:block as we are creating a new tag. Retrocompatibility has no point here, as people will use the old itunes:block for retrocompatibility

daveajones commented 2 years ago

I don't see the need to be identical to itunes:block as we are creating a new tag. Retrocompatibility has no point here, as people will use the old itunes:block for retrocompatibility

I'll do my best to verbalize the need. I should do that instead of just saying it, shouldn't I 😊.

The "need" for backwards compatibility is a hybrid of technical and "political" factors. Eventually (3 years, 5 years, 10 years, ??), the podcast namespace aims to be a drop-in replacement for the itunes namespace. The itunes namespace is poorly documented, slow to adapt to change and owned by a single enormous company. It needs to be supplanted - not just added to.

To that end, we need to be conscious of making sure that any tags which mimic the behavior of an itunes tag need to be unaffected by a namespace prefix swap. For instance, we currently have <podcast:season> and <podcast:episode>, both of which would function identically if you simply swapped the NS prefix. I feel like that is critical to future ease of adoption.

For every namespace discussion there are three developer parties involved: hosting companies, listening apps and third party platforms. Each one of those has a different adoption threshold that it will cross under different criteria. One criterion they all share, though, is ease of implementation. Tags that take zero or little coding effort to adopt will always be preferred over complexity. When a developer group sees that simply changing their feed production/consumption code from "itunes:" to "podcast:" involves zero breakage, that's a huge deal.

With tags that create new behavior (like <podcast:person>), these concerns are not so big. But, for this type of tag (where it directly does something an existing tag already does) I think adoption concerns are the priority.

If all of that is true and legitimate, then the exclusion list mechanic is the only way I know of that this tag could work as intended (as a granular, platform level blocking mechanism) and satisfy those other "needs" above.

daveajones commented 2 years ago

@PofMagicfingers What if instead of having an allow/disallow mechanic, we just do disallow. So, we remove all the notion of “no” as an option. This leaves it as just “yes” blocks everything, and if “exclude” is present then allows those.

Do you think that would work for you? I’d like to move on this tag but I don’t want to if you still have concerns about it. I agree that the inversion was probably too complicated.

PofMagicfingers commented 2 years ago

Well, that could work... But IMHO it's a bit harder to understand/read, and it fails to be a drop-in replacement for itunes:block: right now, when you use <itunes:block>yes</itunes:block> it will block every app (every app following iTunes tags). That's the same behavior as <podcast:block>yes</podcast:block>. Zero breakage.

If you need exceptions, you can specify them with a new tag, using the platform attribute: <podcast:block id="spotify">no</podcast:block>.

I'm not seeing anything confusing or such : "podcast block Spotify? no"

I find it harder to read it with exclude list and thus inversion etc, IMHO.

daveajones commented 2 years ago

Well, that could work... But IMHO it's a bit harder to understand/read, and it fails to be a drop-in replacement for itunes:block: right now, when you use <itunes:block>yes</itunes:block> it will block every app (every app following iTunes tags). That's the same behavior as <podcast:block>yes</podcast:block>. Zero breakage.

If you need exceptions, you can specify them with a new tag, using the platform attribute: <podcast:block id="spotify">no</podcast:block>.

I'm not seeing anything confusing or such : "podcast block Spotify? no"

I find it harder to read it with exclude list and thus inversion etc, IMHO.

Ok, so what you are saying is there can be multiple tags and platforms just look for themselves as a “no” override if there is a full “yes” block at the top. Is that a fair way to say it?

PofMagicfingers commented 2 years ago

Yes. Platforms look for themselves or for a generic tag. "Am I specifically blocked or allowed?"

If there are no tags with my platform id 👉 "Is everyone blocked via the legacy itunes:block or a generic podcast:block without a platform id?"

Kind of like robots.txt: the robot checks whether it matches the user agent of one of the rules. If not, it follows the * rules.
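
The robots.txt comparison can be made concrete with Python's standard `urllib.robotparser`, which applies the same "specific rule first, `*` otherwise" lookup. This is only an analogy: the agent names and the `/feed.xml` path are made up, and nothing here is part of the proposed tag.

```python
from urllib.robotparser import RobotFileParser

# Rough robots.txt counterpart of:
#   <podcast:block>yes</podcast:block>
#   <podcast:block id="spotify">no</podcast:block>
rules = [
    "User-agent: spotify",
    "Allow: /",
    "",
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# The named agent matches its own rule; any other agent falls back to "*".
spotify_allowed = parser.can_fetch("spotify", "/feed.xml")
other_allowed = parser.can_fetch("some_other_app", "/feed.xml")
print(spotify_allowed, other_allowed)  # True False
```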

theDanielJLewis commented 2 years ago

I thought of something else we should consider.

Does a block tag merely prevent that podcast from being found through a podcast app, or will it actually block the app completely, even from manually adding the feed?

For example, if I'm crazy and add the block tag for Overcast, should that prevent people from manually adding my RSS feed to their own Overcast app?

I advise that it be a "directory" block only, not a full app block. Just as a robots directive asks that a website not be indexed but still lets people access it, I think the "block" tag should only keep the podcast from being listed in the target directory, without preventing fans from using that app to consume the podcast.
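
To illustrate the distinction, here is a small sketch of an app that honors the tag in search but not for manual subscriptions. The `Feed` and `Directory` classes are hypothetical, and the dict lookup stands in for actually fetching the feed URL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feed:
    url: str
    title: str
    blocked: bool  # block intent already resolved for this directory


@dataclass
class Directory:
    feeds: Dict[str, Feed] = field(default_factory=dict)

    def add(self, feed: Feed) -> None:
        self.feeds[feed.url] = feed

    def search(self, query: str) -> List[Feed]:
        """Directory listing: blocked feeds are hidden from results."""
        q = query.lower()
        return [f for f in self.feeds.values()
                if not f.blocked and q in f.title.lower()]

    def subscribe_by_url(self, url: str) -> Feed:
        """Manual subscription: the block flag is deliberately ignored."""
        return self.feeds[url]


directory = Directory()
directory.add(Feed("https://example.com/a.xml", "Private Show", blocked=True))
directory.add(Feed("https://example.com/b.xml", "Public Show", blocked=False))

print([f.title for f in directory.search("show")])  # ['Public Show']
print(directory.subscribe_by_url("https://example.com/a.xml").title)
```

The blocked feed never appears in search results, yet a listener who already has its URL can still subscribe.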

With this in mind, I wonder if we might want to consider renaming the tag to "hide."

PofMagicfingers commented 2 years ago

You're right, it's indeed only a directory block. It would work just like itunes:block is used right now: only blocking iTunes Store presence, not usage inside Apple Podcasts or iTunes through the RSS feed link.

It's using the term block to keep the same name as itunes:block

daveajones commented 2 years ago

@PofMagicfingers Does this look good now? Block Tag

PofMagicfingers commented 2 years ago

Yes! It looks clean and easy to understand