Closed jamescridland closed 10 months ago
I like this idea. Some academic publishers work the same way. They have a fixed, controlled vocabulary, maintained by their reference librarians to put order into chaos ;-), but allow the author to add a small number of their own keywords (creativity galore ;-). So we would not be off the tracks regarding the Pomme standard.
Using tabletop role-playing games as an example, this opens up the ability to dynamically add genre tabs like osr or dreamsword or actual play or interviews etc.
But it also opens up the problem of alternate spellings making content harder to find. In our example, ttrpg, tabletop rpg, tabletop role-playing games and table-top roleplaying games are all valid spellings, and each gets its own tag. There needs to be an easy way to check existing popular categories to prevent proliferation of multiple spellings.
I am always a fan of controlled vocabularies. It doesn't fit @jamescridland's idea, but the iTunes categories would be fixed (at least).
This feels more like topics than categories. Maybe we abandon categories and go with topics?
My first thought was also the misspellings and variants issue. I wonder if we could provide a “master list” of current topics that is searchable by podcast creators from host UIs to make their choices less likely to be “wrong” in that way.
I’m sure that various aggregators and directories would be willing to contribute to such a list in an automated fashion. I know we would, and I’m sure you would too, @jamescridland. We could cook up some code to coalesce the various input lists coming from the different contributors.
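As a rough sketch of what that coalescing step might look like (not an agreed design): the snippet below assumes each contributor sends a mapping of tag to usage count, folds spelling variants together by stripping case, spaces and punctuation, and keeps the most-used spelling as the display form. The function names and the sample data are hypothetical.

```python
import re
from collections import Counter, defaultdict


def normalize(tag: str) -> str:
    """Build a comparison key: case-folded, with spaces and punctuation removed."""
    return re.sub(r"[\W_]+", "", tag.casefold())


def coalesce(contributor_lists):
    """Merge several {tag: usage_count} mappings into one master list.

    Returns {normalized_key: most_used_original_spelling}.
    """
    variants = defaultdict(Counter)
    for tags in contributor_lists:
        for tag, count in tags.items():
            variants[normalize(tag)][tag] += count
    return {key: spellings.most_common(1)[0][0] for key, spellings in variants.items()}


# Hypothetical input from two directories.
index_a = {"Tabletop RPG": 40, "table-top rpg": 5}
index_b = {"tabletop rpg": 12, "Actual Play": 7}
master = coalesce([index_a, index_b])
```

This only catches variants that differ in case, spacing or hyphenation. True synonyms such as "ttrpg" would still need a separately maintained mapping.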
Just brainstorming here.
I think having a "master list" would be good, and maybe it's something that should be maintained in a repo devs could pull for easily updating to new additions.
The app could read these tags—or whatever we call them—and properly organize into any pre-supported categories, or they could optionally support one of their own choosing.
So, for example, our master list could contain the full representation of the consensus of necessary tags, while app developers could use unofficial categories to raise awareness. (After all, categories in Apple Podcasts are not about raising awareness, but about organizing content.) So the app developer could, if they want, make an easy "US Election 2024" category to automatically highlight such podcasts, but the organizational category would fall back to "politics" or "government."
We would need a repo not only for developers, but also for podcasters. It'd be great if Podcast Index could syndicate that, like "Here are popular unofficial tags." This could naturally help reduce the alternate spellings or formats. And developers could easily get a glimpse of what they might want to support of their own initiative, and Podcast Index could also see the actual demand—by usage—for categories to be added to the master list.
Would this accept characters like emoji? Standard UTF-8?
Is this a valid topic?
🚁🏞️📱🤛🏼🦁🤚🏾👨🏻🤝👨🏿🌌🈚🤼🏾♀️🌹🇲🇭⚒️🦊🏴🇸🇾🦞🧚♂️👷🏼♀️👳🏼♀️🇮🇸🖊️🤟🏿👨🏽🍳👩🏾❤️💋👨🏾🕵️♂️🧑🏿🦳🏃🏿♀️🆕🇬🇧❌🧑🔬😡🐂👨🏿🎓🧑🏻⚕️🐩🤹🏽☚🏿👨🏾🤝👨🏽🙋🏼♂️🀊👳🏾♂️🧠🤟🏻🎏👨🏾❤️💋👩🟫👨⚕️🦗👨🏽🦰🇲🇶🤾🏾🥍🤦🏾♂️🕵🏼♀️🧗♀️🇳🇦🧑🌾🇧🇭🤹♀️🥚🧑🏾⚕️🥿🥑🇺🇾👩👨👧👐📪⛹🏽🤵🏻♀️🏦🎍☔💈👊🏼🌁🧑🏻✈️🇮🇩🗝️🔥🤵🏾♀️🔉🧑🏾🦽🕵️🧏🏾🗳️🧝🏾♀️☜🏾👨🏻🔧👩🏿💼🧑🤝🧑🧑🏽🚒🇳🇮🏭💁🏿🧓🏿🇹🇻🕶️💇♀️🙏👧🏻🐮💏🏿🏋🏼♂️⏭️👷🏿♂️🧙👩👩👧👶👱🏾♀️👨🏽❤️👨🏿🀕👩🏻❤️👨🏿🧍🏻🇱🇸😑👨🏼⚕️🎊📛👩🏾🏫💿🦋🚶🏾♀️🧍🏼👯👋🏾👠🥑🆕🧗♀️
Just adding my two cents here, but we also have the standard RSS category tag for free-form topics: https://validator.w3.org/feed/docs/rss2.html#ltcategorygtSubelementOfLtitemgt
> we also have the standard rss category tag
Great point. Why not use this?
> controlled vocabularies
I'm a fan of those too. In this case, vocabularies like this appear to self-control themselves quite well. If you get the UX right, a podcaster would choose the categories that are most popular. It's certainly how I set categories for my Medium posts. ("Podcasts or Podcasting? One has 22,000 followers, one only 5,000. I'll choose the more popular.")
No reason why someone couldn't provide a list of synonyms or misspellings to assist this, but that's something that would need to be humanly curated.
PS: @agates I guess anything accepts emojis. Remember: the point of this is to make sure you appear in categories which people are using to discover new shows. Emojis are unlikely to be the right choice.
@jamescridland I understand that, but developers should be prepared to handle cases like that for UX.
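One way a client might guard its UI against arbitrary input is sketched below. It is only an illustration: the 64-character display limit and the choice to reject tags with no letters or digits are assumptions made for this example, not anything the proposal requires.

```python
import unicodedata

MAX_TAG_CHARS = 64  # arbitrary display limit chosen for this sketch


def clean_tag(raw: str):
    """Return a display-safe tag, or None if nothing searchable is left."""
    tag = unicodedata.normalize("NFC", raw).strip()
    # Drop control characters (Unicode category "Cc") that could break a UI.
    tag = "".join(ch for ch in tag if unicodedata.category(ch) != "Cc")
    if not any(ch.isalnum() for ch in tag):
        return None  # e.g. a tag made only of emoji or punctuation
    # Note: slicing counts code points, so it can split a multi-part emoji.
    return tag[:MAX_TAG_CHARS]
```

A client that wants to accept emoji-only tags could drop the `isalnum` check and keep just the normalization and length cap.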
I would also like to see this where search results can result in long tail hits, such as 'Science' 'History' 'Octopus' would result in the one podcast about the History of Octopus Science. This might also be good for something on the item level, so an episode about the History of Octopus Science would show up in a search.
I’ve been thinking on this since dinner. Here’s what I’ve come up with. Sorry if I'm rambling, it's been a day :)
From a computer science perspective, free-form categories are more satisfying because they don't tie the namespace to any current trends or culture. With no boundaries, new categories can wax and wane with public interest and use. In theory, anyway.
In practice, categories of unlimited form and detail are a big mess. At the logical extreme, the categories of a podcast might be so niche that the tag is effectively just another <podcast:guid>. Anyway, if you want to use a free-form list, you don’t need a new tag to do it. The vanilla <category> element already provides that... Sort of. In my quick survey, the <category> element was uncommon in podcast feeds, and when it does appear, it’s usually a mirror of the ubiquitous <itunes:category> element. Yes, the free-form list tag, as far as I can tell, is only used to echo the Apple-curated menu of categories.
So why is the <itunes:category> element the de facto standard in podcast tagging instead of <category>? I don’t think this is because <category> was forgotten, or because some slick move by Apple tricked people into not using it. It’s because there’s a natural appeal to limited choice. It’s like how people prefer to order from a menu. In other words, we know that the nature of <itunes:category> is to pigeonhole podcasts. It’s successful because on some level, a fixed vocabulary is just necessary to make sense of things.
That said, the iTunes categories aren’t actually static. They’ve had to edit the menu to keep up with the times (most recently in 2019), and doubtless, given enough time, Apple will have to edit them again. See: https://podnews.net/update/podcast-categories-changed for a refresher. We should not be relying on a single authority to tell us what is important, or what phrasing is canon, or what terms we describe ourselves by.
Still, I think we can learn something from Apple’s category design: Categories aren’t flat lists, they are hierarchies. Each is rooted by an enduring category (fiction, health, kids) and branches to more timely ones (drama, nutrition, stories for kids). Each higher branch is just finer specificity. If the podcaster wanted, they could extend the “health -> nutrition” chain to “health -> nutrition -> keto diet”. This makes each category into a prioritized list, not unlike how font-families work.
As a reminder, font families define a hierarchy of fallbacks to common fonts when the specialized one isn’t available. For example, { font-family: Helvetica, Arial, sans-serif } will render text in Arial if the browser doesn’t know what Helvetica is. I think this is a good mechanic to copy, because even podcasts with hyper-specific categories would be guaranteed to fall back to something “normal” eventually.
With this in mind, here are my proposed modifications:
<podcast:category>, to keep convention with <category> and <itunes:category>
<podcast:category> elements may exist per <channel>.
The ordering is emphatic, with the most significant element first.
Examples:
<podcast:category text="health, nutrition, keto diet" />
<podcast:category text="leisure, table top games, old school renaissance" />
<podcast:category text="music, metal, thrash metal" />
In this case, a parser which doesn’t understand “old school renaissance” would at least know we’re talking about table top games, or, worst case, the general category of leisure, and could try its best to put us in the right place. Compatibility with all <itunes:category> values is there, so long as their base categories are a subset of ours. But it also gives free and open podcasting room to grow as it chooses.
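To make the font-family-style fallback concrete, here is a minimal sketch. It assumes the comma-separated text attribute from the examples above and a hypothetical set of categories that a given app supports. The separator is a parameter, since nothing here fixes it.

```python
KNOWN = {"health", "nutrition", "leisure", "music"}  # hypothetical app-supported set


def parse_category(text: str, sep: str = ","):
    """Split a category chain into levels, most general first."""
    return [part.strip().casefold() for part in text.split(sep) if part.strip()]


def best_supported(text: str, known=KNOWN, sep: str = ","):
    """Return the most specific level the app recognises, or None."""
    for level in reversed(parse_category(text, sep)):
        if level in known:
            return level
    return None
```

With this set, "health, nutrition, keto diet" resolves to "nutrition", while "leisure, table top games, old school renaissance" falls all the way back to "leisure".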
So basically tl;dr what @jamescridland said, except different.
That’s an interesting idea, @ablekirby. I like the concept of splitting the concern in two and having a fixed list for sanity and then a free form for user control. That’s essentially what itunes:category + itunes:keywords is supposed to be. But this would be in a single tag.
I also keep coming back to the notion that limited choice is attractive to people. The “menu” as you call it. But I’m also sympathetic to the idea of not formalizing it out of use.
🧐…
@ablekirby Good write up, and makes a lot of sense. I like the idea of the first category needing to be one of the predefined major categories, but the subsequent categories can be user defined for fine tuning. This allows for backwards compatibility, which is important.
Rather than the rightmost category being the true category, I'd prefer all three categories to be the true category.
<podcast:category text="leisure, table top games, old school renaissance" />
<podcast:category text="leisure, cosplay, old school renaissance" />
If I'm understanding you correctly, in your implementation, both of these categories would bring up 'old school renaissance' podcasts, but what I'm actually interested in is the table top games and not the cosplay. If there are 100 'old school renaissance' podcasts and 100 'table top games' podcasts, but only one of those is both 'old school renaissance' and 'table top games', it may never come up in a category search. Something that takes all the categories into account would make sure fine-tuning of category searches is possible.
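A sketch of a search that treats every level as a real category, rather than only the rightmost one, might look like this. The feed names are made up, and the comma-separated format is taken from the examples above.

```python
def matches(category_text: str, wanted) -> bool:
    """True if every wanted term appears somewhere in the category chain."""
    levels = {part.strip().casefold() for part in category_text.split(",")}
    return all(term.casefold() in levels for term in wanted)


# Hypothetical feeds using the two example categories.
feeds = {
    "Feed A": "leisure, table top games, old school renaissance",
    "Feed B": "leisure, cosplay, old school renaissance",
}

query = ["table top games", "old school renaissance"]
hits = [name for name, category in feeds.items() if matches(category, query)]
```

Here only "Feed A" matches the combined query, while a query for "old school renaissance" alone would return both feeds.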
This is a nice idea.
The reality is that iTunes categories will remain in RSS feeds for many years to come. I don't believe that we need to look at this in isolation, then - we have the formalised categories (thank you Tim Apple), and we can have fluid, unformalised categories with this specification.
Alternatively, we use keywords to achieve what I think I'm trying to do; and retain the Apple categories in the feed as well.
I'm unconvinced of the need to build an alternative formalised master list of categories. The Apple one isn't perfect, but there's a degree of wheel-reinvention that I'd be eager to avoid.
> I'm unconvinced of the need to build an alternative formalised master list of categories. The Apple one isn't perfect, but there's a degree of wheel-reinvention that I'd be eager to avoid.
We could just take Apple's list and create a 'fork'. Then you'd have a good starting point (not the work) while having flexibility if (in future) there's any request for change that Apple might reject/not consider. The question with any curated list, though, is how decisions are being made (a simple vote in a repo)?
I'm against totally free-form categories (even if they're additional to the iTunes categories). Besides the 🐈egories you'd get, people will use similar words, synonyms or different spellings/singular/plural to describe their podcast. It will be too open and the structure will be bad, or even worse than now. Controlled vocabulary is the way to go. It brings order into chaos. Think about the user experience and discoverability. It's SEO in the wrong place. They can do that in the description. Hashtags for podcasts (see YouTube).
As a host with peertube video feeds, I'm not going to modify the category system to fit into any officially proposed "list" and I imagine many other non-podcast hosts are in the same situation.
If we're going to indicate hierarchy within a single tag, I think we should use colons, not commas. For example, <podcast:category text="health:nutrition:keto diet" />
Commas usually denote a list, not a hierarchy.
> people will use similar words, synonyms or different spellings/singular/plural to describe their podcast. It will be too open and the structure will be bad or even worse than now. Controlled vocabulary is the way to go. It brings order into chaos. Think about the user experience and discoverability.
Exactly, the world will end up with 30 categories/topics/hashtags describing the same thing. Or categories that are actually just a brand name for some company trying to sneak in some exposure. Here's a decent list: https://en.wikipedia.org/wiki/Wikipedia:Contents/Categories
I like standards, it makes our stuff more interoperable.
What I think you need here is an ontology, which gives you a bit more than just a controlled vocabulary. See this one for music, for example: http://musicontology.com/specification/
I'm seeing some people asking for "tabletop games" as a category, and doubtless we're likely to have more arguments about categories in the future.
Could I humbly suggest we're doing this all wrong?
We shouldn't be setting categories. That's not our job. We should be allowing podcasters to set them for us.
Here's an alternative:
podcast:categories
Specification:
Fallback: If this field is empty or not present, the podcast client may use the three categories in the iTunes tag as default values.
Example:
This is a free-form list of categories. It's expressed as tags, like Flickr does photo tagging, or Twitter does hashtagging, or Medium does with subjects, or Unsplash does in searching.
It has a maximum of 128 characters, just like Medium has a maximum of five subjects per post, or Twitter has a limited space. This cuts down on spamming, and makes it impossible to be in every bloody category on the planet.
It also suggests that the 'most important' category is first in the list. This might be used by search algorithms to give more weight, in my example, to "daily news" than to "snarky remarks".
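As an illustration of the length cap and the "first is most important" ordering, here is a small sketch. It assumes the tags arrive as one comma-separated string, and it uses a simple 1/position weight. Both the format and the weighting are assumptions for this example only.

```python
MAX_CHARS = 128  # the limit stated in the proposal


def parse_categories(value: str):
    """Return [(tag, weight)] with earlier tags weighted higher."""
    if len(value) > MAX_CHARS:
        # Cut at the limit, then drop the last (possibly cut-off) tag.
        value = value[:MAX_CHARS].rsplit(",", 1)[0]
    tags = [tag.strip() for tag in value.split(",") if tag.strip()]
    return [(tag, 1.0 / (position + 1)) for position, tag in enumerate(tags)]
```

A search index could multiply its usual relevance score by the weight, so that "daily news" in first place counts for more than "snarky remarks" in second.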
It means that if you want a category called "cooking with cheese", then you can go right ahead and make one. If enough other people like the idea of the category, then it'll be a popular category.
Categories can also be in the native language of the podcast, too. A search for "saucissons" will find me French-language podcasts about sausages. (You could use that in conjunction with the lang tag, too.)
These categories, more properly called 'tags', are well-understood by many creators on the internet, and show clear blue water between Apple's intransigent categories where you need to run a campaign to get them to add just one of them, and where we want to be, which is a podcast service to everyone.
These are self-policing, fluid and reactive. Want a BLM category? Go for it. Want a Trump category? Be my guest. Want a category for impeachment? For snowstorm? For power outages? It's a super-simple way to tie podcasts together on a common subject.
And, most importantly, "if you don't support them, Mr Podcaster, then we'll just use your iTunes categories as your categories here, too". Simple and easy, and already has 100% support since everyone has set up to three categories as a starting point.
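The fallback rule could be sketched roughly as follows. Since the proposed tag's exact format is not spelled out here, this assumes a <podcast:categories> element whose text is a comma-separated list. That element shape, and the function name, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd",
    "podcast": "https://podcastindex.org/namespace/1.0",
}


def feed_categories(feed_xml: str):
    """Use the proposed free-form tags if present, else the iTunes categories."""
    channel = ET.fromstring(feed_xml).find("channel")
    proposed = channel.find("podcast:categories", NS)  # assumed element shape
    if proposed is not None and (proposed.text or "").strip():
        return [tag.strip() for tag in proposed.text.split(",") if tag.strip()]
    # Fallback: top-level iTunes categories in feed order, at most three.
    top_level = channel.findall("itunes:category", NS)
    return [c.get("text") for c in top_level if c.get("text")][:3]
```

Only direct children of <channel> are read in the fallback, so nested iTunes subcategories are ignored. A client could just as reasonably include them.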