bridgetownrb / bridgetown-feed

A Bridgetown plugin to generate an Atom feed of your Bridgetown posts

XML <id> element generated incorrectly for custom collections #16

Closed dcr8898 closed 8 months ago

dcr8898 commented 8 months ago

Hi.

I am trying to add bridgetown-feed to my site (rossney.net). Instead of Posts, I use Articles as my main collection. I configured bridgetown-feed to generate a feed for Articles at feed/articles.xml.

The xml file is generated correctly, except for the <id> element.

I do not set an "id" value in my articles' front matter. I expected the content of the <id> element to match the href attribute of the <link> element, which is generated correctly (using .absolute_url).

Possible Cause

I notice that the feed.xml template reads the .data.id (front matter value) or the .id value (if front matter id not present) on the collection item being processed. However, for my article items, this looks like:

"repo://articles.collection/_articles/2016/2016-06-13-recovery-sucks.md"

And this value is then rendered as the <id> element contents.

Unless this .id value is somehow incorrect? I haven't had any other problems that would indicate such.

Possible Solution

I am willing to submit a pull request for this, but I'm not sure what rule to apply. I see that @jaredcwhite made a change to this code last week (to pick up the front matter value of id, if present), but this issue probably existed prior to that (even going back to the fork from jekyll-feed).

Again, is it possible that the .id values for my custom collection are incorrect?

If it's just a matter of using .absolute_url instead of .id for custom collections, I can probably handle that and submit a PR.

What do you advise?

Thank you!

jaredcwhite commented 8 months ago

@dcr8898 so for the Atom feed format, an ID of an entry can be any unique value, perhaps a GUID. In this case we're using Bridgetown's internal ID, which is dependent on the file location of the item.

So it should be fine, I've never seen any problems related to this.

dcr8898 commented 8 months ago

Sheesh. What a poorly-written issue. 🙄 My bad.

The "problem" is just that it doesn't pass validation. The Atom spec for id doesn't just require that it be unique, "Its content MUST be an IRI, as defined by RFC3987."

This was my first time adding Atom to a site, so I ran it through a validator and got that message (with a bossy link to the spec). The value of post.id isn't one of the valid IRI schemes. When I looked around (here, for example), I saw that id in the wild is often just a copy of the URL for the resource. Therefore, I patched my local copy of bridgetown-feed to use post.absolute_url and it passes validation because http and https are valid IRI schemes.
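The check and fallback described above can be sketched in plain Ruby, outside of Bridgetown. This is only an illustration under stated assumptions: `Entry`, `web_iri?`, and `atom_id` are hypothetical names, not part of bridgetown-feed, and the example.com URL is a placeholder. The rule (keep an id only when it is already an http or https address, otherwise use the absolute URL) is an assumption based on what the validator accepted, not the plugin's actual behavior.

```ruby
require "uri"

# Hypothetical stand-in for a collection item; not Bridgetown's real resource class.
Entry = Struct.new(:id, :absolute_url, keyword_init: true)

# Returns true when the value parses as an absolute http(s) URI.
# Note: Ruby's URI parser only accepts ASCII, so a non-ASCII IRI is rejected here.
def web_iri?(value)
  uri = URI.parse(value.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.nil? # URI::HTTPS is a subclass of URI::HTTP
rescue URI::InvalidURIError
  false
end

# Keep the entry's id when it is already an http(s) address; otherwise use its URL.
def atom_id(entry)
  web_iri?(entry.id) ? entry.id : entry.absolute_url
end

entry = Entry.new(
  id: "repo://articles.collection/_articles/2016/2016-06-13-recovery-sucks.md",
  absolute_url: "https://example.com/articles/recovery-sucks/" # placeholder URL
)

puts atom_id(entry)
```

With this rule, an id explicitly set in front matter to an http or https address would still be used as-is, while the internal `repo://` value is replaced by the entry's URL.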

I don't have the experience to know if this is a significant issue for Bridgetown in general, or me as a site author. If this is the same behavior as Jekyll-feed, and their post.id is constructed the same, then presumably all of their sites have the same issue. So . . . if it's not a big deal, then there's nothing to fix, I guess. 🤷‍♂️ I don't mind making that small patch on my end to preserve validation.

FYI, while I was patching anyway, I sorted the Posts collection in reverse date order before applying the post_limit (as is done in Jekyll-feed), because I thought taking the most recent collection items seemed more intuitive and was the behavior I wanted in my feed. That's a one-line PR, if you want me to submit it.

Thank you for your work, Jared! I'm a fan. 👍

jaredcwhite commented 8 months ago

@dcr8898 Ah OK, I get what you're saying. I'd probably spend some additional time looking into this if it weren't for the fact that we're in the process of migrating to RSS 2.0 instead of Atom (see #5). To the best of my knowledge, RSS doesn't care what the syntax of its guid tag is as long as it's something unique.

Regarding collections, the feed order should mirror the general configured sort order of the collection. I don't think we should be re-sorting anything in this plugin. But if you want to file a separate issue about that with further details, feel free!

dcr8898 commented 8 months ago

Honestly, Atom and RSS seem so stable and unchanging that I will probably just pin the gem and keep it as-is.

As for the sort order, if I'm not mistaken, the "natural" sort order of Posts is by date, ascending. Is there a way to change this "default" sort order? If not, then the feed will only ever show the first ten posts, chronologically (based on the default post_limit of 10). Increasing the limit would increase this number, but it would still be the first N posts.

I noticed that Jekyll-feed was sorting Posts by date, descending, before applying the limit. I copied that line because I want the latest posts to show in the feed, not the earliest. But, as I said, I may be missing something in Bridgetown config (or elsewhere) that would solve this problem in a different way.
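To make the difference concrete, here is a small plain-Ruby sketch of limiting before versus after a descending sort. `Post` and `latest` are hypothetical names used only for illustration; this is not the actual line from either plugin, just the general idea of sorting by date descending before applying a limit.

```ruby
require "date"

# Hypothetical stand-in for a post; only the fields the sketch needs.
Post = Struct.new(:title, :date, keyword_init: true)

# Sort newest first, then keep at most `limit` items.
def latest(posts, limit)
  posts.sort_by(&:date).reverse.first(limit)
end

# A collection in date-ascending order, as described above.
posts = [
  Post.new(title: "oldest", date: Date.new(2016, 6, 13)),
  Post.new(title: "middle", date: Date.new(2019, 1, 5)),
  Post.new(title: "newest", date: Date.new(2023, 3, 9))
]

# Limiting the ascending list keeps the earliest posts.
puts posts.first(2).map(&:title).inspect    # ["oldest", "middle"]

# Sorting in descending order first keeps the most recent ones.
puts latest(posts, 2).map(&:title).inspect  # ["newest", "middle"]
```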