withastro / docs

Astro documentation
https://docs.astro.build/

Is this the best/recommended way to get all customData for RSS feed? #3301

Closed aritraroy24 closed 10 months ago

aritraroy24 commented 1 year ago

Docs enhancement suggestion & Finding the best way to get multiple customData values

● Docs enhancement suggestion: Strictly use customData parameter to get custom parameter values other than predefined ones

I was trying to create the RSS feed for my blog using the example in the Astro docs. I ran into something that could be added to the documentation for beginners. I was working with an RSS feed for the first time, and I had no idea that you must use the customData parameter specifically to output values other than the predefined parameters (title, link, guid, description, pubDate and content). In most other cases I have encountered, a name like customData can be replaced with one's own customVariable. So I think that if we add a comment along these lines to the Astro docs, it will be a lot easier to follow.
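To illustrate the naming point, here is a minimal sketch of why a renamed key goes unnoticed. It is only an illustration of the idea, not how @astrojs/rss is implemented: `pickKnownItemFields` and `KNOWN_ITEM_KEYS` are hypothetical names, and the key list is the one given in this comment plus customData.

```javascript
// Hypothetical stand-in for a feed generator that only reads the option names it knows.
// The list mirrors the keys named in this thread; it is not taken from the library source.
const KNOWN_ITEM_KEYS = ["title", "link", "guid", "description", "pubDate", "content", "customData"];

function pickKnownItemFields(item) {
  const picked = {};
  for (const key of KNOWN_ITEM_KEYS) {
    if (key in item) picked[key] = item[key];
  }
  return picked;
}

const extraXml = "<comments>https://example.com/post#comments</comments>";

// A key with a made-up name is never looked at, so its value never reaches the feed.
const renamed = pickKnownItemFields({ title: "Post", link: "/post", myExtras: extraXml });

// The same value under the expected name is picked up.
const named = pickKnownItemFields({ title: "Post", link: "/post", customData: extraXml });

console.log(Object.keys(renamed)); // [ 'title', 'link' ]
console.log(Object.keys(named));   // [ 'title', 'link', 'customData' ]
```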

● Best/recommended way to get all customData with customized rendering

1) To get all the results in one <p> tag-

customData: `${sanitizeHtml(marked.parse(blog.data.subtitle + blog.data.duration))}`

The result will be like the below-

<p>“If It Isn’t on Google, It Doesn’t Exist”️️️ 🤷‍♂️🤷‍♂️🤷‍♂️⏱ 9 Mins</p>

2) To get a custom result in different <p> tags with some string injection-

customData: `${[sanitizeHtml(marked.parse("Subtitle: " + blog.data.subtitle)) + sanitizeHtml(marked.parse("Duration: " + blog.data.duration))]}`

The result will be like the below-

<p>Subtitle: “If It Isn’t on Google, It Doesn’t Exist”️️️ 🤷‍♂️🤷‍♂️🤷‍♂️</p>
<p>Duration: ⏱ 9 Mins</p>

I tried passing an array of all the data and then sanitizing it (for the 2nd approach), but had no luck: it produced an error. If anyone comes up with a better way to produce the same customized result, they are very welcome to share it. For now, this is as far as I have got.
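One possible way to handle a list of fields is to render each entry to its own string first and only then join the results, instead of handing a whole array to the sanitizer. The sketch below assumes the goal is one <p> per field. `renderFields` is a hypothetical helper, and `escapeHtml` is a simple stand-in for the `sanitizeHtml(marked.parse(...))` calls used above so that the example runs without extra packages.

```javascript
// Simple stand-in for sanitizeHtml(marked.parse(...)): escapes the characters
// that would otherwise be read as markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one <p> per field, skipping fields that have no value.
function renderFields(fields) {
  return fields
    .filter(({ value }) => value !== undefined && value !== null && value !== "")
    .map(({ label, value }) => `<p>${escapeHtml(label)}: ${escapeHtml(value)}</p>`)
    .join("\n");
}

const extras = renderFields([
  { label: "Subtitle", value: "Searchable & findable" },
  { label: "Duration", value: "9 Mins" },
  { label: "Tags", value: undefined }, // skipped because it has no value
]);

console.log(extras);
// <p>Subtitle: Searchable &amp; findable</p>
// <p>Duration: 9 Mins</p>
```

Each field is processed on its own, so adding or removing a field only changes the array, not the template string.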

P.S.- Thanks to @sarah11918 for pointing out a possible way to get the desired result.

sarah11918 commented 1 year ago

Yes, I think more guidance on using customData makes sense!

I think it would be great if people want to figure out the most straightforward way to do this so that it's easy for people to follow.

Anyone have thoughts on what we could add to the existing RSS page that would be helpful but not overwhelming? Is anyone doing anything like this with their own RSS feeds? https://docs.astro.build/en/guides/rss/

aritraroy24 commented 1 year ago

Validation of RSS Feed with customData

The Problem

import rss from '@astrojs/rss';
import { getCollection } from 'astro:content';
import sanitizeHtml from 'sanitize-html';
import { marked } from 'marked';
import { formatBlogPosts } from '../../../assets/js/utils'

export async function get(context) {
    const allBlogs = await getCollection("blogs");
    const formattedBlogs = formatBlogPosts(allBlogs);
    return rss({
        xmlns: { atom: "http://www.w3.org/2005/Atom" },
        title: 'Aritra Roy | Blogs',
        description: 'I\'m a Theoretical Computational Chemist & Algorithm Enthusiast from India. If you subscribe to this RSS feed you will receive updates and summaries of my new posts about computational chemistry, programming and stuff.',
        site: context.site,
        author: "Aritra Roy",
        commentsUrl: "https://github.com/aritraroy24/astro-portfolio-comments/discussions",
        source: {
            title: "Aritra Roy | Blog RSS Feed",
            url: "https://invalid-rss-hosted-styled.netlify.app/tutorial/blogs/blog-rss.xml"   // Change the URL to your RSS feed URL, otherwise you will encounter another error at the time of RSS validation
        },
        items: formattedBlogs.map((blog) => ({
            title: blog.data.title,
            description: blog.data.description,
            pubDate: blog.data.pubDate,
            link: `/tutorial/blogs/${blog.slug}`,
            categories: ["Computational", "Chemistry", "Research", "PhD", "Post-doc", "CompChem", "Programming", "Coding", "Technology", "Update"],
            content: `${sanitizeHtml(marked.parse(blog.body))}`,
            customData: `${[sanitizeHtml(marked.parse("Subtitle: " + blog.data.subtitle)) + sanitizeHtml(marked.parse("Duration: " + blog.data.duration)) + sanitizeHtml(marked.parse("Tags: " + blog.data.tags))]}`
        })),
        customData: `<atom:link href="https://invalid-rss-hosted-styled.netlify.app/tutorial/blogs/blog-rss.xml" rel="self" type="application/rss+xml" />`, // change the URL to your RSS feed
        stylesheet: '/rss/blog-rss-styles.xsl',   // or commenting out the style to see raw XML data
    });
}

The above code [mainly consider the content & customData parameters of the items] builds without producing any error (both styled and unstyled versions), and using a style.xsl I can see the customised data on localhost or at 'invalid-rss-hosted-styled.netlify.app/tutorial/blogs/blog-rss.xml'. Consider the attached screenshots-
[Screenshot of the terminal with the build command]
[Screenshot of the styled RSS feed page of the invalid feed]

However, when using an RSS validator like the W3C Feed Validation Service, it gives an error like the one below-
[Screenshot of the error from W3C Feed Validation Service for styled RSS feed]
For the unstyled version, the validation error is the same-
[Screenshot of the error from W3C Feed Validation Service for unstyled RSS feed]

Root Of The Problem

If we carefully observe the unstyled RSS feed, we can see that the <p>...</p> tags are outside of the <content:encoded> element (which works much like a wrapping tag in HTML) for each item. Take a look at the marked region [for one item] of the screenshot from the unstyled RSS feed page-
[Screenshot of the unstyled RSS feed page of the invalid feed]
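The structural problem can be sketched with a small check that lists the elements sitting directly inside an <item> and flags the ones RSS 2.0 does not define there. This is a rough illustration, not how the W3C validator works: `topLevelChildNames` and `unexpectedItemChildren` are hypothetical helpers, the sample items are made up, and the regex-based scan assumes balanced tags with no CDATA sections or comments.

```javascript
// Element names that RSS 2.0 defines as children of <item>.
const RSS_ITEM_ELEMENTS = new Set([
  "title", "link", "description", "author", "category",
  "comments", "enclosure", "guid", "pubDate", "source",
]);

// Collect the names of the elements that sit directly inside an <item>.
// Crude on purpose: assumes balanced tags and no CDATA sections or comments.
function topLevelChildNames(itemInnerXml) {
  const tagPattern = /<(\/?)([A-Za-z_][\w:.-]*)[^>]*?(\/?)>/g;
  const names = [];
  let depth = 0;
  for (const [, closing, name, selfClosing] of itemInnerXml.matchAll(tagPattern)) {
    if (closing) {
      depth -= 1;
      continue;
    }
    if (depth === 0) names.push(name);
    if (!selfClosing) depth += 1;
  }
  return names;
}

// Namespaced elements (such as content:encoded) are treated as allowed extensions.
function unexpectedItemChildren(itemInnerXml) {
  return topLevelChildNames(itemInnerXml).filter(
    (name) => !name.includes(":") && !RSS_ITEM_ELEMENTS.has(name)
  );
}

// Extra <p> appended after the content element, as in the invalid feed.
const invalidItem =
  "<title>Post</title><content:encoded>&lt;p&gt;Body&lt;/p&gt;</content:encoded><p>Duration: 9 Mins</p>";

// Same data, but escaped and placed inside the content element.
const validItem =
  "<title>Post</title><content:encoded>&lt;p&gt;Body&lt;/p&gt;&lt;p&gt;Duration: 9 Mins&lt;/p&gt;</content:encoded>";

console.log(unexpectedItemChildren(invalidItem)); // [ 'p' ]
console.log(unexpectedItemChildren(validItem));   // []
```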

The Solution

Clearly, we have to bring the <p>...</p> tags inside the <content:encoded> element. To do so, we can inject our custom data into the content parameter using ES6 template literals. Take a look at the following code [mainly consider the content parameter of the items]-

import rss from '@astrojs/rss';
import { getCollection } from 'astro:content';
import sanitizeHtml from 'sanitize-html';
import { marked } from 'marked';
import { formatBlogPosts } from '../../../assets/js/utils'

export async function get(context) {
    const allBlogs = await getCollection("blogs");
    const formattedBlogs = formatBlogPosts(allBlogs);
    return rss({
        xmlns: { atom: "http://www.w3.org/2005/Atom" },
        title: 'Aritra Roy | Blogs',
        description: 'I\'m a Theoretical Computational Chemist & Algorithm Enthusiast from India. If you subscribe to this RSS feed you will receive updates and summaries of my new posts about computational chemistry, programming and stuff.',
        site: context.site,
        author: "Aritra Roy",
        commentsUrl: "https://github.com/aritraroy24/astro-portfolio-comments/discussions",
        source: {
            title: "Aritra Roy | Blog RSS Feed",
            url: "https://valid-rss-hosted-styled.netlify.app/tutorial/blogs/blog-rss.xml"
        },
        items: formattedBlogs.map((blog) => ({
            title: blog.data.title,
            description: blog.data.description,
            pubDate: blog.data.pubDate,
            link: `/tutorial/blogs/${blog.slug}`,
            categories: ["Computational", "Chemistry", "Research", "PhD", "Post-doc", "CompChem", "Programming", "Coding", "Technology", "Update"],
            content: `${[sanitizeHtml(marked.parse(blog.body)) + sanitizeHtml(marked.parse("Subtitle: " + blog.data.subtitle)) + sanitizeHtml(marked.parse("Duration: " + blog.data.duration)) + sanitizeHtml(marked.parse("Tags: " + blog.data.tags))]}`,
        })),
        customData: `<atom:link href="https://valid-rss-hosted-styled.netlify.app/tutorial/blogs/blog-rss.xml" rel="self" type="application/rss+xml" />`,
        stylesheet: '/rss/blog-rss-styles.xsl',
    });
}

Now if we take a look at the styled RSS feed page, it is the same as before-
[Screenshot of the styled RSS feed page of the valid feed]
If we check the RSS feed URL using the W3C Feed Validation Service for both styled and unstyled RSS feed pages, we get a validation success message like the ones below-
[Screenshot of the successful validation message from W3C Feed Validation Service for styled RSS feed]
[Screenshot of the successful validation message from W3C Feed Validation Service for unstyled RSS feed]
Finally, if we take a look at the unstyled RSS feed page, we can see that the <p>...</p> tags are now inside the <content:encoded> element [marked section (for one item) in the following screenshot]-
[Screenshot of the unstyled RSS feed page of the valid feed]


This way we can include the custom data inside the RSS feed and still pass RSS feed validation.

However, the original question in this issue, "Is this the best/recommended way to get all customData for RSS feed?", is still valid.

tony-sull commented 1 year ago

I'm a big fan of RSS feeds, but they sure are complicated to deal with 😅

customData isn't quite as clear a name as it could be. That property adds custom XML to the feed or feed item. Looking back at the docs examples, that's probably clearer in the example of adding a <language> tag to the feed itself; the example for using customData on an individual feed item really doesn't make it as obvious that it should be an XML string.
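A minimal sketch of what "an XML string" can look like in practice, assuming the values come from plain text that needs escaping. `escapeXml` and `xmlElement` are hypothetical helpers, and the `blog:` prefix is a made-up namespace that would need a matching `xmlns` declaration on the feed. Whether a given validator accepts such elements has not been checked here.

```javascript
// Escape the five characters that have special meaning in XML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a single element with escaped text content.
function xmlElement(name, text) {
  return `<${name}>${escapeXml(text)}</${name}>`;
}

// Feed-level customData, like the <language> example mentioned above.
const feedCustomData = xmlElement("language", "en-us");

// Item-level customData using a hypothetical "blog" namespace prefix.
const itemCustomData =
  xmlElement("blog:duration", "9 Mins") + xmlElement("blog:subtitle", "Tips & tricks");

console.log(feedCustomData); // <language>en-us</language>
console.log(itemCustomData);
// <blog:duration>9 Mins</blog:duration><blog:subtitle>Tips &amp; tricks</blog:subtitle>
```

HTML meant for readers, such as the <p> blocks in the earlier comment, still belongs in the content property; strings like these only carry feed metadata.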

@aritraroy24 in your example it looks like you need to add extra HTML in the feed that isn't part of the actual markdown content, is that right? The solution you have above for concatenating that into the content property directly is the right approach there, though definitely not the best developer experience with all the template literals and custom markdown parsing required!

Question: When you first approached the problem and landed on our docs, was the most confusing part that it wasn't clear customData would be used to inject raw XML directly into the item itself, rather than being part of the content that's getting rendered for the item?

aritraroy24 commented 1 year ago

@tony-sull, there are several points here-
1) At first I was trying to use anyVariable instead of customData, since in most cases a name like customData can be replaced with one's own. It took me a while to understand that with RSS you have to strictly use customData. I thought that if we could specify that in the docs, it would be easier for beginners.
2) Discussing whether there is any better way to inject that extra HTML data into the feed without using all the template literals and custom markdown parsing (proposed by @sarah11918).
3) I posted the second comment two days after opening the issue, as soon as I realized that I had to put the HTML data inside the content property to have a valid RSS feed. I wrote the long comment in case anyone finds it helpful when injecting custom HTML data into an RSS feed.

tony-sull commented 1 year ago

All great info, thanks! We can definitely add more details around what customData and the other properties do and when to use them.

The general use case of using, and potentially modifying, raw HTML in the feed is always an interesting one. It has a few limitations related to how the pages are built and bundled, but maybe there's something we can do there to improve that use case a bit if it's pretty common for Astro users to want HTML there 🤔

delucis commented 1 year ago

Sounds like we should clarify customData slightly here.

customData expects a ready-formatted XML string, so we should probably

  1. Remove customData from the content collection example as it’s unlikely you’re writing XML in your content frontmatter
  2. Clarify the wording under the “Generating items” heading to underline that this property expects XML

Happy to receive a PR for this! Or if someone has a blog post for how to set this up in more detail, we’d be happy to link to it as a community recipe.

aritraroy24 commented 1 year ago

@delucis, if you like I can do that. Should I write a blog on the details similar to this post example? Also, I think, we should follow points 1 & 2.

delucis commented 1 year ago

Should I write a blog on the details similar to this post example?

If you like, sure! If it's published somewhere it makes it easy for us to link to.

aritraroy24 commented 1 year ago

Should I write a blog on the details similar to this post example?

If you like, sure! If it's published somewhere it makes it easy for us to link to.

Great. I'll write one tomorrow then 😊.

aritraroy24 commented 1 year ago

@delucis, sorry for taking almost a week. I was busy with my work. However, I have written one today and here is the link - https://aritraroy.live/tutorial/blogs/2023/how-to-send-any-data-to-rss-feed

Please take a look and let me know your thoughts.

sarah11918 commented 10 months ago

Closing this issue, but if you'd like to PR your blog post to the Community Recipes page, please do! 🚀