Postleaf / postleaf

Simple, beautiful publishing with Node.js.
https://www.postleaf.org/
MIT License

Support for Sitemaps #42

Closed juan-manuel-alberro closed 7 years ago

juan-manuel-alberro commented 7 years ago

Hello all,

This is a general question rather than a bug report: does Postleaf generate a sitemap.xml file, or is there an endpoint for it, e.g. /sitemap?

If not, are there any plans to add it to the alpha version? I believe this is a very important feature since Postleaf is SEO friendly.

Thanks.

claviska commented 7 years ago

Nothing in core yet.

Is the sitemap something we can generate automatically? What does an ideal sitemap contain? (All pages, just the nav, other?) Is it auto-discovered or does each page have to reference it somehow?

I don't use them, so pardon the questions. If I can get a consensus on how to implement one then I'll jump on it, but if implementation is subjective it may be better to wait for plugins.

lukewatts commented 7 years ago

For the time being you can use a sitemap generator such as XML Sitemaps.

I would say this is best left to plugins so SEO specialists can implement this functionality and maintain it separately. SEO is ever-changing, so keeping up with it in core could be a pain. Just my opinion.

juan-manuel-alberro commented 7 years ago

IMHO sitemap generation is core functionality in every CMS. We could implement something like https://www.npmjs.com/package/sitemap; it would add a new endpoint for the sitemap and that's it.
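To make "a new endpoint for the sitemap" concrete, here is a minimal sketch of the XML such an endpoint would return, following the sitemaps.org `urlset` format. It deliberately does not use the `sitemap` package linked above. The `buildSitemap` and `escapeXml` helpers, the `{ loc, lastmod }` entry shape, and the example.com URLs are all hypothetical and are not part of Postleaf.

```javascript
// Escape the five characters that must be entity-encoded in XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a <urlset> document from an array of { loc, lastmod } entries.
// The entry shape is an assumption made for this sketch, not Postleaf's data model.
function buildSitemap(entries) {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `<lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : '';
    return `  <url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
  });

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
  ].join('\n');
}

// Illustrative data only.
const sitemapXml = buildSitemap([
  { loc: 'https://example.com/', lastmod: '2017-01-15' },
  { loc: 'https://example.com/search?q=a&b=c' },
]);

console.log(sitemapXml);
```

A route handler would only need to send this string with an XML content type; `lastmod` is optional in the protocol, so entries without it are still valid.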

claviska commented 7 years ago

I'm not worried about how to implement it on the technical side. I need to know what a sitemap should look like for 99% of sites to determine if it really belongs in core.

If there's a standardized, unopinionated way to create them, let's do it. If implementation varies wildly based on individual preference, it probably belongs in a plugin.

I'm not going to get into an argument over SEO best practices here. If that's the case, the feature is likely too opinionated to add to core. Right now, Postleaf already supports JSON-LD, OpenGraph, and Twitter Cards and none of my sites have trouble indexing things. 🤷‍♂️

claviska commented 7 years ago

Ghost's implementation looks like this: https://blog.ghost.org/sitemap.xml

They're using sitemap.xml as an index with the following groups:

sitemap-pages.xml
sitemap-posts.xml
sitemap-authors.xml
sitemap-tags.xml
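For reference, an index like the one described above can be sketched as a `sitemapindex` document, the format the sitemaps.org protocol defines for this purpose. The `buildSitemapIndex` helper and the example.com base URL are hypothetical; the file names simply mirror the four groups listed above, and the sketch assumes they contain no characters that need XML escaping.

```javascript
// Build a <sitemapindex> document that points at child sitemap files.
// `baseUrl` and `files` are illustrative inputs, not Ghost or Postleaf APIs.
function buildSitemapIndex(baseUrl, files) {
  const items = files.map(
    (file) => `  <sitemap><loc>${baseUrl}/${file}</loc></sitemap>`
  );

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    '</sitemapindex>',
  ].join('\n');
}

const indexXml = buildSitemapIndex('https://example.com', [
  'sitemap-pages.xml',
  'sitemap-posts.xml',
  'sitemap-authors.xml',
  'sitemap-tags.xml',
]);

console.log(indexXml);
```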

What's better for a Postleaf website? All posts/pages in sitemap.xml or multiple sitemaps like mentioned above?

juan-manuel-alberro commented 7 years ago

I don't think one approach is better than the other. Many WordPress plugins manage a single sitemap_index.xml that links to other sitemaps, like Ghost does. I like this implementation, and the protocol doesn't say anything about best practices.

claviska commented 7 years ago

Is there a benefit to separating the sitemap into multiple categories like I mentioned above, or would a single sitemap be fine? Does it affect how search engines index the site at all?

juan-manuel-alberro commented 7 years ago

I believe there's no benefit to splitting the sitemap into different files, other than it being easier for humans to read. Here's an example of a full sitemap with everything in the same file.

karsasmus commented 7 years ago

Max. entries of a single sitemap.xml: 50k
Max. size of a single sitemap.xml: 10 MB

It's possible to gzip a sitemap, but the max. size is still 10 MB.

It's also possible to have one sitemap.xml that acts as an index and links to other sitemap files, so it's possible to have 50k x 50k entries.
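As a rough sketch of that arithmetic, the snippet below works out how many sitemap files a given number of URLs would need and what the 50k x 50k ceiling comes to. The `planSitemaps` helper and the constant names are hypothetical; the 50,000 figures are the limits quoted above.

```javascript
// Limits quoted above: 50,000 URLs per sitemap, 50,000 sitemaps per index.
const MAX_URLS_PER_SITEMAP = 50000;
const MAX_SITEMAPS_PER_INDEX = 50000;

// Work out how many sitemap files are needed for `urlCount` URLs.
// Hypothetical helper for illustration only.
function planSitemaps(urlCount) {
  if (!Number.isInteger(urlCount) || urlCount < 0) {
    throw new RangeError('urlCount must be a non-negative integer');
  }
  const fileCount = Math.ceil(urlCount / MAX_URLS_PER_SITEMAP);
  return {
    fileCount,
    needsIndex: fileCount > 1,
    fitsInOneIndex: fileCount <= MAX_SITEMAPS_PER_INDEX,
  };
}

// Upper bound for one index file: 50,000 x 50,000 URLs.
const maxUrlsPerIndex = MAX_URLS_PER_SITEMAP * MAX_SITEMAPS_PER_INDEX;

console.log(maxUrlsPerIndex);      // 2500000000
console.log(planSitemaps(120000)); // { fileCount: 3, needsIndex: true, fitsInOneIndex: true }
```

A site with 50,000 URLs or fewer fits in a single file, so no index is needed until that threshold is crossed.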

Hope that helps a little bit.

claviska commented 7 years ago

Thanks, @karsasmus.

As of 6449441, Postleaf generates a single-page sitemap automatically. The sitemap is accessible at /sitemap.xml and rendered via /source/views/sitemap.dust. I've also added a reference to it in robots.dust (used to render robots.txt).

Like robots.dust, this template doesn't have to exist in your theme, but you can override it by including one. This gives users the ability to customize their sitemap if the default isn't acceptable for some reason.
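For readers unfamiliar with the robots.txt reference mentioned above, search engines discover a sitemap through a `Sitemap:` line containing an absolute URL. The sketch below shows roughly what such output could look like; it is not Postleaf's actual robots.dust template. The `buildRobotsTxt` helper, the allow-everything rules, and the example.com URL are assumptions made for illustration.

```javascript
// Build a robots.txt body that advertises the sitemap location.
// The User-agent/Disallow rules are an assumption (allow everything),
// not necessarily what Postleaf's robots.dust emits.
function buildRobotsTxt(siteUrl) {
  // The Sitemap directive expects an absolute URL, so resolve it against the site URL.
  const sitemapUrl = new URL('/sitemap.xml', siteUrl).href;

  return [
    'User-agent: *',
    'Disallow:',
    '',
    `Sitemap: ${sitemapUrl}`,
  ].join('\n') + '\n';
}

const robotsTxt = buildRobotsTxt('https://example.com');
console.log(robotsTxt);
```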

Pages included in the sitemap (with priority):

I don't see many users hitting 50,000 posts or the XML file being larger than 10MB. It's probably inevitable that someone will, but that edge case can be handled down the road.