Open thescientist13 opened 6 months ago
For now a couple ways to implement this manually could be to:
greenwood build
step, read the contents of graph.json in the output directory and generate the file
For 2, would it be a copy plugin? i.e., the plugin would generate a temporary file, then pass
{
from: tempPath,
to: new URL(`sitemap.xml`, outputDir)
}
@jstockdi
Greenwood should automatically generate a graph.json file for you, which will be available in the output directory after running greenwood build (it's technically there too during development in the .greenwood/ tmp folder).
So after running greenwood build, a simple Node script should suffice
// sitemap-gen.js
import fs from 'node:fs';
import graph from './public/graph.json' with { type: 'json' };

// build one <url> entry per page in the graph
const urls = graph.map((page) => {
  return `  <url>
    <loc>http://www.example.com${page.route}</loc>
  </url>`;
}).join('\n');

// the XML declaration has to be the very first thing in the file, so no leading newline
fs.writeFileSync('./public/sitemap.xml', `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${urls}
</urlset>
`);
# after running Greenwood build, or add to your npm scripts...
$ node sitemap-gen.js
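One thing the script above does not handle is escaping: the sitemap protocol expects URLs to be entity-escaped, so a route containing `&` or similar characters would produce invalid XML. A minimal sketch of how that could be handled, where `escapeXml` and `toUrlEntry` are hypothetical helper names and not part of Greenwood:

```javascript
// Hypothetical helpers (not Greenwood APIs) for building safe <url> entries.

// Replace the five characters that XML treats as special.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Resolve a route against the site origin and wrap it in a <url> entry.
function toUrlEntry(origin, route) {
  const loc = new URL(route, origin).href;

  return `  <url>\n    <loc>${escapeXml(loc)}</loc>\n  </url>`;
}
```

In the script, the `graph.map(...)` callback could then return `toUrlEntry('http://www.example.com', page.route)` instead of interpolating the route directly.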
edit: sorry, I think you were referencing option 1, in which case yes, a copy plugin would do the trick, e.g.
function myCopySitemapPlugin() {
  return {
    type: 'copy',
    name: 'plugin-copy-sitemap',
    provider: (compilation) => {
      const filename = 'sitemap.xml';
      const { userWorkspace, outputDir } = compilation.context;

      // backticks are needed here so that ${filename} is actually interpolated
      return [{
        from: new URL(`./${filename}`, userWorkspace),
        to: new URL(`./${filename}`, outputDir)
      }];
    }
  };
}
Otherwise, to generate dynamically for now, the above script sample should also work. 🎯
Actually, I was thinking use a copy plugin...
Read the graph, write a dynamic file to scratch, then copy to final.
// needs `import fs from 'node:fs';` at the top of the file
const greenwoodPluginSitemap = [{
  type: 'copy',
  name: 'plugin-copy-sitemap',
  provider: async (compilation) => {
    const fileName = 'sitemap.xml';
    const { outputDir, scratchDir } = compilation.context;
    // assumption: the page graph is available as compilation.graph
    const urls = compilation.graph.map((page) => {
      return `  <url>
    <loc>http://www.example.com${page.route}</loc>
  </url>`;
    }).join('\n');
    const sitemapFromUrl = new URL(`./${fileName}`, scratchDir);

    // write the generated file to scratch; the XML declaration must come first in the file
    fs.writeFileSync(sitemapFromUrl, `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${urls}
</urlset>
`);

    // then let the copy plugin move it to the final output directory
    return [{
      from: sitemapFromUrl,
      to: new URL(`./${fileName}`, outputDir)
    }];
  }
}];
So for the two different options here from a contributing perspective, here are my initial thoughts
For a static sitemap in the root workspace folder, e.g. src/sitemap.xml it should just be as simple as following one of the existing "copy" based features / plugins, like our robots.txt plugin https://github.com/ProjectEvergreen/greenwood/blob/master/packages/cli/src/plugins/copy/plugin-copy-robots.js
As for supporting a dynamic flavor of this, e.g. src/sitemap.xml.js I'm not sure I have an idea on the best way to instrument this off the top of my head, mainly for handling development vs production workflows which are slightly different.
For development, we could make a resource plugin that has a serve lifecycle that checks if the dynamic flavor exists in shouldServe, and then the serve function would be something like this?
async function shouldServe(url) {
  return url.pathname.endsWith('sitemap.xml.js');
}

async function serve(url) {
  // await import() already resolves to the module namespace, so no .then() is needed
  const { generateSitemap } = await import(url);
  const sitemap = await generateSitemap(this.compilation);

  return new Response(sitemap, { headers: { 'Content-Type': 'text/xml' } });
}
For production, we could probably just run that similar logic in serve (except just outputting a file instead of returning a Response object) in the bundle command.
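As a rough sketch of that production flavor, the same `generateSitemap` call could write its result to disk. `emitSitemap` is a hypothetical helper name, and the sketch assumes `compilation.context.outputDir` is a directory URL with a trailing slash; where this would actually hook into the bundle command is left open.

```javascript
// Sketch only: emitSitemap is a hypothetical helper, not a Greenwood API.
// Assumes compilation.context.outputDir is a directory URL ending in "/".
async function emitSitemap(generateSitemap, compilation) {
  // dynamic import keeps this sketch usable from ESM or CommonJS
  const fs = await import('node:fs/promises');
  const { outputDir } = compilation.context;

  // same call as in the serve() sketch, but the result goes to a file
  const sitemap = await generateSitemap(compilation);
  const destination = new URL('./sitemap.xml', outputDir);

  // make sure the output directory exists before writing
  await fs.mkdir(outputDir, { recursive: true });
  await fs.writeFile(destination, sitemap);

  return destination;
}
```

Keeping `generateSitemap` as the only user-facing contract would let development (a Response) and production (a file) share the same user code.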
Greenwood tests are basically black box tests. You can create an exact version of any Greenwood project + config, run the CLI, and just test the output; in either case, that means asserting that a sitemap.xml file is generated in the output folder. https://github.com/ProjectEvergreen/greenwood/tree/master/packages/cli/test/cases
We would probably want one test case for each of static and dynamic sitemaps
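Beyond checking that the file exists, a test case could also check that every expected route is listed. A minimal sketch of such a check, assuming the test has already read sitemap.xml from the output folder into a string; `findMissingRoutes` is a hypothetical helper and not part of Greenwood's test setup:

```javascript
// Hypothetical test helper: returns the routes that have no matching <loc> entry.
function findMissingRoutes(sitemapXml, routes) {
  // collect the pathname of every <loc> URL found in the sitemap
  const listed = new Set(
    [...sitemapXml.matchAll(/<loc>([^<]*)<\/loc>/g)].map((match) => new URL(match[1]).pathname)
  );

  return routes.filter((route) => !listed.has(route));
}
```

A test would then expect `findMissingRoutes(contents, expectedRoutes)` to be an empty array.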
I think for now the best place to document these would probably be in the Styles and Assets page
Summary
Called out in our Slack channel, but Greenwood should definitely have some support for sitemaps. A sitemap is an XML file used to tell search engines about the content and pages contained within a site, which matters in particular for larger sites and / or where links between pages are maybe not as consistent. https://developers.google.com/search/docs/crawling-indexing/sitemaps/overview
Here is a basic example https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap
Details
I think the approach used in Next.js is probably good enough for Greenwood, supporting either of these options
Dynamic File, e.g. sitemap.xml.js - will be provided a copy of the greenwood graph and be expected to return valid XML
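To make the dynamic option more concrete, here is a minimal sketch of what a user's sitemap.xml.js might contain. The `generateSitemap` name follows the serve sketch in this thread; receiving the whole compilation and reading pages from `compilation.graph` are assumptions, since the final contract has not been decided.

```javascript
// Hypothetical contents of src/sitemap.xml.js.
// Assumption: Greenwood calls this with the compilation, and each page in
// compilation.graph has a `route` property.
function generateSitemap(compilation) {
  const origin = 'http://www.example.com';
  const urls = compilation.graph
    .map((page) => `  <url>\n    <loc>${origin}${page.route}</loc>\n  </url>`)
    .join('\n');

  // return valid XML, with the declaration as the very first characters
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    urls,
    '</urlset>'
  ].join('\n');
}

// In a real module this would be exposed with: export { generateSitemap };
```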
Might want to wait until after #955 is merged since we might want to piggy back off any solutions there re: extending the ability for pages to be more than just markdown (.md) or JavaScript (.js).