Closed jswilson closed 3 years ago
I personally like your choice of requiring the --as-index flag, but I'm curious to hear @c4software's take on that.
One idea: if the sitemap ends up with 50K+ links and they ran it without --as-index, then at the end of the run you could output a note to the terminal with a link to some educational resource on why they should consider breaking it into multiple sitemaps with --as-index.
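A minimal sketch of that end-of-run note, in case it helps the discussion. The helper names (`oversize_note`, `print_oversize_note`) and the message wording are hypothetical and not taken from this branch; only the --as-index flag and the 50,000 URL threshold come from the PR.

```python
import sys

# Per-file URL limit discussed in this PR.
MAX_URLS_PER_SITEMAP = 50_000


def oversize_note(url_count, as_index):
    """Return a hint for the user, or None when no hint is needed."""
    if as_index or url_count <= MAX_URLS_PER_SITEMAP:
        return None
    return (
        f"Note: the sitemap contains {url_count} URLs, more than the "
        f"{MAX_URLS_PER_SITEMAP} recommended per file. Consider re-running "
        "with --as-index to split it into multiple sitemaps."
    )


def print_oversize_note(url_count, as_index):
    # Written to stderr so the note never mixes with a sitemap printed to stdout.
    note = oversize_note(url_count, as_index)
    if note is not None:
        print(note, file=sys.stderr)
```

Calling `print_oversize_note(len(urls), args.as_index)` once after the crawl finishes would be enough; a link to an educational resource could be appended to the message.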
I'm quite busy right now.
I will also take a look at it this week.
Sorry for the delay.
Seems great to me. @Garrett-R, did you also confirm?
Yup, looks good to me, and I've used this branch already.
As first mentioned in this issue, sitemaps over 50,000 URLs should be split into multiple sitemap files, with a single master index file pointing to all of the sitemap files. This PR adds support for outputting a sitemap index and multiple sitemap files.
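For readers unfamiliar with the layout, the splitting described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names, the `sitemap-N.xml` naming scheme, and the `base_url` parameter are all hypothetical. The XML element names follow the sitemaps.org protocol.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000
XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def render_sitemap(urls):
    """Render one <urlset> document for a single chunk of URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return f'{XML_HEADER}<urlset xmlns="{XMLNS}">\n{entries}</urlset>\n'


def render_index(sitemap_urls):
    """Render a <sitemapindex> document pointing at each child sitemap."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return f'{XML_HEADER}<sitemapindex xmlns="{XMLNS}">\n{entries}</sitemapindex>\n'


def build_files(urls, base_url, stem="sitemap", size=MAX_URLS_PER_SITEMAP):
    """Return {filename: xml_text} for the index and every child sitemap."""
    files = {}
    child_urls = []
    for i, part in enumerate(chunk(urls, size)):
        name = f"{stem}-{i}.xml"  # hypothetical naming scheme
        files[name] = render_sitemap(part)
        child_urls.append(f"{base_url.rstrip('/')}/{name}")
    files[f"{stem}.xml"] = render_index(child_urls)
    return files
```

The index has to reference each child sitemap by URL, which is why the sketch needs to know where the files will be served from.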
- Right now, --output is not a required parameter; but outputting an index and multiple sitemap files without writing them to files isn't quite sensible. The index contains pointers to files, so what would be the contents of the index when no output file is specified? Because of this, I'm currently requiring --output when using the new --as-index flag.
- In order to output an index, you have to include the --as-index flag. If you don't include the --as-index flag, then the sitemap will be written to a single file, even if there are more than 50,000 URLs. My thinking was this would maintain backward compatibility; presumably everyone using the library right now is happy with the output, so why change it? Another possibility would be to take this away as an option and just always write an index if there are more than 50,000 URLs, since this would be in line with the specification. If we do go with this, we would likely need to make --output required due to the first bullet described above.
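The "--as-index requires --output" rule from the first bullet could be enforced at argument-parsing time along these lines. This is a sketch using the standard `argparse` module; the parser setup, help strings, and function names are assumptions for illustration and may not match how the project actually defines its options.

```python
import argparse


def build_parser():
    # Only the two flags discussed in this PR are modelled here.
    parser = argparse.ArgumentParser(description="Sitemap crawler (illustrative)")
    parser.add_argument("--output", help="file to write the sitemap to")
    parser.add_argument(
        "--as-index",
        action="store_true",
        help="write a sitemap index plus multiple sitemap files",
    )
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # An index points at files, so it cannot be produced without an output path.
    if args.as_index and not args.output:
        parser.error("--as-index requires --output")
    return args
```

`parser.error` prints the usage message and exits with status 2, so the user sees the problem before any crawling starts. If the alternative of always writing an index above 50,000 URLs were chosen instead, the check would have to move to the end of the run or --output would have to become required.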