c4software / python-sitemap

Mini website crawler to make sitemap from a website.
GNU General Public License v3.0

Add support for sitemap index #65

Closed by jswilson 3 years ago

jswilson commented 4 years ago

As first mentioned in this issue, sitemaps over 50,000 URLs should be split into multiple sitemap files, with a single master index file pointing to all of the sitemap files. This PR adds support for outputting a sitemap index and multiple sitemap files.
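For readers unfamiliar with the layout described above, here is a minimal sketch of the idea, not the code from this PR. It assumes a plain list of URLs and uses hypothetical helper names (`chunk_urls`, `render_sitemap`, `render_sitemap_index`); the 50,000 limit and the `urlset` / `sitemapindex` elements come from the sitemaps.org protocol.

```python
from xml.sax.saxutils import escape

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items.

    Hypothetical helper for illustration only.
    """
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_sitemap(urls):
    """Render one <urlset> document for a single chunk of URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</urlset>\n"
    )


def render_sitemap_index(sitemap_urls):
    """Render the master index that points at every sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )
```

A caller would render each chunk with `render_sitemap`, publish the results under names of its choosing, and pass the public URLs of those files to `render_sitemap_index`. How the files are named and where they are written is left open here, since that depends on the options this PR actually adds.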

Garrett-R commented 4 years ago

I personally like your choice of requiring the --as-index flag, but I'm curious to hear @c4software's take on that.

One idea: if the sitemap ends up with 50K+ links and the user ran without --as-index, you could print a note to the terminal at the end of the run, with a link to some educational resource on why they should consider splitting it into multiple sitemaps with --as-index.
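A rough sketch of what such an end-of-run note might look like, assuming the crawler knows the final URL count and whether --as-index was passed. The function name `oversize_note` and the message wording are hypothetical, and the sitemaps.org protocol page is used here only as one possible resource to link to.

```python
import sys

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000


def oversize_note(url_count, as_index):
    """Return a hint for the user, or None when no hint is needed.

    Hypothetical helper: `url_count` is the number of URLs collected and
    `as_index` says whether the run used the --as-index flag.
    """
    if as_index or url_count <= MAX_URLS_PER_SITEMAP:
        return None
    return (
        f"Note: the sitemap contains {url_count} URLs, which is over the "
        f"{MAX_URLS_PER_SITEMAP} per-file limit. Consider re-running with "
        "--as-index to split it into multiple sitemaps. "
        "See https://www.sitemaps.org/protocol.html"
    )


def print_note_if_needed(url_count, as_index):
    """Print the hint to stderr so it does not mix with sitemap output."""
    note = oversize_note(url_count, as_index)
    if note is not None:
        print(note, file=sys.stderr)
```

Writing to stderr is a design choice in this sketch: it keeps the note out of the way if the sitemap itself is written to stdout.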

c4software commented 4 years ago

I'm quite busy right now.

I will also take a look at it this week.

Sorry for the delay.

c4software commented 3 years ago

Seems great to me. @Garrett-R, can you also confirm?

Garrett-R commented 3 years ago

Yup, looks good to me, and I've used this branch already.