Open · mjangda opened this issue 7 years ago
Sitemaps have to be regenerated when the template changes, no way around that. The trick here is to figure out how to do that efficiently.
Right now the flow is as follows (simplistic version):
Here's what I'm proposing:
We create a list of posts (basically just their URLs) that need to be included in that sitemap.
What happens if the URL structure changes? How do we get other related data like the post modified date from just the URL?
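To make that concrete, here's a minimal sketch of what storing just a URL list might look like, and where the lastmod lookup problem shows up. The meta key and function names below are hypothetical, not the plugin's current API.

```php
<?php
/*
 * Hypothetical sketch: persist a plain list of permalinks per sitemap instead
 * of the finished XML. Names here are illustrative, not existing plugin code.
 */
function msm_store_sitemap_url_list( $sitemap_post_id, array $post_ids ) {
	$urls = array_map( 'get_permalink', $post_ids );
	update_post_meta( $sitemap_post_id, 'msm_sitemap_url_list', $urls ); // hypothetical meta key
}

/*
 * Rendering would then rebuild <url> entries on the fly. The catch: from a bare
 * URL alone we can't get lastmod, so we'd need an extra lookup per entry, and
 * url_to_postid() stops resolving old entries if the permalink structure changes.
 */
function msm_render_url_entry( $url ) {
	$post_id = url_to_postid( $url );
	$lastmod = $post_id ? get_post_modified_time( 'c', true, $post_id ) : '';

	return sprintf(
		'<url><loc>%s</loc><lastmod>%s</lastmod></url>',
		esc_url( $url ),
		esc_html( $lastmod )
	);
}
```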
A lot of the tags never change, so we could hard-code them in the template vs. using SimpleXML (i.e. loc, lastmod, etc.).
Do we get any major benefits from switching to a hard-coded template? How will we maintain backwards compatibility (e.g. some filters pass in the SimpleXML object that sites use to add things like images)?
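For illustration, a rough side-by-side of the two approaches, assuming an entry-level filter that receives the SimpleXMLElement; the filter name below is a stand-in, not necessarily the real hook.

```php
<?php
// Current-style approach: build the node with SimpleXML, then let sites extend it.
function msm_build_entry_simplexml( WP_Post $post ) {
	$url = new SimpleXMLElement( '<url></url>' );
	$url->addChild( 'loc', get_permalink( $post ) );
	$url->addChild( 'lastmod', get_post_modified_time( 'c', true, $post ) );

	// Sites hook in here to add things like <image:image> nodes.
	return apply_filters( 'msm_sitemap_entry', $url, $post ); // filter name is illustrative
}

// Hard-coded template approach: cheaper string building, but filters that
// expect a SimpleXMLElement would break unless a compatibility shim is kept.
function msm_build_entry_template( WP_Post $post ) {
	return sprintf(
		'<url><loc>%s</loc><lastmod>%s</lastmod></url>',
		esc_url( get_permalink( $post ) ),
		esc_html( get_post_modified_time( 'c', true, $post ) )
	);
}
```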
Performance-wise, I think we're going to net out close to the same, but the method I'm proposing may be a bit more expensive.
This is probably the biggest thing we'll need to watch. Some of the sites using this plugin have millions of posts dating back 5/10/20 years. If the newer method is significantly slower, it may not be worth it, so it would be good to gather and compare some data as we work on this.
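One way to gather that data would be a simple timing harness run against the same set of dates for each approach. This is just a measurement sketch; the generation callbacks are placeholders for whichever code path is under test.

```php
<?php
// Time how long it takes to regenerate sitemaps for a set of dates.
function msm_time_generation( array $dates, callable $generate ) {
	$start = microtime( true );

	foreach ( $dates as $date ) {
		$generate( $date ); // regenerate the sitemap for one day using the approach under test
	}

	return microtime( true ) - $start;
}

// Usage idea: run both approaches over the same dates and compare totals.
// $old = msm_time_generation( $dates, 'msm_generate_and_store_xml' );      // placeholder callback
// $new = msm_time_generation( $dates, 'msm_generate_and_store_url_list' ); // placeholder callback
```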
Hey Mo,
A few others asked some of the same questions you did on the internal site.
For the URLs, that's just an example; we'll just need to make sure we store the right data needed, and in the post_meta, not the post_content.
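A minimal sketch of what "storing the right data" could mean, assuming a hypothetical meta key and entry structure (URL plus the fields the template needs), so rendering never has to resolve a URL back to a post:

```php
<?php
// Hypothetical: store one small array per entry with everything the template needs.
function msm_store_sitemap_entries( $sitemap_post_id, array $post_ids ) {
	$entries = array();

	foreach ( $post_ids as $post_id ) {
		$entries[] = array(
			'loc'     => get_permalink( $post_id ),
			'lastmod' => get_post_modified_time( 'c', true, $post_id ),
		);
	}

	update_post_meta( $sitemap_post_id, 'msm_sitemap_entries', $entries ); // hypothetical meta key
}
```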
Good catch about the backwards compatibility on the SimpleXML objects; I'm going to take a look at that.
Thanks
Right now, sitemap XML is generated asynchronously and stored in the database so it can be served super quickly. The downside is that any code change that modifies the XML output means all sitemaps need to be re-generated, which can be a very slow, time-consuming process on really large sites with thousands of sitemaps.
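For context, a very simplified sketch of that current flow, with illustrative function and meta-key names: a background job builds the full XML and stores it, and requests just echo the stored string.

```php
<?php
// Generation (async, e.g. via WP-Cron): build the whole XML document up front.
function msm_generate_and_store( $sitemap_post_id, array $post_ids ) {
	$xml = new SimpleXMLElement( '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>' );

	foreach ( $post_ids as $post_id ) {
		$url = $xml->addChild( 'url' );
		$url->addChild( 'loc', get_permalink( $post_id ) );
		$url->addChild( 'lastmod', get_post_modified_time( 'c', true, $post_id ) );
	}

	update_post_meta( $sitemap_post_id, 'msm_sitemap_xml', $xml->asXML() ); // stored, ready to serve
}

// Serving: nothing is built at request time; the stored string is echoed as-is.
function msm_serve( $sitemap_post_id ) {
	header( 'Content-Type: application/xml' );
	echo get_post_meta( $sitemap_post_id, 'msm_sitemap_xml', true );
}
```

This is why a template or filter change invalidates everything: the stored strings are the output, so every one of them has to be rebuilt.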
We should explore alternate ways to handle this (while maintaining backwards compat with existing actions/filters) and evaluate whether those approaches make sense.