a-kyne closed this issue 3 days ago
Hi @a-kyne - what kind of priority/urgency does this issue need, please?
@stevejalim I was just talking Sitemap alternatives with @pmac so I'm going to close this in favor of a different approach.
If I can figure out how to close an issue.
There's this list to exclude from sitemap: https://github.com/mozilla/bedrock/blob/11024c649edcd896d49c519b77d0036d7a66ee71/bedrock/settings/base.py#L478-L520
Then there are robots.txt exclusions: https://github.com/mozilla/bedrock/blob/11024c649edcd896d49c519b77d0036d7a66ee71/bedrock/mozorg/templates/mozorg/robots.txt#L5-L10
And inline noindex meta tags are in 26 files:
https://github.com/search?q=repo%3Amozilla%2Fbedrock+noindex+language%3AHTML&type=code&l=HTML
(some already covered by the above, some not…)
These do not always overlap 100% though.
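To make the overlap problem concrete, here is a rough sketch of how the three sources could be compared once each has been reduced to a set of URL paths. The function name `exclusion_gaps`, the result keys, and the exact-path matching are all assumptions for illustration (real robots.txt rules are prefix patterns, and none of this is existing bedrock code).

```python
def exclusion_gaps(sitemap_excluded, robots_disallowed, noindex_pages):
    """Report where three exclusion lists disagree.

    Hypothetical helper: every argument is an iterable of URL paths,
    compared by exact match (an assumption; robots.txt uses prefixes).
    """
    sitemap_excluded = set(sitemap_excluded)
    robots_disallowed = set(robots_disallowed)
    noindex_pages = set(noindex_pages)
    return {
        # Pages tagged noindex that would still be listed in the sitemap.
        "noindex_but_in_sitemap": sorted(noindex_pages - sitemap_excluded),
        # Pages left out of the sitemap although nothing else hides them.
        "excluded_but_indexable": sorted(
            sitemap_excluded - noindex_pages - robots_disallowed
        ),
        # Pages blocked in robots.txt whose noindex tag crawlers cannot read.
        "robots_blocked_and_noindex": sorted(robots_disallowed & noindex_pages),
    }
```

The first bucket corresponds to the mismatch this issue reports; the other two are included only to show that the lists can disagree in more than one direction.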
The not-always-overlapping bit is one thing we're trying to solve for. We're looking at potentially removing the sitemap altogether, as we're not convinced it's helping much with anything. That way we'd be able to just keep the noindex tags and have no conflict with anything else.
Description
There are pages listed in the XML Sitemap that have the noindex tag; those URLs should not be included.
It would probably be best if we do not rely on URL exclusion processes that are vulnerable to human error, e.g. manually adding URLs to a “do not include” list.
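One way to avoid a hand-maintained list would be an automated check that reads the sitemap and flags any listed URL whose page carries a robots noindex meta tag. The following is only a minimal sketch using the Python standard library. The names `has_noindex`, `sitemap_urls`, and `noindex_urls_in_sitemap` are hypothetical, as is the `fetch_html` callable, which stands in for however page HTML would actually be retrieved. It does not describe how bedrock builds its sitemap.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


class RobotsMetaParser(HTMLParser):
    """Records whether a page has a robots meta tag containing noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        is_robots = attr.get("name", "").lower() == "robots"
        if is_robots and "noindex" in attr.get("content", "").lower():
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains <meta name="robots" content="...noindex...">."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def sitemap_urls(sitemap_xml):
    """Return every <loc> value from a sitemap <urlset> document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def noindex_urls_in_sitemap(sitemap_xml, fetch_html):
    """List sitemap URLs whose page is tagged noindex.

    fetch_html is a caller-supplied function (hypothetical here) that maps
    a URL to that page's HTML, e.g. via an HTTP client or a test client.
    """
    return [url for url in sitemap_urls(sitemap_xml) if has_noindex(fetch_html(url))]
```

A check like this could run in CI and fail when the returned list is non-empty. It only looks at meta tags, so a noindex sent through an `X-Robots-Tag` response header would need separate handling.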
Steps to reproduce
Expected result
No submitted URLs are excluded from indexing by a noindex tag.
Actual result
101 submitted URLs cannot be indexed because they are excluded by a noindex tag.
Environment
n/a