Open gigamorph opened 4 months ago
Noindex and nofollow directives can be inserted programmatically into pages to ask crawlers not to index them anymore. We can apply them if/when we decide we no longer want our site crawled or indexed.
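As a rough sketch of the programmatic approach (the flag name and helper below are hypothetical, not part of our codebase), a template helper could conditionally emit the robots meta tag in each page's `<head>`:

```python
# Hypothetical sketch: conditionally emit a robots meta tag in rendered pages.
BLOCK_INDEXING = True  # flip to False when we want the site crawled/indexed again


def robots_meta_tag(block_indexing: bool = BLOCK_INDEXING) -> str:
    """Return the <meta> tag to emit in a page's <head>, or '' to allow indexing."""
    if block_indexing:
        return '<meta name="robots" content="noindex, nofollow">'
    return ""


print(robots_meta_tag(True))   # emits the blocking tag
print(robots_meta_tag(False))  # emits nothing, so pages stay indexable
```

Driving this from a single flag would let us turn blocking on and off without editing every template.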
Research is done. I propose to close this issue.
Problem Definition
If we need to block search bots -- e.g. because the crawl load is too much for the system to bear -- updating robots.txt is not enough once a crawl has been initiated. The pages themselves need to carry noindex meta tags.
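Concretely, the tag would go in each page's `<head>` (for non-HTML resources, the equivalent `X-Robots-Tag` HTTP response header can be used instead):

```html
<head>
  <!-- Ask crawlers not to index this page or follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
```

Note that crawlers only see this tag on pages they are still allowed to fetch, so the pages must not simultaneously be disallowed in robots.txt.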
Research
Expected Behavior
For now we will simply put in the noindex tag, and comment it out when we want to allow indexing and caching of crawled data.
Related