Open gigamorph opened 4 months ago
Noindex and nofollow directives can be inserted programmatically into pages to ask crawlers not to index them anymore. We can apply them if/when we decide we no longer want our site crawled or indexed.
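As a rough sketch of the programmatic approach (the flag name and helper below are hypothetical, not part of our codebase), a template helper could conditionally emit the robots meta tag in each page's `<head>`:

```python
# Hypothetical sketch: conditionally emit a robots meta tag in rendered pages.
BLOCK_INDEXING = True  # flip to False when we want the site crawled/indexed again


def robots_meta_tag(block_indexing: bool = BLOCK_INDEXING) -> str:
    """Return the <meta> tag to emit in a page's <head>, or '' to allow indexing."""
    if block_indexing:
        return '<meta name="robots" content="noindex, nofollow">'
    return ""


print(robots_meta_tag(True))   # emits the blocking tag
print(robots_meta_tag(False))  # emits nothing, so pages stay indexable
```

Driving this from a single flag would let us turn blocking on and off without editing every template.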
Research is done. I propose to close this issue.
Problem Definition
If we need to block search bots -- e.g. because the crawl load is too much for the system to bear -- updating robots.txt is not enough once a crawl has been initiated. The pages themselves need to carry noindex meta tags.
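Concretely, the tag would go in each page's `<head>` (for non-HTML resources, the equivalent `X-Robots-Tag` HTTP response header can be used instead):

```html
<head>
  <!-- Ask crawlers not to index this page or follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
```

Note that crawlers only see this tag on pages they are still allowed to fetch, so the pages must not simultaneously be disallowed in robots.txt.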
Research
Expected Behavior
For now we will simply put in the noindex tag, and comment it out when we want to allow indexing and caching of crawled data.
Related