project-lux / lux-frontend

Web front end of LUX
Apache License 2.0

"No index" meta tag to disallow search engines (from 1923) #61

Open gigamorph opened 4 months ago

gigamorph commented 4 months ago

Problem Definition

If we need to block search bots (for example, because the crawl load is too much for the system to bear), updating robots.txt is not enough once a crawl has been initiated. The pages themselves need to carry noindex meta tags.

Research

Expected Behavior

For now we will just add the noindex tag, and comment it out when we need to allow indexing and caching of crawled data.

<meta name="robots" content="noindex, nofollow">

Related

gigamorph commented 2 months ago

Noindex and nofollow directives can be inserted programmatically into resource pages to ask crawlers to stop indexing them. We can apply this if/when we decide we no longer want our site to be crawled/indexed.
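As a rough illustration of the programmatic approach, here is a minimal TypeScript sketch. The function name `robotsMetaTag` and the `blockCrawlers` flag are hypothetical and not taken from the lux-frontend code base; the sketch only assumes that some setting tells us whether crawlers should be turned away.

```typescript
// Hypothetical helper, not part of lux-frontend: builds the robots meta tag
// that would go into the page <head>, based on a boolean flag.
function robotsMetaTag(blockCrawlers: boolean): string {
  // When crawlers are allowed, emit nothing so the page stays indexable.
  if (!blockCrawlers) {
    return "";
  }
  // Same directive as the static tag proposed in this issue.
  return '<meta name="robots" content="noindex, nofollow">';
}

// Example: the flag would presumably come from configuration (assumption).
const headFragment: string = robotsMetaTag(true);
console.log(headFragment);
```

With something like this, switching indexing back on would be a configuration change rather than commenting the tag in and out by hand.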

Research is done. I propose to close this issue.