Closed: dantheta closed this issue 6 years ago
Unfortunately it's not so straightforward. We check robots.txt on the server once before sending the request to the probes; the probes themselves take the server's word for it and run the test.
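For illustration, the server-side check described above could look roughly like this minimal sketch using Python's standard-library `urllib.robotparser` (the function name and user-agent string are hypothetical, not the project's actual code):

```python
from urllib import robotparser

def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "probe") -> bool:
    # Parse the robots.txt the server fetched once, and decide whether
    # the probes may be sent this URL. The probes themselves never
    # re-check; they take this answer as given.
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)
```

The single parse happens server-side, which is why a misleading robots.txt at the target can affect the whole run.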
OK. In any case, I suppose a robots.txt could be replicated at a blocked URL for some reason and fool us, since we are not comparing page content.
Thought on this: from our perspective, a "no index" result is the same as a "not blocked" result. Couldn't we display these sites as both "no index" and "not blocked"? (And simply not index the content as requested?)
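The labelling idea above could be sketched like this (all names here are hypothetical, just to show a "no index" result being displayed as "not blocked" while the content is excluded from indexing):

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    url: str
    blocked: bool   # did the probe find the URL blocked?
    noindex: bool   # did the site's robots.txt ask not to be indexed?

def display_labels(result: ProbeResult) -> list:
    # Show "no index" alongside the blocked/not-blocked verdict,
    # rather than treating it as a separate outcome.
    labels = ["blocked" if result.blocked else "not blocked"]
    if result.noindex:
        labels.append("no index")
    return labels

def should_index(result: ProbeResult) -> bool:
    # Honour the robots.txt request: simply don't index the content.
    return not result.noindex
```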