Closed: kfogel closed this issue 9 years ago
See this article about the Robots exclusion standard: http://en.wikipedia.org/wiki/Robots_exclusion_standard.
Essentially, the robots.txt file instructs web robots/crawlers not to look at certain pages or directories. It is purely advisory: well-behaved crawlers honor it, but nothing enforces it.
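For reference, a minimal robots.txt might look like the sketch below (the `/private/` path is just an illustrative placeholder, not a real directory in this project):

```
# Apply to all crawlers
User-agent: *
# Ask crawlers to skip this directory
Disallow: /private/
```

An empty `Disallow:` line instead would permit crawling of everything, which is also a valid choice if we just want to stop 404s for /robots.txt.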
I looked at the commit. We need the text for the robots.txt file itself, though :-). That is, the fix to this issue is actually creating a robots.txt
in the top level of the TTM tree, and making sure it gets served correctly when "/robots.txt" is requested by a client.
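One way to make sure "/robots.txt" gets served, if the file doesn't end up directly under the document root, is an `Alias` in the Apache config. This is only a sketch; the filesystem path here is an assumption for illustration, not the actual TTM deployment layout:

```apache
# Hypothetical path: adjust to wherever robots.txt lives in the TTM tree
Alias /robots.txt /var/www/ttm/robots.txt
<Location "/robots.txt">
    Require all granted
</Location>
```

If robots.txt sits at the top of the served tree already, no config change is needed at all.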
Sorry! The first commit on that branch adds the file. I split the work into two commits because I wanted to edit the markup for the INSTALL file on GitHub: the first commit on the branch adds robots.txt, and the second one contains the edits to INSTALL.md :)
Oh! Thanks.
The commit messages should reference the relevant issue number(s) -- that's what threw me :-). (That's a pretty important general principle everywhere: the bidirectional link between commit and issue is key for making things reviewable, and for forensic analysis when necessary.)
Meg, would you like to merge this to master or have @kfogel do it? Either way is fine.
Pulled this to production today and it seems to be working fine. I did not make the changes to the apache config file recommended in INSTALL.md.
We don't have a robots.txt file yet. I suspect that's why we see entries like this in /var/log/apache2/ttm_error.log:
We should have a standard robots.txt, obviously.