Closed — jof closed this 1 year ago
I went to diff/check the last PR to add this file into Ansible, and I found a robots.txt already existing with this content.
I think maybe these other paths used to map to that page, but we've since tightened up our redirect game.
As for `Noindex:`, it seems like a bit of a distraction; I can't imagine a use case where we would want to allow a robot but not allow indexing.
The other, larger question I'm looking at is why Caddy isn't serving out robots.txt at all, and instead just returns an HTTP 200 with `Content-Length: 0`.
I'll plan to drop the `Noindex:` lines.
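For reference, a robots.txt with the `Noindex:` lines dropped would keep only the `Disallow:` rules; this is just an illustrative sketch, and the `/86` path stands in for the actual paths in the PR:

```
User-agent: *
Disallow: /86
```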
I also remember there was a robots.txt that disallowed the 86 page in the past, and it was served on the Noisebridge wiki. Maybe something broke in the meantime that keeps the robots.txt from surfacing?
Last capture with an intact robots.txt on archive.org was on March 14th, 2021
https://web.archive.org/web/20210314182224/https://www.noisebridge.net/robots.txt
Maybe that helps narrow down when it broke.
Nice find with the wayback machine.
I think it probably broke somewhere when we switched to using Caddy as a server. The configuration was missing a file_server directive, which prevented the static file hosting functionality. I updated the PR with some configuration that seems like it should work for us.
Sounds good. Shouldn't there be something that makes the static robots.txt file appear? I have no idea about Caddy configuration, but `hide *.php` does not necessarily sound like it.
Confusingly, it seems very permissive by default; the presence of the `file_server` statement along with the `root` will serve files if it can find them. I just don't want to ever accidentally serve out anything PHP as source, so I'm thinking I should try to block that from ever happening.
Add some additional aliases for the 86 page. Set Noindex: as well as Disallow:
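One way to sanity-check the resulting rules is Python's standard-library robots.txt parser. This is just an illustrative check; the `/86` path stands in for the 86 page and its aliases, which aren't spelled out in the thread:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; /86 is a placeholder for the
# 86 page and its aliases from the PR.
robots_txt = """\
User-agent: *
Disallow: /86
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# The disallowed path is blocked for all user agents
print(rp.can_fetch("*", "https://www.noisebridge.net/86"))  # False
# Everything else remains fetchable
print(rp.can_fetch("*", "https://www.noisebridge.net/"))    # True
```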