noisebridge / infrastructure

The Noisebridge Infrastructure

Update robots.txt #338

Closed jof closed 1 year ago

jof commented 1 year ago

Add some additional aliases for the 86 page. Set Noindex: as well as Disallow:

jof commented 1 year ago

I went to check the last PR that added this file to Ansible, and I found a robots.txt already existing with this content.

I think these other paths may once have mapped to that page, but we've since tightened up our redirect game.

As for Noindex:, it seems like a bit of a distraction; I can't imagine a use case where we would want to allow a robot, but not allow indexing.
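To illustrate the point about Noindex:, here is a minimal sketch using Python's standard-library robots.txt parser, which recognizes Disallow: but silently skips lines it does not understand, such as Noindex:. The /86 and /Hackerspace paths are hypothetical stand-ins rather than the actual entries from this PR, and other crawlers may treat unknown directives differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; "/86" stands in for the real paths.
ROBOTS_WITH_NOINDEX = """\
User-agent: *
Disallow: /86
Noindex: /86
"""

ROBOTS_DISALLOW_ONLY = """\
User-agent: *
Disallow: /86
"""


def allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Both variants give the same answers: the Noindex: line changes nothing here.
for path in ("/86", "/Hackerspace"):
    print(path, allowed(ROBOTS_WITH_NOINDEX, path), allowed(ROBOTS_DISALLOW_ONLY, path))
```

Under this parser the two files behave identically, which is consistent with dropping the Noindex: lines.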

The other, larger question I'm looking at is why Caddy isn't serving robots.txt at all, and instead just returns an HTTP 200 with Content-Length: 0.

I'll plan to drop the Noindex: lines.
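The empty HTTP 200 response matters because a crawler that receives it sees no rules at all. The following is a rough sketch of how one might classify a robots.txt response. The function names and return labels are made up for illustration, and the 4xx/5xx handling follows my reading of the Robots Exclusion Protocol (RFC 9309), not any particular crawler's behavior.

```python
import urllib.error
import urllib.request


def fetch(url: str, timeout: float = 10.0) -> tuple[int, bytes]:
    """Return (status, body) for a GET request, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error object still carries the response.
        return err.code, err.read()


def diagnose(status: int, body: bytes) -> str:
    """Classify a robots.txt response by how a crawler would likely read it."""
    if 200 <= status < 300:
        # An empty body contains no rules, so nothing is disallowed.
        return "rules served" if body.strip() else "empty: no rules, everything crawlable"
    if 400 <= status < 500:
        return "unavailable: crawlers may assume no restrictions"
    if status >= 500:
        return "unreachable: crawlers should assume everything is disallowed"
    return "unexpected status"


# The case described above: HTTP 200 with Content-Length: 0.
print(diagnose(200, b""))
# A response that actually carries rules ("/86" is a hypothetical path).
print(diagnose(200, b"User-agent: *\nDisallow: /86\n"))
```

Against a live server, something like `diagnose(*fetch("https://www.noisebridge.net/robots.txt"))` would show which case applies.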

hzeller commented 1 year ago

I also remember that there was a robots.txt that disallowed 86 in the past, and it was served on the Noisebridge wiki. Maybe something broke in the meantime that keeps the robots.txt from being served?

hzeller commented 1 year ago

The last capture with an intact robots.txt on archive.org was on March 14th, 2021:

https://web.archive.org/web/20210314182224/https://www.noisebridge.net/robots.txt

Maybe that helps narrow down when it broke.

jof commented 1 year ago

Nice find with the Wayback Machine.

I think it probably broke somewhere around when we switched to using Caddy as a server. The configuration was missing a file_server directive, so static files were not being served. I updated the PR with some configuration that seems like it should work for us.

jof commented 1 year ago

> Sounds good. Shouldn't there be something that makes the static robots.txt file appear? I have no idea about Caddy configuration, but hide *.php does not necessarily sound like it.

Confusingly, it seems very permissive by default; the file_server directive together with root will serve any file it can find. I just don't want to ever accidentally serve PHP files as source, so I'm thinking I should block that from ever happening.
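One way to check for the PHP-source concern after a config change is to look for PHP opening tags in a response body. This is only a heuristic sketch: the function name and the sample bodies are hypothetical, and it says nothing about how Caddy itself decides what to serve.

```python
# Byte sequences that open a PHP block; seeing one in a response body
# suggests the file was served as-is instead of being executed.
PHP_OPEN_TAGS = (b"<?php", b"<?=")


def leaks_php_source(body: bytes) -> bool:
    """Return True if the body appears to contain raw, unexecuted PHP."""
    lowered = body.lower()
    return any(tag in lowered for tag in PHP_OPEN_TAGS)


# Hypothetical response bodies for illustration.
raw_source = b"<?php\n$wgSitename = 'Example';\n"
rendered_page = b"<!DOCTYPE html><html><body>Hello</body></html>"

print(leaks_php_source(raw_source))     # True
print(leaks_php_source(rendered_page))  # False
```

Running this against the body returned for a known .php path would flag the case where the source is handed out verbatim.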