canonical / ubuntu.com

The official website for the Ubuntu operating system
https://ubuntu.com

Prevent search indexing of staging server #67

Closed WillMoggridge closed 8 years ago

WillMoggridge commented 8 years ago

Look into methods to prevent search bots indexing the staging site.

A good initial solution would be to add the robots.txt Host directive. https://en.wikipedia.org/wiki/Robots_exclusion_standard#Host
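A rough sketch of what a staging `robots.txt` could look like, and a way to check it from a script. Everything named here is an assumption for illustration: the helper names, `staging.example.com`, and the choice to pair `Host` with a blanket `Disallow: /`. Note that `Host` is a non-standard extension that only some crawlers have honoured, and that `robots.txt` only asks well-behaved crawlers not to fetch pages; it is not access control. Python's standard `urllib.robotparser` ignores the `Host` line.

```python
from urllib.robotparser import RobotFileParser


def staging_robots_txt(preferred_host: str) -> str:
    """Build a robots.txt body for a staging site (hypothetical helper).

    `preferred_host` is the production domain that crawlers should
    treat as the canonical one, e.g. "ubuntu.com".
    """
    return "\n".join(
        [
            "User-agent: *",
            "Disallow: /",  # ask every crawler to stay away from staging
            f"Host: {preferred_host}",  # non-standard; ignored by many crawlers
            "",
        ]
    )


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    body = staging_robots_txt("ubuntu.com")
    print(body)
    # staging.example.com is a placeholder, not the real staging host.
    print(is_crawlable(body, "https://staging.example.com/desktop"))
```

A check like `is_crawlable` could run as part of a deploy smoke test, so that a production `robots.txt` accidentally shipped to staging (or the reverse) gets noticed.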

nottrobin commented 8 years ago

There are two other possible options which would involve using a different Apache configuration for the staging server:

nottrobin commented 8 years ago

I am uneasy about using a different Apache configuration for staging, because the point of the staging server is to catch errors that would otherwise appear on production, and these errors may well be introduced in the Apache configuration. It would be better to keep using the same configuration file and use conditionals to override some settings for staging.

I'm also worried about putting the whole staging server behind SSO unless there's a good reason, because this will inevitably be a significant difference in how the site is hosted. It will also mean that we can't curl the staging server, or do other scripting to test the staging server, without some very in-depth authentication work.

nottrobin commented 8 years ago

I just thought of another option, which may be the winner. We could ask IS to update the firewall so the staging IPs are only accessible from within the VPN. This would mean anyone who wants to look at the staging server will need VPN access, but I guess this shouldn't be too much of an overhead?
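To make the intent of the firewall rule concrete, here is a minimal sketch of the allowlist logic in Python. The actual rule would live in whatever firewall IS manages; the `10.8.0.0/16` range and the function name are made up for illustration and are not the real VPN range.

```python
import ipaddress

# Hypothetical VPN address range; the real one would come from IS.
VPN_NETWORKS = [ipaddress.ip_network("10.8.0.0/16")]


def is_allowed(client_ip: str) -> bool:
    """Return True if the client address falls inside a VPN network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in VPN_NETWORKS)


if __name__ == "__main__":
    print(is_allowed("10.8.3.7"))     # inside the assumed VPN range
    print(is_allowed("203.0.113.5"))  # outside it, e.g. a search crawler
```

Because crawlers never reach the server under a rule like this, staging can keep the same Apache configuration and `robots.txt` as production.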

nottrobin commented 8 years ago

Make all staging environments private

nottrobin commented 8 years ago

This is now being solved by updating the firewall to restrict the staging IPs to internal access: https://portal.admin.canonical.com/89283. As it's not related to this project specifically and is being tracked in Trello, I'm going to close this issue now.