RCheesley closed this issue 5 years ago
I tried to set the X-Robots-Tag header via .htaccess, but it's not working, so for now I've set a disallow rule in robots.txt via a plugin. I've also verified the domain in Google Search Console.
Ultimately we should fix the X-Robots-Tag issue and make the header depend on the domain name: e.g. if the host is staging.mautic.org, set it to not index; if it's www, allow indexing (maybe restricting certain file types if we need to).
I think it may need a glance from @dbhurley to double-check the server setup.
What about robots.txt? http://staging.mautic.org/robots.txt looks like it should disallow crawling, but I still find a lot of results in Google.
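To make the per-domain idea concrete, here is a minimal sketch of the decision logic in Python. It is not the actual server configuration: the helper name `x_robots_tag_for` and the `NOINDEX_HOSTS` set are hypothetical, and the real fix would probably live in the web server config rather than in application code.

```python
# Hypothetical helper: pick an X-Robots-Tag value from the request's Host header.
# NOINDEX_HOSTS is an assumed list; only staging.mautic.org is named in this thread.
NOINDEX_HOSTS = {"staging.mautic.org"}


def x_robots_tag_for(host):
    """Return the X-Robots-Tag value for a host, or None to send no header."""
    # Normalise the Host header: drop any port and ignore case.
    hostname = host.split(":")[0].strip().lower()
    if hostname in NOINDEX_HOSTS:
        return "noindex, nofollow"
    # Any other host (for example www) gets no header and stays indexable.
    return None
```

Matching on an explicit set of hostnames means a new staging or dev host has to be added on purpose, so the production site cannot be de-indexed by accident.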
I only just set disallow yesterday. It had been indexed for ages.
We really need to set the X-Robots-Tag header on the staging site; robots.txt isn't the best way to do it.
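For context: a robots.txt disallow only stops crawling, so URLs that were already indexed can linger in results, whereas a noindex directive in the X-Robots-Tag header asks search engines to drop the page. A rough way to check whether a response carries that directive is sketched below. The function name `blocks_indexing` is hypothetical, and the parsing is deliberately simple: it ignores the form of the header that is prefixed with a crawler name.

```python
def blocks_indexing(headers):
    """Return True if the response headers ask search engines not to index."""
    value = ""
    for name, header_value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            value = header_value
    # Directives are comma-separated, e.g. "noindex, nofollow".
    directives = {d.strip().lower() for d in value.split(",")}
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives
```

For example, `blocks_indexing({"X-Robots-Tag": "noindex, nofollow"})` is true, while a response without the header is treated as indexable.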
I'll follow this one up with David.
On Tue, 12 Mar 2019, 08:37 Zdeno Kuzmany, notifications@github.com wrote:
What about robots.txt? http://staging.mautic.org/robots.txt I see should disallow, but I find a lot of results in google.
Perfect. Good to know.
While this is completed, it makes more sense to block at the server level for any staging/dev sites. I will close this issue and open a second issue to follow up.
Need to prevent the staging site from being indexed by search engines.
Ideally we should use an X-Robots-Tag header, but we should have a robots.txt as a fallback until that can be resolved.
The site needs to be de-indexed, so we need to add it to Search Console for ongoing management.
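As a sketch of the robots.txt fallback, the snippet below uses Python's standard `urllib.robotparser` to confirm that a blanket disallow blocks crawlers. The `ROBOTS_TXT` content is an assumption about what the fallback would contain, not a copy of the file actually deployed on staging, and `is_crawl_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumed fallback content: block every crawler from the whole staging site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the content above, `is_crawl_allowed(ROBOTS_TXT, "http://staging.mautic.org/", "Googlebot")` should come back false. This only confirms that crawling is blocked; pages that are already indexed still need to be removed through Search Console.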