mitodl / ol-infrastructure

Infrastructure automation code for use by MIT Open Learning
BSD 3-Clause "New" or "Revised" License

robots.txt files for draft.ocw.mit.edu, live-qa and draft-qa servers #991

Open pdpinch opened 2 years ago

pdpinch commented 2 years ago

User Story

Users don't want Google search results that point to live-qa.ocw.mit.edu or any of the other QA and draft servers.
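For reference, a minimal sketch of what a "block everything" robots.txt could look like, checked with Python's standard-library parser. The `DISALLOW_ALL` body and the `allows` helper are assumptions for illustration, not content taken from the repository. Note that robots.txt only asks crawlers not to fetch pages; it does not by itself guarantee that URLs never show up in search results.

```python
from urllib.robotparser import RobotFileParser

# Assumed policy for the QA and draft hosts: every crawler, every path.
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"


def allows(robots_body: str, user_agent: str, url: str) -> bool:
    """Return True if robots_body permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `allows(DISALLOW_ALL, "Googlebot", "https://live-qa.ocw.mit.edu/courses/")` should come back `False` for any path on the host.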

Acceptance Criteria

blarghmatey commented 1 year ago

@pdpinch is this issue still needed, or has it been completed?

pdpinch commented 1 year ago

Still needed.

https://draft.ocw.mit.edu/robots.txt returns 404
https://draft-qa.ocw.mit.edu/robots.txt returns 404
https://live-qa.ocw.mit.edu/robots.txt returns 404

https://ocw.mit.edu/robots.txt also returns 404, which is OK although perhaps not ideal.
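The check above can be repeated with a small script. This is only a sketch: the `fetch_status` and `robots_report` helper names are made up here, and the host list is copied from this comment.

```python
from typing import Callable, Dict, Iterable
from urllib.error import HTTPError
from urllib.request import urlopen

# Hosts mentioned in this issue.
HOSTS = (
    "draft.ocw.mit.edu",
    "draft-qa.ocw.mit.edu",
    "live-qa.ocw.mit.edu",
    "ocw.mit.edu",
)


def fetch_status(url: str) -> int:
    """Return the HTTP status code for url, including 4xx/5xx responses."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as error:
        # urlopen raises on error statuses; the code is still what we want.
        return error.code


def robots_report(
    hosts: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> Dict[str, int]:
    """Map each host to the status code of its /robots.txt."""
    return {host: fetch(f"https://{host}/robots.txt") for host in hosts}
```

Running `robots_report(HOSTS)` makes one request per host. The `fetch` argument is injectable so the report can be exercised without network access.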

Wassaf-Shahzad commented 1 year ago

@pdpinch @blarghmatey Would you kindly guide me a bit on where exactly to add the robots.txt? Are there any reference PRs I could take a look at?

pdpinch commented 1 year ago

@blarghmatey I'm pretty sure that https://ocw.mit.edu/robots.txt worked at some point, but it's returning a 404 now. Did we ever have the VCL code committed for creating these robots.txt responses?

Is Fastly VCL still the right place to do this?

blarghmatey commented 1 year ago

Fastly VCL is still the right place. @shaidar can help point you at the right place to make the changes.

blarghmatey commented 1 year ago

This is resolved now by https://github.com/mitodl/ol-infrastructure/commit/0d9724450ba6b7f62124ec5370186d75289c7fae

The TL;DR is that the wrong condition was attached to the robots.txt response, so it was never triggered.

blarghmatey commented 1 year ago

That commit didn't resolve the issue as expected. The robots.txt still isn't being served because of errors in the logic that Pulumi/Terraform uses to map request conditions to synthetic responses. Fixing this will likely require pulling some of that logic directly into VCL instead of relying on the cache conditions and response object parameters in the ServiceVCL definition.
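As a rough sketch of what "pulling the logic into VCL" might look like, the helper below renders two VCL fragments from Python: one for `vcl_recv` that short-circuits `/robots.txt` with an internal error status, and one for `vcl_error` that turns that status into a synthetic 200 response. The `render_robots_vcl` name, the 601 status code, and the response body are all assumptions; this is not the code in the repository, and how the fragments get attached to the service definition is left out.

```python
from typing import Dict


def render_robots_vcl(body: str, status: int = 601) -> Dict[str, str]:
    """Render VCL fragments that serve `body` as a synthetic /robots.txt.

    `status` is an arbitrary internal code (assumed here to be 601) used
    only to hand the request from vcl_recv to vcl_error.
    """
    if '"}' in body:
        # The body is embedded in a {"..."} long string, which this
        # sequence would terminate early.
        raise ValueError('body must not contain the sequence "}')

    recv = (
        'if (req.url.path == "/robots.txt") {\n'
        f"  error {status};\n"
        "}\n"
    )
    error = (
        f"if (obj.status == {status}) {{\n"
        "  set obj.status = 200;\n"
        '  set obj.response = "OK";\n'
        '  set obj.http.Content-Type = "text/plain";\n'
        f'  synthetic {{"{body}"}};\n'
        "  return(deliver);\n"
        "}\n"
    )
    return {"recv": recv, "error": error}
```

Keeping the match and the response in VCL itself means the behavior no longer depends on how conditions are mapped to response objects. A host check would still need to be added if the production site should be treated differently from the QA and draft servers.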