wellcomecollection / platform-infrastructure

:building_construction: Infrastructure for the Wellcome Digital Platform
MIT License

Enable WAF blocking for IIIF apis #420

Closed kenoir closed 7 months ago

kenoir commented 7 months ago

What's changing and why?

This change follows https://github.com/wellcomecollection/platform-infrastructure/pull/418, and enables blocking of requests that match the specified WAF rules.

The results of running those rules in count mode reveal bot activity (specifically Bytespider):

[Screenshot 2024-02-14 at 13 13 53]

jamieparkinson commented 7 months ago

There is an interesting issue with this: it blocks some requests for tokens for restricted IIIF resources, for example https://iiif.wellcomecollection.org/auth/token?messageId=1&origin=http://localhost. This doesn't break real behaviour (where the origin is a public domain), but it does break end-to-end tests.

It appears that anything containing the substring localhost gets blocked by EC2MetaDataSSRF_QUERYARGUMENTS. I've just manually overridden this specific rule for the prod distribution to confirm that doing so resolves the issue. We either want to do that (easy) or add a whitelist-type rule at a higher priority for this specific request (slightly harder but arguably more betterer).
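
To make the two options concrete, here is a rough sketch of what each could look like as rule definitions in the shape the AWS WAFv2 API accepts (for example, the `Rules` list passed to boto3's `update_web_acl`). It only builds dictionaries and never calls AWS. The rule names, metric names and priorities are hypothetical, and it assumes the blocking rule comes from the `AWSManagedRulesCommonRuleSet` managed group; the actual config in this repository may be structured differently.

```python
# Sketch only: builds WAFv2-style rule dicts; nothing here talks to AWS.

MANAGED_GROUP = "AWSManagedRulesCommonRuleSet"  # assumption: group containing the rule
BLOCKING_RULE = "EC2MetaDataSSRF_QUERYARGUMENTS"


def visibility(metric_name):
    """Visibility settings that WAFv2 requires on every rule."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def managed_group_with_override(priority):
    """Option 1: the group keeps blocking, but the one rule only counts."""
    return {
        "Name": "managed-common",  # hypothetical name
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": MANAGED_GROUP,
                "RuleActionOverrides": [
                    {"Name": BLOCKING_RULE, "ActionToUse": {"Count": {}}}
                ],
            }
        },
        "OverrideAction": {"None": {}},  # keep the group's own actions otherwise
        "VisibilityConfig": visibility("managed-common"),
    }


def allow_auth_token(priority):
    """Option 2: allow token requests before the managed group is evaluated."""
    return {
        "Name": "allow-iiif-auth-token",  # hypothetical name
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                "SearchString": b"/auth/token",
                "FieldToMatch": {"UriPath": {}},
                "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                "PositionalConstraint": "EXACTLY",
            }
        },
        "Action": {"Allow": {}},  # Allow is terminating: later rules are skipped
        "VisibilityConfig": visibility("allow-iiif-auth-token"),
    }


# WAF evaluates rules in ascending priority order, so the allow rule goes first.
rules = sorted(
    [managed_group_with_override(1), allow_auth_token(0)],
    key=lambda rule: rule["Priority"],
)
```

The trade-off is roughly this: option 1 stops that one rule from blocking on every path, whereas option 2 leaves the rule in force elsewhere but exempts `/auth/token` from all later rules. The allow rule as sketched matches on path alone; it could be narrowed further (for example by also matching the query string) if that exemption is too broad.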

kenoir commented 7 months ago

> Is there any sense in centralising our WAF config somewhere?

Yes, where sensible. I'm tempted to keep as much of it in the platform-infrastructure repository as we can, but I'm mindful that the wellcomecollection.org CloudFront distro (where the other WAF is) is an outlier in that it contains its own distro.

I think we should wait for a third instance of a WAF to materialise before consolidating, though.