alan-turing-institute / data-safe-haven

https://data-safe-haven.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Replace SRE Azure Firewall with cheaper alternative #1626

Open jemrobinson opened 1 year ago

jemrobinson commented 1 year ago

:strawberry: Suggested change

The Azure Firewall is moderately expensive (~£200/month) and we're only using basic filtering capabilities. We should look at alternatives.

:steam_locomotive: How could this be done?

We should look into replacing this with a Squid Proxy - @manics might be able to give some advice on how TREEHOOSE do this.

manics commented 1 year ago

Our Ansible roles for configuring our EC2 proxy VMs are open-source: https://github.com/hic-infra/shared-services-ansible-roles/

We're in the process of open-sourcing the Terraform for deploying the EC2 instances and everything else for our shared services, though at some point we may switch to containerising everything.

jemrobinson commented 6 months ago

Config examples here:

https://blog.thinkbox.dev/posts/0009-domain-filter-with-squid/

https://wiki.squid-cache.org/SquidFaq/SquidAcl

https://xebia.com/blog/how-to-configure-squid-as-an-egress-gateway/

https://jasonpangazure.medium.com/how-to-use-azure-firewall-and-squid-as-virtual-appliance-in-azure-route-table-to-overwrite-debc98b8f0b8

JimMadge commented 6 months ago

That looks very nice.

manics commented 6 months ago

If you're writing your own Ansible role you might be able to copy/fork https://github.com/hic-infra/shared-services-ansible-roles/tree/main/squid

jemrobinson commented 6 months ago

It looks like getting Squid to work with HTTPS is complicated (see e.g. https://dev.to/suntong/squid-proxy-and-ssl-interception-1oa4) and is likely to involve installing a self-signed certificate on all resources that need to make HTTPS connections. I don't think this will work with calls from Dockerised services.

JimMadge commented 6 months ago

HTTPS in general is tricky. Can't use the simple cert verification challenges if you have no internet access. Can't reach cert authorities to check certs with no internet access.

HTTPS inside would be nice to have. I don't have much experience with self-signed certs. I think a lot of programs reject or raise warnings about them. It might be difficult to add and trust the certs everywhere they are needed.

jemrobinson commented 6 months ago

This is less about accessing e.g. https://gitea.<my sre>.com inside the environment but more about accessing https://login.microsoftonline.com for user authentication.

JimMadge commented 6 months ago

In both cases, don't you get the problem that the Squid server doesn't have a valid cert for the domain you are requesting? It would look like a MITM attack.

jemrobinson commented 6 months ago

I mean, it is a MITM attack. The proxy is essentially unwrapping an HTTPS request to find its destination, deciding whether or not to forward it on, making a new request, getting the result of that request and sending it back to the original client.
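To make the "deciding whether or not to forward it on" step concrete, here is a minimal Python sketch of a domain allowlist check. It is only an illustration, not how Squid or Azure Firewall implement it: the `ALLOWED_DOMAINS` set and the `is_allowed` function are hypothetical names, and the rule that subdomains of an allowed domain are also allowed is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from config.
ALLOWED_DOMAINS = {"login.microsoftonline.com"}


def is_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = urlsplit(url).hostname  # None when the URL has no network location
    if host is None:
        return False
    host = host.lower().rstrip(".")
    # Match on whole labels so "evil-example.com" does not match "example.com".
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Matching on the full hostname suffix rather than a substring matters here: a request to `login.microsoftonline.com.attacker.example` contains the allowed name but should be refused.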

JimMadge commented 6 months ago

But the way we propose it is a friendly attack :smile:.

HSTS might also make it difficult to do in a browser. The browser will know certain sites should always be served over HTTPS.

jemrobinson commented 6 months ago

NB. Azure Firewall does this by resolving FQDNs to a list of IP addresses every 15 seconds (https://learn.microsoft.com/en-us/azure/firewall/fqdn-filtering-network-rules#how-it-works). Could be a way forward if we're happy to write some code to do that?
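A rough Python sketch of what "write some code to do that" could look like: resolve each allowed FQDN to its current IP addresses and hand the result to whatever applies the network rules, repeating every 15 seconds. The function names and the `apply_rules` callback are hypothetical, and this is a sketch under those assumptions rather than a description of how Azure Firewall is implemented.

```python
import socket
import time
from typing import Callable, Dict, Iterable, Set


def resolve_fqdn(fqdn: str) -> Set[str]:
    """Return the IP addresses DNS currently gives for fqdn (empty set on failure)."""
    try:
        infos = socket.getaddrinfo(fqdn, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def build_allowed_ips(
    fqdns: Iterable[str],
    resolver: Callable[[str], Set[str]] = resolve_fqdn,
) -> Dict[str, Set[str]]:
    """Map each FQDN to the set of IPs it resolves to right now."""
    return {fqdn: resolver(fqdn) for fqdn in fqdns}


def refresh_forever(
    fqdns: Iterable[str],
    apply_rules: Callable[[Dict[str, Set[str]]], None],
    interval_seconds: float = 15.0,
) -> None:
    """Re-resolve the FQDNs on a fixed interval and pass the result to apply_rules.

    apply_rules is a hypothetical callback that would update the actual
    network rules (for example an NSG or iptables rule set).
    """
    fqdn_list = list(fqdns)
    while True:
        apply_rules(build_allowed_ips(fqdn_list))
        time.sleep(interval_seconds)
```

Calling `refresh_forever(["login.microsoftonline.com"], print)` would print the resolved addresses every 15 seconds. One caveat with this approach is that the client and the resolver loop may see different DNS answers for the same name, so a connection can be refused until the next refresh.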