ukwa / ukwa-services

Deployment configuration for all UKWA services stacks.
Apache License 2.0

Make it clearer where in a chain of proxies things are going wrong. #100

Closed (anjackson closed 1 year ago)

anjackson commented 1 year ago

As in ukwa/ukwa-pywb#94, it can be hard to tell where in the proxy chain things are going wrong. It would be good to know which proxy is throwing an error.

Adding custom 50x pages for each layer (with the layer named in the footer/signature) sounded like a good idea, but in practice the proxy closest to the client would replace the earlier response information.

A more useful approach would be to add a header at each layer. When a proxy cannot handle a problem upstream and generates its own error response, the headers from the deeper layers should disappear.

This page suggests adding a header that records the upstream, e.g.

add_header  X-Upstream  $upstream_addr;

which will add the header when the response is not an error, or

add_header  X-Upstream  $upstream_addr always;

which always adds the header.

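Perhaps just adding any identifier this way will work, since by default add_header only applies to a fixed set of success and redirect status codes (200, 201, 204, 206, 301, 302, 303, 304, 307, 308) and not to error responses. Therefore, if we add something like one of...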

add_header 'Via' 'api.bl.uk';
add_header 'Via' $hostname;

That should make it clear, e.g. if we are www > api > stack and the stack NGINX works fine but api fails, then there will be a single Via: stack header. If www fails, there will be two Via headers.

anjackson commented 1 year ago

For example, after adding $hostname to the website stack NGINX and setting up a templated hostname, I get:

Via: access_website_nginx-1

So, this seems like a good thing to roll out, as it will make debugging easier. Can use ukwa/ukwa-pywb#94 as a test case.
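
When checking a response by hand, it helps to pull out just the Via values in the order they were added. The following is a minimal sketch, not part of the deployed configuration: it assumes the headers arrive as (name, value) pairs, such as those returned by Python's `http.client.HTTPResponse.getheaders()`, and the function name `via_chain` is hypothetical.

```python
def via_chain(header_pairs):
    """Return the Via values of a response, in the order they appear."""
    chain = []
    for name, value in header_pairs:
        # Header names are case-insensitive.
        if name.lower() != "via":
            continue
        # One Via header line may carry several comma-separated entries.
        for entry in value.split(","):
            entry = entry.strip()
            if entry:
                chain.append(entry)
    return chain


# Illustrative headers only; the hostnames are the ones seen in this thread.
headers = [
    ("Content-Type", "text/html"),
    ("Via", "access_website_nginx-1"),
    ("via", "bapi-lb2.n45.wa.bl.uk, wa-www-beta"),
]
print(via_chain(headers))
# ['access_website_nginx-1', 'bapi-lb2.n45.wa.bl.uk', 'wa-www-beta']
```

Splitting on commas covers the case where an intermediary merges repeated Via lines into one header.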

anjackson commented 1 year ago

Okay, added to wa-www-beta and bapi-lb1/2 (although forgot to commit to GitLab so changes were not getting propagated to bapi-lb2 until I fixed that). The Via headers now show both steps. When rolling out the updated website, I can check the updates.

anjackson commented 1 year ago

Updating Beta without deploying the fixed ukwa-pywb...

Via: access_website_nginx-1
Via: bapi-lb2.n45.wa.bl.uk
Via: wa-www-beta

So we can see the error going down the stack.

anjackson commented 1 year ago

So, now updating BETA so the UWSGI fix is there... Yep, now talking directly to the access stack NGINX works, but at the front we see:

Via: bapi-lb2.n45.wa.bl.uk
Via: wa-www-beta

So we know lb2 had an issue...
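
The reading applied above can be written down as a small helper: the innermost layer whose Via header is still present is taken to be the one that produced the response. This is only a sketch under assumptions. It assumes each layer adds its Via header even to error responses it generates itself (which the thread does not state explicitly), and the names `deepest_layer_seen` and `BETA_CHAIN` are hypothetical.

```python
def deepest_layer_seen(expected_chain, observed_via):
    """Return the innermost expected layer whose Via value was observed.

    expected_chain lists the layers from the innermost proxy outwards.
    Returns None if none of the expected layers appear.
    """
    seen = set(observed_via)
    for layer in expected_chain:
        if layer in seen:
            return layer
    return None


# The Beta chain as it appears in this thread, innermost first.
BETA_CHAIN = ["access_website_nginx-1", "bapi-lb2.n45.wa.bl.uk", "wa-www-beta"]

print(deepest_layer_seen(BETA_CHAIN, ["bapi-lb2.n45.wa.bl.uk", "wa-www-beta"]))
# bapi-lb2.n45.wa.bl.uk
```

If the innermost layer is returned, every proxy passed the response through, so any error came from behind the chain.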

anjackson commented 1 year ago

Updated buffer size on bapi and now directly accessing that works, and from the front we get just:

Via: wa-www-beta

anjackson commented 1 year ago

Finally, updating proxy_buffer_size 256k; on wa-www-beta works, showing the full chain:

Via: access_website_nginx-1
Via: bapi-lb2.n45.wa.bl.uk
Via: wa-www-beta

anjackson commented 1 year ago

Now seeing

Via: access_website_nginx-1
Via: api-lb1.n45.wa.bl.uk
Via: wa-www.n45.bl.uk

on PROD, so we're good.