patrick-ogrady closed this issue 1 year ago
We also need to add this support to the GET calls. Load balancers typically just check for a 200 response, so JSON-RPC doesn't work well for them, which is why we added the special GET handling.
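To illustrate why a status-code-only probe needs the GET handling, here is a minimal sketch in Python. The function names, the 503 status for an unhealthy node, and the shape of the JSON-RPC body are assumptions made for illustration, not AvalancheGo's actual implementation.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def get_probe(healthy: bool) -> int:
    """Hypothetical GET handler: health is encoded in the status code itself.

    Assumption: an unhealthy node answers 503 Service Unavailable.
    """
    return int(HTTPStatus.OK if healthy else HTTPStatus.SERVICE_UNAVAILABLE)


def jsonrpc_probe(healthy: bool) -> Tuple[int, Dict]:
    """Hypothetical JSON-RPC handler: the call itself succeeds, so the status
    is 200 either way and the health result lives in the response body."""
    return int(HTTPStatus.OK), {"result": {"healthy": healthy}}


def load_balancer_accepts(status_code: int) -> bool:
    """A load balancer that only inspects the status code, never the body."""
    return status_code == HTTPStatus.OK
```

With these definitions, `load_balancer_accepts(get_probe(False))` is false, while `load_balancer_accepts(jsonrpc_probe(False)[0])` is true: the JSON-RPC probe keeps an unhealthy node in rotation unless the balancer also parses the body.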
Need to add docs for that, then I will close.
Added to docs: https://docs.avax.network/apis/avalanchego/apis/health#filtering
There is still a pending PR that would filter min connected health checks with subnetIDs: #1358
Although one Subnet on an AvalancheGo node may be unhealthy, operators may still wish to interact with other Subnets running on it. AvalancheGo's existing health check, however, returns unhealthy if any Subnet is unhealthy. During this incident, this behavior caused an outage in Subnet APIs even though most Subnets were able to serve queries: API providers stopped a node from serving queries whenever this "global" check failed, as that was the only mechanism they had to gauge the health of the underlying node.
We should add a new health check or add an argument to the existing check (https://docs.avax.network/apis/avalanchego/apis/health#healthhealth) that checks the health of only a specific Subnet. This will allow API providers to serve queries to any subset of healthy Subnets on a node.
I don't think we should remove the "global" health check in this change, since it is still useful for getting a "full sense" of a node's status.
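The proposal above could be sketched as follows. This is a minimal illustration in Python, not AvalancheGo's code: the `Check` type, its `tags` field, and `is_healthy` are hypothetical names, and the sketch assumes each check is tagged with the Subnet IDs it concerns while untagged checks are node-wide and always count.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Check:
    """One health check result (hypothetical type for illustration)."""
    name: str
    healthy: bool
    # Subnet IDs this check applies to; empty means the check is node-wide.
    tags: FrozenSet[str] = frozenset()


def is_healthy(checks: Iterable[Check],
               subnet_ids: Optional[Iterable[str]] = None) -> bool:
    """Return True if every relevant check passes.

    With subnet_ids=None this is the "global" check: any failing check makes
    the node unhealthy. With subnet_ids given, only node-wide checks and
    checks tagged with one of those Subnets are considered.
    """
    checks = list(checks)
    if subnet_ids is None:
        return all(c.healthy for c in checks)
    wanted = set(subnet_ids)
    return all(c.healthy for c in checks if not c.tags or c.tags & wanted)
```

For example, given a passing node-wide check, a passing check tagged `subnet-a`, and a failing check tagged `subnet-b`, `is_healthy(checks)` is false, `is_healthy(checks, ["subnet-a"])` is true, and `is_healthy(checks, ["subnet-b"])` is false. An API provider could keep routing `subnet-a` traffic to the node while the "global" result remains available for a full view of its status.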