Closed jgreat closed 1 year ago
That error is for view, but same issue, here's a similar error for ledger:
2023-03-23 21:46:07.754230192 UTC ERRO Fog Ledger retrieving BlockResponse from missed block ranges error: grpcio error (fog-ledger://fog.test.mobilecoin.com:443/): RpcFailure: 14-UNAVAILABLE Received http2 header with status: 503, mc.app: test_client, mc.module: mc_fog_sample_paykit::cached_tx_data, mc.src: fog/sample-paykit/src/cached_tx_data/mod.rs:487
I am looking at this now, but I'm confused what I'm supposed to do.
For all of these connection types, when grpc gives us an rpc failure, we should see that here:
That returns from AttestedConnection::attested_call
here: https://github.com/mobilecoinfoundation/mobilecoin/blob/e7e1c59fac408a36675d268094e85e0d03ad78a0/connection/src/traits.rs#L61
which should call self.deattest()
which was defined here:
So why are the cookies not getting reset? What am I missing?
I guess we can add logging when deattest is called
James gave some good feedback in core-eng: It's possible that the cookies get set but the attest_cipher does not, and if we get in such a bad state, we will never clear cookies.
We also think we could add unit tests in the fog-enclave-connection
crates since it's already generic over the grpc part
This may have been related to k8s service/routing changes. Where we inadvertently were routing to fog-view/ledger instances that were not ready. We're past the change where that routing problem will be an issue. I'm going to close this issue and will open a new one if we have more trouble or have more data.
I found that fog-test-client in canary mode seems to keep trying to connect to the same ledger endpoint even though it received a 503 error from the ingress controller.
I think the proper behavior should be to clear all cookies on an error response.
This is also a bug in the ingress controller configuration, it should re-route to an available ledger server when the sticky server isn't available any longer.