mobilecoinfoundation / mobilecoin

Private payments for mobile devices.
Other
1.16k stars 148 forks source link

fog-test-client - clear sticky cookies on error response. #3280

Closed jgreat closed 1 year ago

jgreat commented 1 year ago

I found that fog-test-client in canary mode seems to keep trying to connect to the same ledger endpoint even though it received a 503 error from the ingress controller.

2023-03-23 21:47:07.855452859 UTC ERRO Fog view protocol error: Connection error: Fog view connection error (fog-view://fog.test.mobilecoin.com:443/): gRPC Error: RpcFailure: 14-UNAVAILABLE Received http2 header with status: 503, mc.app: test_client, mc.module: mc_fog_sample_paykit::cached_tx_data, mc.src: fog/sample-paykit/src/cached_tx_data/mod.rs:450

I think the proper behavior should be to clear all cookies on an error response.

This is also a bug in the ingress controller configuration, it should re-route to an available ledger server when the sticky server isn't available any longer.

jgreat commented 1 year ago

That error is for view, but same issue, here's a similar error for ledger:

2023-03-23 21:46:07.754230192 UTC ERRO Fog Ledger retrieving BlockResponse from missed block ranges error: grpcio error (fog-ledger://fog.test.mobilecoin.com:443/): RpcFailure: 14-UNAVAILABLE Received http2 header with status: 503, mc.app: test_client, mc.module: mc_fog_sample_paykit::cached_tx_data, mc.src: fog/sample-paykit/src/cached_tx_data/mod.rs:487
cbeck88 commented 1 year ago

I am looking at this now, but I'm confused what I'm supposed to do.

For all of these connection types, when grpc gives us an rpc failure, we should see that here:

https://github.com/mobilecoinfoundation/mobilecoin/blob/e7e1c59fac408a36675d268094e85e0d03ad78a0/fog/enclave_connection/src/lib.rs#L223

That returns from AttestedConnection::attested_call here: https://github.com/mobilecoinfoundation/mobilecoin/blob/e7e1c59fac408a36675d268094e85e0d03ad78a0/connection/src/traits.rs#L61

which should call self.deattest()

which was defined here:

https://github.com/mobilecoinfoundation/mobilecoin/blob/e7e1c59fac408a36675d268094e85e0d03ad78a0/fog/enclave_connection/src/lib.rs#L137

So why are the cookies not getting reset? What am I missing?

cbeck88 commented 1 year ago

I guess we can add logging when deattest is called

cbeck88 commented 1 year ago

James gave some good feedback in core-eng: It's possible that the cookies get set but the attest_cipher does not, and if we get in such a bad state, we will never clear cookies.

We also think we could add unit tests in the fog-enclave-connection crates since it's already generic over the grpc part

jgreat commented 1 year ago

This may have been related to k8s service/routing changes. Where we inadvertently were routing to fog-view/ledger instances that were not ready. We're past the change where that routing problem will be an issue. I'm going to close this issue and will open a new one if we have more trouble or have more data.