edgelesssys / contrast

Deploy and manage confidential containers on Kubernetes
https://docs.edgeless.systems/contrast
GNU Affero General Public License v3.0
160 stars 6 forks source link

coordinator: uniform gRPC metric prefix #583

Closed burgerdev closed 1 week ago

burgerdev commented 2 weeks ago

During a refactoring, we discovered that the gRPC metrics include the service in both the metric name and a label:

contrast_userapi_grpc_server_handling_seconds_bucket{grpc_method="GetManifests",grpc_service="userapi.UserAPI",grpc_type="unary",le="0.01"} 1

Since it's both more consistent to report all gRPC methods under the same metric, and easier to maintain the gRPC metrics when there is only one instance, we decided to move all of them under a common prefix, contrast_.

Future list of metrics contrast_attestation_failures 8 contrast_coordinator_manifest_generation 1 contrast_grpc_server_handled_total contrast_grpc_server_handling_seconds_bucket contrast_grpc_server_handling_seconds_count contrast_grpc_server_handling_seconds_sum contrast_grpc_server_msg_received_total contrast_grpc_server_msg_sent_total contrast_grpc_server_started_total

Below is the original PR description with the motivation for refactoring userapi etc.


This PR moves the gRPC handlers for userapi and recoveryapi into the authority module. In the main module remain gRPC server setup (credentials, metrics, listener).

I'm mostly driven by the increasing complexity of changing code in both authority and userapi. We had to change the Authority API a couple of times, and adjusting the userapi test accordingly is cumbersome. Given that the Authority API is closely coupled to the gRPC API, it is probably for the best to merge the two and remove the current Authority API.

To keep things simple, I'm refactoring in several steps. This is the first step, which does the bare minimum to keep the boat afloat after the move. There are still some rough edges (I left a couple of TODOs), but it already allows to resolve this and that TODO, which are blocking RFC004 - the former here and the latter in a PoC commit [here].

Next, I need to consolidate the former Authority API with the userapi, which will mostly involve moving code around, and adapt the tests. Fortunately, these changes can now be interleaved with the recovery effort.

Review Recommendation

I split the PR into

It is probably easier to restrict the Changes from in the review tab to the second commit, so that you only see the actual changes. Otherwise, git does not consider the files related and the diffs are the entire files.

TODO:

github-actions[bot] commented 1 week ago

PR Preview Action v1.4.7 :---: Preview removed because the pull request was closed. 2024-06-19 06:23 UTC