dfinity / interface-spec

IC Interface Specification
https://khsfq-wqaaa-aaaak-qckvq-cai.icp0.io/docs
37 stars 20 forks source link

feat: Add metrics about subnet usage on the state tree #191

Closed dsarlis closed 1 year ago

dsarlis commented 1 year ago

Motivation

The Internet Computer dashboard exposes various metrics that it retrieves from internal prometheus endpoints, including but not limited to: number of canisters on a subnet, memory used on a subnet, cycles burned, transactions made, etc.

The information shown in the dashboard cannot be verified by 3rd parties as it requires access to internally available only endpoints. As such, people must trust DFINITY who is building the dashboard that they provide correct information. We would like to expose some of these key metrics that show case the usage and capabilities of the IC in a way that is verifiable from external 3rd parties to foster a much more trustworthy ecosystem.

Proposal

The proposed solution is to expose these metrics through the system state tree. The gist of the proposal is to add a path such as:

/subnet/<subnet_id>/metrics

which would return a blob of encoded data capturing the metrics that we would like to expose (could also consider a path per metric but might come with downsides, see my inline comment). This path would be exposed through a new api/v2/subnet/<subnet_id>/read_state endpoint. Having this endpoint allows us to move all other subtree paths under /subnet to be served through this endpoint as well as the /time path.

netlify[bot] commented 1 year ago

Deploy Preview for ic-interface-spec ready!

Name Link
Latest commit 41ae5bf1ea8bef6079332bda0fedf8b6786573ab
Latest deploy log https://app.netlify.com/sites/ic-interface-spec/deploys/64bfe0d591713000082d8ef4
Deploy Preview https://deploy-preview-191--ic-interface-spec.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

dsarlis commented 1 year ago

This is modelled a lot after https://github.com/dfinity/interface-spec/pull/94. Main difference is that I'm not changing anything about the provisional apis -- does not seem necessary for the metrics case and seemed to be the most contentious point back then.

Dfinity-Bjoern commented 1 year ago

We are discussing in the interface spec meeting: Why not create a new API path /api/v2/subnet/<principal>/read_state?

dsarlis commented 1 year ago

We are discussing in the interface spec meeting: Why not create a new API path /api/v2/subnet/<principal>/read_state?

Why would we? :P It's still a read from the system state tree so it seems natural to me to use the existing API path. That said, I don't feel strongly either way. I suppose if we go with a new API path, then the discussion about effective principal id is moot because presumably the principal is enough to route the request correctly?

dsarlis commented 1 year ago

We are discussing in the interface spec meeting: Why not create a new API path /api/v2/subnet/<principal>/read_state?

Why would we? :P It's still a read from the system state tree so it seems natural to me to use the existing API path. That said, I don't feel strongly either way. I suppose if we go with a new API path, then the discussion about effective principal id is moot because presumably the principal is enough to route the request correctly?

Actually I think it makes more sense this way given the existing API path is api/v2/canister/..., I'll switch it up.

dsarlis commented 1 year ago

@Dfinity-Bjoern @mraszyk I'm re-working the PR to incorporate the new API path /api/v2/subnet/<principal>/read_state. If we go ahead with this, I believe it makes sense to change the state tree paths used to retrieve other subnet specific information. In particular, I'd suggest we change the following

`/subnet/<subnet_id>/public_key`
`/subnet/<subnet_id>/canister_ranges`
`/subnet/<subnet_id>/metrics`

to just

`/public_key`
`/canister_ranges`
`/metrics`

when accessed through /api/v2/subnet/<principal>/read_state and expect that this would be the new approach long term. The reason being that it's redundant if we have the api path contain the subnet id already. The old paths can remain available under the old http api path as well for a while until clients migrate and we can eventually remove them and have them only be accessible through the new api path. Wdyt?

mraszyk commented 1 year ago

@Dfinity-Bjoern @mraszyk I'm re-working the PR to incorporate the new API path /api/v2/subnet/<principal>/read_state. If we go ahead with this, I believe it makes sense to change the state tree paths used to retrieve other subnet specific information. In particular, I'd suggest we change the following

`/subnet/<subnet_id>/public_key`
`/subnet/<subnet_id>/canister_ranges`
`/subnet/<subnet_id>/metrics`

to just

`/public_key`
`/canister_ranges`
`/metrics`

when accessed through /api/v2/subnet/<principal>/read_state and expect that this would be the new approach long term. The reason being that it's redundant if we have the api path contain the subnet id already. The old paths can remain available under the old http api path as well for a while until clients migrate and we can eventually remove them and have them only be accessible through the new api path. Wdyt?

I don't think it'd work unless the subnet signs another certified tree where the leaves are just at the root. Moreover, we should make it possible to browse all subnets (the entire /subnet subtree) without querying the NNS registry canister.

What makes sense (but it's not backward-compatible) is to move the entire /subnet to the new API (with /time being available in both and always included).

CC @derlerd-dfinity

dsarlis commented 1 year ago

I don't think it'd work unless the subnet signs another certified tree where the leaves are just at the root. Moreover, we should make it possible to browse all subnets (the entire /subnet subtree) without querying the NNS registry canister.

Hmm, yeah, good point.

What makes sense (but it's not backward-compatible) is to move the entire /subnet to the new API (with /time being available in both and always included).

Why can't we serve the entire /subnet through both APIs for a while to allow transition and then stop serving from the old one at some point?

mraszyk commented 1 year ago

Why can't we serve the entire /subnet through both APIs for a while to allow transition and then stop serving from the old one at some point?

Sure, we can do that (and maybe we just do that indefinitely).

netlify[bot] commented 1 year ago

Deploy Preview for ic-interface-spec ready!

Name Link
Latest commit 68d301ab313b0a200800709736cfa441e1d6e4b6
Latest deploy log https://app.netlify.com/sites/ic-interface-spec/deploys/64db36b31e18cd00087e82c9
Deploy Preview https://deploy-preview-191--ic-interface-spec.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

github-actions[bot] commented 1 year ago

🤖 Here's your preview: https://pjpim-riaaa-aaaak-qckja-cai.icp0.io/docs