Agoric / agoric-sdk

monorepo for the Agoric Javascript smart contract platform
Apache License 2.0

Telemetry for Zoe / ERTP #4586

Open Tartuffo opened 2 years ago

Tartuffo commented 2 years ago

Per our Observability meeting on Feb 17, we had some areas of Zoe / ERTP to instrument. At the May 4 meeting we revised this one to the following (separate from https://github.com/Agoric/agoric-sdk/issues/4585).

Data consumption use case is devops: making sure things are working as expected. Consumers are AgoricOpCo and validators.

This ticket isn't per-node telemetry. It's for what results from consensus. But TBD whether that result goes back onto chain or some new archival log.

We now log to a Prometheus logger in SwingSet, using gauges and counters. We currently then send the data up to GCP StackDriver for monitoring.

Plan:

Gauges and counters

Global

Per contract

Per issuer
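
As a rough illustration of the plan above, here is a minimal sketch of gauges and counters keyed by the three scopes (global, per contract, per issuer). Everything in it is an assumption made for illustration: `Scope`, `MetricsRegistry`, `keyOf`, and the metric names are hypothetical and are not taken from SwingSet, Zoe, or ERTP, and it does not show how values would reach Prometheus.

```typescript
// Hypothetical sketch: none of these names come from agoric-sdk.
type Scope =
  | { kind: 'global' }
  | { kind: 'contract'; installation: string }
  | { kind: 'issuer'; brand: string };

// Build a stable key such as "openSeats{contract=exampleContract}".
const keyOf = (name: string, scope: Scope): string => {
  if (scope.kind === 'contract') {
    return `${name}{contract=${scope.installation}}`;
  }
  if (scope.kind === 'issuer') {
    return `${name}{issuer=${scope.brand}}`;
  }
  return `${name}{global}`;
};

class MetricsRegistry {
  private readonly counters: Record<string, number> = {};
  private readonly gauges: Record<string, number> = {};

  // Counters only ever increase (e.g. total invitations made).
  incCounter(name: string, scope: Scope, by: number = 1): void {
    if (by < 0) {
      throw new RangeError('counters cannot decrease');
    }
    const key = keyOf(name, scope);
    this.counters[key] = (this.counters[key] || 0) + by;
  }

  // Gauges track a current level that can rise or fall (e.g. open seats).
  adjustGauge(name: string, scope: Scope, delta: number): void {
    const key = keyOf(name, scope);
    this.gauges[key] = (this.gauges[key] || 0) + delta;
  }

  readCounter(name: string, scope: Scope): number {
    return this.counters[keyOf(name, scope)] || 0;
  }

  readGauge(name: string, scope: Scope): number {
    return this.gauges[keyOf(name, scope)] || 0;
  }
}

// Usage with made-up metric names, one per scope.
const metrics = new MetricsRegistry();
metrics.incCounter('invitationsMade', { kind: 'global' });
metrics.adjustGauge('openSeats', { kind: 'contract', installation: 'exampleContract' }, 2);
metrics.adjustGauge('openSeats', { kind: 'contract', installation: 'exampleContract' }, -1);
metrics.incCounter('paymentsCreated', { kind: 'issuer', brand: 'exampleBrand' }, 3);
```

Keeping counters monotonic and gauges adjustable mirrors the usual Prometheus distinction between the two metric types; whether the real instrumentation would label by installation, instance, or brand is left open here.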

Tartuffo commented 2 years ago

Hi @turadg - since you are picking up the telemetry, this is probably better for you to take on than @arirubinstein, who doesn't know much about the contract space.

turadg commented 2 years ago

I've taken up on-chain metrics that contracts might read and act on. Some of the bullet points above fit that, such as:

To support that, they have to be emitted using updaters from the Notifier API. However, many of these should not be on-chain. For example,

If you're asking me to take that on, I can do that but it's quite different from the on-chain metrics in https://github.com/Agoric/agoric-sdk/issues/4639

For estimation, is there a mechanism yet for emitting or consuming such data? cc @arirubinstein

Tartuffo commented 2 years ago

@turadg If we take these items out of this ticket (and presumably into a different one to be done by someone else), does it become sensible / tractable?

  1. Are the non-used-up payments getting GCd? Would indicate a (bad) bug.
  2. GC stats generally
  3. How many times has XSNAP organically done its GC

turadg commented 2 years ago

Yes, it does. Though I'd be more inclined to create a new ticket without the "telemetry" term, which I think we agreed would be reserved for off-chain data.

So something like:

Title: On-chain metrics of Zoe and ERTP

Provide new notifiers for Zoe/ERTP state:
- global
- per contract
- per issuer

The state must include counts of:

Global
- contract Installations
- invitations (“kind of” 1:1 with seats) [made? open? closed?]
- seats, reallocations (including number of seats in the reallocation), exits
- all issuers (“should be” small, but NFTs might change that)
- issuers for which there is an escrow purse, with outstanding balances
- allocations per seat (lifetime of residency of the seat)
- ZCF mints (contracts can create new issuers)

Per contract
- instances by contract

Per issuer
- Payments - right now we only clean up payments via GC.  There should be lots of payments with short lifetimes, which might stress GC.
- Purses

Is that right?
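
To make the proposed shape concrete, here is a minimal sketch of how the counts listed above might be typed and published through an updater/notifier pair. It rests on several assumptions: `makeSimpleNotifierKit` is a hypothetical, synchronous stand-in written for this example and is not the `@agoric/notifier` API; the field names in `ZoeErtpMetrics` are invented; and "allocations per seat" is left out because its shape is not settled in the list.

```typescript
// Hypothetical state shape; field names are illustrative only.
interface GlobalCounts {
  installations: number;
  invitations: number;
  seats: number;
  reallocations: number;
  exits: number;
  issuers: number;
  escrowPursesWithBalance: number;
  zcfMints: number;
}

interface ZoeErtpMetrics {
  global: GlobalCounts;
  instancesByContract: Record<string, number>;
  perIssuer: Record<string, { payments: number; purses: number }>;
}

interface Update<T> {
  value: T;
  updateCount: number;
}

// Simplified stand-in for an updater/notifier pair (assumption, not the real API).
function makeSimpleNotifierKit<T>(initial: T) {
  let current: Update<T> = { value: initial, updateCount: 1 };
  const listeners: Array<(update: Update<T>) => void> = [];
  const updater = {
    // Publish a new state and tell every subscriber about it.
    updateState(value: T): void {
      current = { value, updateCount: current.updateCount + 1 };
      listeners.forEach(listener => listener(current));
    },
  };
  const notifier = {
    getCurrent: (): Update<T> => current,
    subscribe(listener: (update: Update<T>) => void): void {
      listeners.push(listener);
    },
  };
  return { updater, notifier };
}

const emptyMetrics = (): ZoeErtpMetrics => ({
  global: {
    installations: 0,
    invitations: 0,
    seats: 0,
    reallocations: 0,
    exits: 0,
    issuers: 0,
    escrowPursesWithBalance: 0,
    zcfMints: 0,
  },
  instancesByContract: {},
  perIssuer: {},
});

// Return a new state with one more instance counted for `contract`.
const recordInstance = (state: ZoeErtpMetrics, contract: string): ZoeErtpMetrics => ({
  ...state,
  instancesByContract: {
    ...state.instancesByContract,
    [contract]: (state.instancesByContract[contract] || 0) + 1,
  },
});

// Usage: publish an updated state each time an instance is started.
const { updater, notifier } = makeSimpleNotifierKit<ZoeErtpMetrics>(emptyMetrics());
updater.updateState(recordInstance(notifier.getCurrent().value, 'exampleContract'));
```

Publishing a fresh immutable state object on each change keeps readers from observing partial updates, at the cost of copying; a real design might instead publish separate notifiers for the global, per-contract, and per-issuer scopes.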

Before implementing I'd like:

turadg commented 2 years ago

Discussion above resolved in a meeting and captured in updated issue body.