interledger / rafiki

An open-source, comprehensive Interledger service for wallet providers, enabling them to provide Interledger functionality to their users.
https://rafiki.dev/
Apache License 2.0

Track count of prepare, fulfill, and reject packets #2734

Closed mkurapov closed 1 month ago

mkurapov commented 4 months ago

As part of Telemetry V2, we want to add metrics for the count and amount of prepare, fulfill, and reject packets.

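These new metrics are:

  - packet_count_prepare
  - packet_count_fulfill
  - packet_count_reject
  - packet_amount_fulfill (treat the amount as we do now: we need to convert it to a base currency, see the current collectTelemetryAmount function for this)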

All of these metrics will be collected at the "receiving" connector, meaning a count is updated only after the packet has "reached" the connector that receives that particular type of packet. We should track all packets (we shouldn't, and really can't, differentiate between packets for quoting and actual transfers).

Example 1:

image

After this, the counts would look like:

packet_count_prepare = 2
packet_count_fulfill = 2
packet_amount_fulfill = amount of fulfill 1 + amount of fulfill 2

Example 2:

image

After this, the counts would look like:

packet_count_prepare = 2
packet_count_fulfill = 1
packet_count_reject = 1
packet_amount_fulfill = amount of fulfill
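
The counts in the two examples can be reproduced with a small tally. This is only a sketch: the metric names come from the issue, while the types, the `tallyReceivedPackets` function, and the amounts (100 and 50) are hypothetical, and amounts are assumed to be already converted to the base currency.

```typescript
// Hypothetical in-memory tally; not Rafiki's telemetry API.
type PacketType = 'prepare' | 'fulfill' | 'reject'

interface ReceivedPacket {
  type: PacketType
  // Assumed to be already converted to the base currency; only read for fulfills.
  amount?: number
}

interface PacketMetrics {
  packet_count_prepare: number
  packet_count_fulfill: number
  packet_count_reject: number
  packet_amount_fulfill: number
}

// Update the matching counter for every packet that reaches a connector.
function tallyReceivedPackets(packets: ReceivedPacket[]): PacketMetrics {
  const metrics: PacketMetrics = {
    packet_count_prepare: 0,
    packet_count_fulfill: 0,
    packet_count_reject: 0,
    packet_amount_fulfill: 0
  }
  for (const packet of packets) {
    if (packet.type === 'prepare') {
      metrics.packet_count_prepare += 1
    } else if (packet.type === 'fulfill') {
      metrics.packet_count_fulfill += 1
      metrics.packet_amount_fulfill += packet.amount ?? 0
    } else {
      metrics.packet_count_reject += 1
    }
  }
  return metrics
}

// Example 1: two prepares and two fulfills are received.
const example1 = tallyReceivedPackets([
  { type: 'prepare' },
  { type: 'prepare' },
  { type: 'fulfill', amount: 100 },
  { type: 'fulfill', amount: 50 }
])

// Example 2: two prepares, one fulfill, and one reject are received.
const example2 = tallyReceivedPackets([
  { type: 'prepare' },
  { type: 'prepare' },
  { type: 'fulfill', amount: 100 },
  { type: 'reject' }
])

console.log(example1, example2)
```

Quote and transfer packets go through the same tally, since they are not distinguished.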

JoblersTune commented 2 months ago

@mkurapov

  1. I believe we need to collect the number of prepare packets as they arrive at a node, and the number of fulfill and reject responses that same node sends back. I.e. I don't think we want to count the fulfill or reject responses received at a node; we want to count the fulfill or reject responses that a node sends. The reason is that, in your example above, if A keeps sending multiple requests to C but only C has telemetry enabled, we're going to collect high numbers of prepares but absolutely no fulfill or reject packet counts.
  2. In the scheme above, where we collect incoming prepares and outgoing fulfills/rejects, the collectTelemetryAmount metric is already correctly positioned, so that it aligns with all occasions where we send a fulfill response. So for this bullet point: "packet_amount_fulfill (treat the amount as we do now: we need to convert it to a base currency, see the current collectTelemetryAmount function for this)", are you just asking me to change the name of the metric to packet_amount_fulfill? Because all of this logic is already at the packet level.

mkurapov commented 2 months ago

@JoblersTune

  1. I think that makes sense: keeping the sending and receiving of packets for telemetry "self-contained" within a node/connector. One of the metrics we wanted to track was network loss, meaning we could calculate how many packets were lost by doing Number of packets - (2 * (rejected + fulfilled)) = "network loss". As long as we still increase the fulfill/reject counts after the HTTP request (in general, at the end of the middleware chain), we should still get the same behaviour.

  2. Yes, I think it makes sense to just rename the metric. The tricky part right now is figuring out its placement. Currently, I think the metric is placed so that it is collected twice during a fulfill packet: once in the receiving connector and once in the sending connector.
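
A rough sketch of the placement discussed above: count the prepare as it arrives, run the rest of the chain, and only then count the fulfill or reject that the node sends back. All names here (`PacketContext`, `createPacketCountMiddleware`, `networkLoss`) are hypothetical and not Rafiki's actual middleware; the chain is simplified to synchronous calls for brevity, whereas a real middleware would presumably be async. `networkLoss` transcribes the formula exactly as quoted.

```typescript
// Hypothetical types; the real connector context is assumed to be richer.
type Reply = { type: 'fulfill' | 'reject' }

interface PacketContext {
  response?: Reply
}

type Next = () => void
type Middleware = (ctx: PacketContext, next: Next) => void

interface NodeCounters {
  prepare: number
  fulfill: number
  reject: number
}

// Counts the incoming prepare first, then the outgoing reply once the
// rest of the chain (including the outgoing HTTP request) has finished.
function createPacketCountMiddleware(counters: NodeCounters): Middleware {
  return (ctx, next) => {
    counters.prepare += 1
    next()
    if (ctx.response?.type === 'fulfill') {
      counters.fulfill += 1
    } else if (ctx.response?.type === 'reject') {
      counters.reject += 1
    }
  }
}

// The formula as quoted: Number of packets - (2 * (rejected + fulfilled)).
function networkLoss(totalPackets: number, rejected: number, fulfilled: number): number {
  return totalPackets - 2 * (rejected + fulfilled)
}

const counters: NodeCounters = { prepare: 0, fulfill: 0, reject: 0 }
const countPackets = createPacketCountMiddleware(counters)

// One prepare that is fulfilled, one that is rejected, one with no reply.
const fulfilledCtx: PacketContext = {}
countPackets(fulfilledCtx, () => {
  fulfilledCtx.response = { type: 'fulfill' }
})

const rejectedCtx: PacketContext = {}
countPackets(rejectedCtx, () => {
  rejectedCtx.response = { type: 'reject' }
})

const unansweredCtx: PacketContext = {}
countPackets(unansweredCtx, () => {})

console.log(counters)
```

Because each counter is incremented exactly once per pass through the chain, placing the amount metric in the same spot would avoid collecting it in both the receiving and the sending connector.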

mkurapov commented 2 months ago

@JoblersTune

Just wanted to add: when you are adding packet_amount_fulfill, make sure to also add source as an attribute on the counter metric (as we aren't currently doing this for transactions_amount).
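
As an illustration of what a `source` attribute buys, here is a minimal in-memory counter that keeps a separate total per attribute set. It mirrors the `add(value, attributes)` shape of OpenTelemetry counters, but `InMemoryCounter` itself and the source values `wallet-a` and `wallet-b` are hypothetical; the issue does not say what values `source` takes.

```typescript
type Attributes = Record<string, string>

// Hypothetical stand-in for a telemetry counter; not a real library class.
class InMemoryCounter {
  name: string
  totals = new Map<string, number>()

  constructor(name: string) {
    this.name = name
  }

  // Build a stable key so attribute order does not matter.
  static key(attributes: Attributes): string {
    const entries = Object.entries(attributes).sort(([a], [b]) => a.localeCompare(b))
    return JSON.stringify(entries)
  }

  // Add a value to the total kept for this exact attribute set.
  add(value: number, attributes: Attributes = {}): void {
    const key = InMemoryCounter.key(attributes)
    this.totals.set(key, (this.totals.get(key) ?? 0) + value)
  }

  // Read the total recorded for this exact attribute set.
  total(attributes: Attributes = {}): number {
    return this.totals.get(InMemoryCounter.key(attributes)) ?? 0
  }
}

const packetAmountFulfill = new InMemoryCounter('packet_amount_fulfill')
packetAmountFulfill.add(100, { source: 'wallet-a' })
packetAmountFulfill.add(50, { source: 'wallet-a' })
packetAmountFulfill.add(25, { source: 'wallet-b' })

console.log(packetAmountFulfill.total({ source: 'wallet-a' }))
console.log(packetAmountFulfill.total({ source: 'wallet-b' }))
```

Without the attribute, all three additions would collapse into a single total and the per-source breakdown would be lost.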

JoblersTune commented 2 months ago

Working on capturing these metrics in a more numeric, stat-based dashboard that's easy to read.

image