foundry-rs / foundry

Foundry is a blazing fast, portable and modular toolkit for Ethereum application development written in Rust.
https://getfoundry.sh
Apache License 2.0
8.1k stars, 1.67k forks

feat(forge): Add internal metrics capability #3607

Open lucas-manuel opened 1 year ago

lucas-manuel commented 1 year ago

Component

Forge

Describe the feature you would like

When running invariant tests with an actor-based pattern, there is currently a lack of visibility for:

  1. How many times a function is called
  2. What code paths within the function are reached
  3. Other general metrics (e.g., how many LPs deposit total)

Having a cheatcode that would allow storing this information with arbitrary keys would be useful to be able to render this info in a summary table at the end of a fuzzing campaign.

@gakonst mentioned Prometheus metrics as a good reference for this.

Examples:

vm.counter("numberOfLps", numberOfLps++);

vm.counter(string.concat("loan_", vm.toString(loanNumber), "_payments"), payments[loan]);
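Until a cheatcode like this exists, similar bookkeeping can be approximated with ghost state inside the handler itself. The sketch below is only an illustration under stated assumptions: `LpHandler`, `_count`, and the key names are hypothetical, and `vm.counter` remains a proposed API, not an existing cheatcode.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

// Hypothetical handler: contract, function, and key names are illustrative.
contract LpHandler {
    // Ghost state: counters keyed by an arbitrary string label.
    mapping(string => uint256) public counters;

    // Keys in first-seen order, so a summary can iterate over them later.
    string[] public keys;
    mapping(string => bool) private seen;

    // Increment the counter stored under `key`, registering the key on first use.
    function _count(string memory key) internal {
        if (!seen[key]) {
            seen[key] = true;
            keys.push(key);
        }
        counters[key] += 1;
    }

    function deposit(uint256 amount) external {
        _count("deposit.calls");
        if (amount == 0) {
            // Record which code path was taken inside the function.
            _count("deposit.zeroAmount");
            return;
        }
        // ... call the system under test here ...
        _count("deposit.nonZero");
    }

    function numKeys() external view returns (uint256) {
        return keys.length;
    }
}
```

One caveat of this approach: counters live in contract storage, so any increment made in a handler call that later reverts is rolled back with the rest of that call.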

Additional context

No response

gakonst commented 1 year ago

cc @onbjerg @mattsse I'm still a bit abstract on this but I was thinking of exposing an API via cheatcodes similar to https://docs.rs/metrics/latest/metrics/, and on the CLI we would collect all these metrics and custom log them in a table

gakonst commented 1 year ago

@lucas-manuel can you give an example of what your ideal reporting would look like? a table? something else? maybe there should be plots like this https://docs.rs/tui/latest/tui/widgets/struct.Chart.html to show how values evolve over time (e.g. price as volatility happens)? makes me think that this may allow creating automatic simulation / stress test reports like Gauntlet Network does.

lucas-manuel commented 1 year ago

@gakonst Yeah personally I think the lowest hanging fruit would be to log in a table similar to --gas-report. I'd mainly be using this for debugging purposes so it'd be useful as a numerical log output.

Going forward though for the more sophisticated use cases we discussed it would be interesting to export to some sort of JSON that could be used to generate more visual reporting (could be added to CI as an artifact for example).
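As a rough sketch of the JSON export idea with what forge-std already offers, counters can be serialized with the `vm.serializeUint` and `vm.writeJson` cheatcodes. The contract name, metric names, and output path below are assumptions for illustration, and writing the file requires a matching `fs_permissions` entry in `foundry.toml`.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

import {Test} from "forge-std/Test.sol";

// Hypothetical helper: name, metric keys, and file path are illustrative.
abstract contract MetricsExport is Test {
    // Serialize two example counters into one JSON object and write it to disk.
    function _exportMetrics(uint256 deposits, uint256 withdrawals) internal {
        // Key that identifies the in-memory JSON object being built.
        string memory obj = "metrics";
        vm.serializeUint(obj, "deposits", deposits);
        // The last serialize call returns the complete JSON string.
        string memory json = vm.serializeUint(obj, "withdrawals", withdrawals);
        // Needs write access, e.g. in foundry.toml:
        // fs_permissions = [{ access = "read-write", path = "./" }]
        vm.writeJson(json, "./metrics.json");
    }
}
```

The resulting file could then be uploaded as a CI artifact. Where to call such a helper is still open, since there is no hook that runs once at the end of a campaign.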

gakonst commented 1 year ago

@FrankieIsLost @transmissions11 had some thoughts on this which I'd love if they shared in the thread :)

transmissions11 commented 1 year ago

Great idea to give users programmable insight into their invariant campaigns, been a big advocate of this for a small while now.

IMO giving devs a better understanding of the coverage of their invariant tests and tools to effectively debug them is far more valuable than building smarter fuzzers after a certain point, because once a weakness is identified it's not too hard to guide the fuzzer towards it, as opposed to a genius fuzzer that's a total black box from a dev's perspective, which offers little insight into how secure a piece of code is and whether a dev can be confident in the coverage of the run. Humans and fuzzers should work in tandem!

In terms of actual design:

WDYT?

horsefacts commented 1 year ago

In addition to these more complex metrics, it would be very helpful to see a breakdown of calls/reverts by target contract + selector. For example, something like:

╭───────────────────────────────────────────┬─────────╮─────────╮
│ test/actors/Swapper.sol:Swapper contract  ┆         │         │
╞═══════════════════════════════════════════╪═════════╡═════════╡
│ Function Name                             ┆ calls   │ reverts │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤╌╌╌╌╌╌╌╌╌┤
│ swapGooForGobblers                        ┆ 1024    │ 612     │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤╌╌╌╌╌╌╌╌╌┤
│ swapGobblersForGoo                        ┆ 1024    │ 12      │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤╌╌╌╌╌╌╌╌╌┤
│ swapGobblersForGobblers                   ┆ 1024    │ 64      │
╰───────────────────────────────────────────┴─────────╯─────────╯

This would be very helpful for writing and debugging new actor contracts.
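A per-selector calls/reverts breakdown like this can be approximated today inside a handler, as long as the revert is caught rather than bubbled up (otherwise the counter update is rolled back along with the call). This is a minimal sketch under assumptions: the `ISwapper` interface, its `uint256 amount` parameter, and the `SwapperHandler` name are hypothetical and not taken from the thread.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

// Hypothetical target interface: the real signature is not given in the thread.
interface ISwapper {
    function swapGooForGobblers(uint256 amount) external;
}

// Hypothetical handler that tracks calls and reverts per handler selector.
contract SwapperHandler {
    ISwapper public immutable swapper;

    mapping(bytes4 => uint256) public calls;
    mapping(bytes4 => uint256) public reverts;

    constructor(ISwapper swapper_) {
        swapper = swapper_;
    }

    function swapGooForGobblers(uint256 amount) external {
        // msg.sig is the selector of this handler function.
        calls[msg.sig] += 1;
        // Catch the revert so the handler call itself succeeds
        // and the counters are persisted.
        try swapper.swapGooForGobblers(amount) {
            // Success path: nothing extra to record in this sketch.
        } catch {
            reverts[msg.sig] += 1;
        }
    }
}
```

The trade-off is that swallowing reverts in the handler hides them from the fuzzer's own revert handling, so this pattern suits debugging runs more than strict ones.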

lucas-manuel commented 1 year ago

@transmissions11

Love that Gearbox chart 👀

Yeah I agree that it would be better to design this without the need to persist storage within the contracts anywhere for this purpose. I like the suggestions of incrementCounter and setCounter. I also like the idea of a postCall hook, we've been hacking that together with modifiers inside the actors contracts but would definitely be better to have an official framework for it haha.
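For readers unfamiliar with the modifier workaround, it could look roughly like the following. This is a sketch, not the code from any particular repo: `HookedHandler`, `withHooks`, `_preCall`, and `_postCall` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

// Hypothetical base contract emulating pre/post call hooks with a modifier.
abstract contract HookedHandler {
    mapping(bytes4 => uint256) public calls;

    modifier withHooks() {
        _preCall(msg.sig);
        _;
        // Runs after the function body, including after an early `return`,
        // but not if the body reverts.
        _postCall(msg.sig);
    }

    // Override to add setup logic (e.g. pranking as a random actor).
    function _preCall(bytes4) internal virtual {}

    // Default post hook: count successful calls per selector.
    function _postCall(bytes4 selector) internal virtual {
        calls[selector] += 1;
    }
}

// Example handler using the hooks.
contract DepositHandler is HookedHandler {
    function deposit(uint256 amount) external withHooks {
        if (amount == 0) return; // the post hook still runs after this return
        // ... call the system under test here ...
    }
}
```

Because the hook lives in the handler, it only observes calls routed through that handler, which is part of why a framework-level hook would be preferable.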

What are our next steps here?

I also completely agree with the summary table idea @horsefacts

odyslam commented 1 year ago

Second gakonst for using Prometheus as the engine/standard and then we can either:

PaulRBerg commented 1 year ago

@lucas-manuel Until this issue gets implemented, do you know any way to output the call summary only at the end of the invariant tests run?

I saw that in your invariants-example repo, you have defined this function:

function invariant_call_summary() external view {
    console.log("\nCall Summary\n");
    ///
}

But, if I understand this correctly, this test function will be executed at the end of each run, which will slow down test runs.

Is there any way to output the call summary at the end of the invariant tests run? As far as I know, there is no "after all" hook in Foundry (a function opposite to setUp), but maybe there is another way?

gakonst commented 1 year ago

But, if I understand this correctly, this test function will be executed at the end of each run, which will slow down test runs.

Yeah it'll slow it down, but only a bit I would assume. We should probably add that kind of function invariant_summary() hook as an "after all".

PaulRBerg commented 1 year ago

It depends on how many console.log statements there are .. we have quite a few and the run time is noticeably slower with the invariant summary function defined. Ended up renaming it to start with something other than invariant_ so that we can selectively activate it when we want to get a report.
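An alternative to renaming the function is to gate the logging behind an environment variable read with the `vm.envOr` cheatcode. This is only a sketch of that idea, not the approach described above: the `CALL_SUMMARY` variable, the `CallSummary` contract, and the `depositCalls` counter are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

import {Test} from "forge-std/Test.sol";
import {console} from "forge-std/console.sol";

// Hypothetical mixin for an invariant suite; inherit it from the real test contract.
abstract contract CallSummary is Test {
    // Placeholder for counters that a real handler would expose.
    uint256 internal depositCalls;

    function invariant_call_summary() external {
        // Skip all logging unless CALL_SUMMARY=true is set in the environment.
        if (!vm.envOr("CALL_SUMMARY", false)) return;

        console.log("\nCall Summary\n");
        console.log("deposit calls", depositCalls);
    }
}
```

With this in place, something like `CALL_SUMMARY=true forge test` would print the report, while a plain `forge test` would skip the `console.log` calls. The function is still invoked after each run, so this reduces the logging cost rather than removing the call itself.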

hook as an "after all"

Would be super helpful.

grandizzy commented 5 months ago

considering @transmissions11 comment -

I think a nice and easy way to have such metrics is by using OpenTelemetry (OTLP) open source standard for logs, metrics, and traces as it already provides crates to facilitate such integration, see rust bindings and crates

A big pro of this is that we can easily integrate with forge and support not only Prometheus but other tools at the same time by

Didn't put too much thoughts on UX but at a first call there'll be

lmk what you guys think about this approach, think a PoC with such can be done quite easily but wouldn't spend time on it if not of interest. thanks

grandizzy commented 5 months ago

I made a quick PoC, see https://github.com/grandizzy/foundry/tree/metrics/crates/metrics/examples#metrics-demo for reference (the code adds a metrics crate; there's no config yet, the cheatcodes are not in their own metrics group, and better UX is needed - for a quick view of the code changes pls see https://github.com/grandizzy/foundry/commit/919e84b29dc757ad40dd773baf86193a9b2ef5a9). The PoC uses the OTEL collector to record metrics simultaneously in a file and by sending them to a local carbon endpoint; they are then shown in a Grafana dashboard

grafana

List of exporters that can be used by otel config file can be found here

I see three use cases / dimensions that can be accomplished by having such

  1. Basic, understand how campaigns are performing

    • case 1: dumping metrics in file for campaign review
    • case 2: writing to a backend and visualize campaign statistics (carbon/Grafana, Prometheus, others, see exporters)
  2. Actions performed based on collected test metrics:

    • case 3: integrate in a CI pipeline and stop/restart campaign if it violates certain thresholds (number of unique values less than threshold, number of reverts greater than threshold, times a selector was hit less/greater than threshold). Ex: export to kafka, AWS Kinesis, others, see exporters
  3. Adapt fuzzing campaigns based on collected metrics:

    • if PR https://github.com/foundry-rs/foundry/pull/7428 is accepted, it will introduce the concept of test fixtures / data sets used in fuzzing, in a first phase as inline forge-config: fixture config
    • next step for fixtures would be to have cheatcodes for loading test data sets from file/URL
    • then metrics can be used to provide custom input for fuzzing campaigns

For example: metrics are collected and exported to AWS Kinesis, or Apache Kafka, etc. then campaign metrics are processed and

  1. saved in persistent layer (S3, etc) for historical analysis
  2. campaign metrics are used to generate new datasets that will be used by further campaigns. Rules to apply could be to ensure values are unique across several campaigns, or repeatable to some extent, or that new datasets are derived from already used values, etc.

any feedback appreciated, thank you

grandizzy commented 5 months ago

@Evalir we can continue the discussion here. Here is a quick overview of what the changes to have such metrics would be: https://github.com/foundry-rs/foundry/compare/master...grandizzy:foundry:metrics (see also the previous comments re the sample and other use cases this could be used for). Wanted to get your thoughts first before polishing the code and issuing a PR, as it introduces some new deps (if you're not comfortable with that due to supply chain attack events, it can be built / cfg'ed on demand). If all good I can do the PR + update the foundry book with how it can be used / examples.

grandizzy commented 5 months ago

Adding on OpenTelemetry adoption / usage - Grafana just announced their open source collector (Grafana Alloy) https://grafana.com/blog/2024/04/09/grafana-alloy-opentelemetry-collector-with-prometheus-pipelines/