Add static cost analysis

hugocaillard commented 1 year ago

Goal

Statically compute the costs of a given snippet in a contract. The idea is to be able to display the cost a function directly in the code editor.

Notes

This idea has already been considered in the stacks-blockchain repo (https://github.com/stacks-network/stacks-blockchain/issues/1185), so it should be done a way that the code could be moved to this repo.

Implementation

We can only compute the worse case scenario of the cost of a function. For function like print-input below, we would always consider that the argument is a string with a length of 16.

(define-public (print-input (input (string-ascii 16)))
  (ok (print input))
)

In case there are multiple code path, we can compute the costs for all of them. The function print-and-maybe-save below has two paths because of the if

(define-data-var count int 0)

(define-public (add-and-maybe-save (save bool) (a int) (b int))
  (let ((new-count (+ a b)))
    (if save
      (begin
        (var-set count new-count)
        (ok new-count))
      (ok new-count))))

Its static cost analysis would look like (max count and percentage omitted for brevity)

[
    // save == true branch
    CostAnalysis {
        costs: ExecutionCost {
            runtime: 1260000,
            read_count: 4,
            read_length: 1114,
            write_count: 1,
            write_length: 17,
        },
        span: [Span(4, 1, 6, 12), Span(7, 7, 9, 23)]
    },
    // save == false branch
    CostAnalysis {
        costs: ExecutionCost {
            runtime: 1222000,
            read_count: 3,
            read_length: 1114,
            write_count: 0,
            write_length: 0,
        },
        span: [Span(4, 1, 6, 12), Span(10, 7, 10, 20)]
    }
];

Note: this is a bit simplified: in reality, the costs data structure should be recursive (with a new node for every code branch), might be more realistic for complex exemples

sabbyanandan commented 1 year ago

@hugocaillard: A few (noob) questions re: static analysis.

a: How close the computed outcome would be in comparison to the actual execution cost? I suppose it depends on many aspects (eg: current network fee rate & congestion); curious if we know what the difference would be so we can do the right thing to highlight that in our tooling & docs.

b: If we push this component upstream, I assume we would reuse and wrap that same logic in the clarinet test --cost command and within Clarity Insights through the Clarity extension for VS Code. How about other products? From a consistency standpoint, it would be beneficial to harmonize this feature across all touchpoints if possible (eg: API).

hugocaillard commented 1 year ago

@sabbyanandan Good questions!

a.

So this feature is about the following costs metrics: runtime, read_count, read_length, write_count, write_length. And not about the network fees. So network fee rate and congestion have no impact on it (let me know if that doesn't make sense, I can elaborate).
How close the computed outcome would be in comparison to the actual execution?

It kind of depends of the code we analyze. Since this is a static analyze (the code is not executed), we can only compute "best case" and "word case scenarios". So we take a contract with the given function:

(define-read-only (print-a (a (string-ascii 16)))
  (print a)
)

The best case scenario is to call (print-a "") (where a is an empty-string, runtime cost is 1760.
The worst case scenario is to call (print-a "0123456789abcdef") (where a has the max length of 16), runtime cost is 2032.

So until now, we only have have tools (api, clarinet, etc) that compute the cost of a given transaction. This feature is about estimating the worst (and best?) case scenarios of a function (or any snippet).

Of course, thing get a bit more complicated when the function has multiple code path (due to if, match, asserts!...), but the principle still applies.

b. As I said above, this feature is not exactly the same as what we already do with clarinet costs or the API estimations (execution costs vs static analysis). But I agree that:

it would be nice if the API and Clarinet used the same code to compute execution costs (not super easy because one is TS and the other in Rust), still doable
the static analysis costs logic could be reused in the explorer, clarinet and other tools 👍

obycode commented 1 year ago

Are you planning to implement this with an AST traversal or a simulated execution?

hugocaillard commented 1 year ago

@obycode the current idea is AST traversal. Do you think simulated execution is more realistic?

obycode commented 1 year ago

No, I agree, AST traversal is the better option, as that will allow you to easily explore all paths through the code, and even estimate costs for code that doesn't run. It will be a bit of a pain, since you'll need to translate all of the places where cost is added during execution over from the VM. The read/write count/length estimates will be easy, but the runtime values will be more tricky.

hugocaillard commented 1 year ago

@obycode

It will be a bit of a pain

Yeah I'm starting to see that 😶

since you'll need to translate all of the places where cost is added during execution over from the VM

You mean to adapt the code from the VM?

the runtime values will be more tricky

Can you give a bit more details about it? I think I kind of get why it'll be complicated but I probably don't have the big picture

obycode commented 1 year ago

It's because the runtime costs include more than just the basic execution -- things like type-checking, lookups, etc.

hirosystems / clarinet