Automated costing - GitHub issues

kozross commented 2 years ago

Describe the feature you'd like

According to the information provided here and here, the current workflow for costing a new builtin is as follows:

1. Add benches for said builtin in plutus-core/cost-model. This step is manual, and can't really be any other way.
2. Run cabal bench plutus-core:cost-model-budgeting-bench. This step is automatic, but extremely slow; my attempt here took close to 12 hours to complete on quite capable hardware.
3. Run cabal bench plutus-core:update-cost-model. This step is automatic, but with some caveats, as the linear regression being used might produce 'correct but nonsensical' outcomes, such as negative cost coefficients.
4. Rebuild to generate costing functions. This step is automatic.
5. Run cabal bench plutus-core:cost-model-test to verify that our costing makes sense. This step is automatic (enough).
6. Run more benchmarks for larger programs. This step is not automatic, but there's no reason why it couldn't be.
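To illustrate the 'negative cost coefficients' caveat in the update-cost-model step, here is a minimal sketch in Python. It is not the regression the Plutus tooling actually uses; the function names and the sample numbers are made up for illustration. It fits time = intercept + slope * size by ordinary least squares, then falls back to a constrained fit when a coefficient comes out negative.

```python
from statistics import mean

def fit_linear(sizes, times):
    """Ordinary least squares for time = intercept + slope * size."""
    mx, my = mean(sizes), mean(times)
    sxx = sum((x - mx) ** 2 for x in sizes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sizes, times))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def fit_nonnegative(sizes, times):
    """Same model, but with both coefficients forced to be >= 0.

    Assumes sizes and times are positive, as benchmark data would be.
    """
    intercept, slope = fit_linear(sizes, times)
    if slope < 0:
        # Cost does not grow with size: fall back to a constant model.
        return max(mean(times), 0.0), 0.0
    if intercept < 0:
        # Refit a line through the origin instead.
        sxy0 = sum(x * y for x, y in zip(sizes, times))
        sxx0 = sum(x * x for x in sizes)
        return 0.0, sxy0 / sxx0
    return intercept, slope

# Hypothetical benchmark data: the plain fit gives a negative intercept.
sizes = [1, 2, 3, 4]
times = [1.0, 3.0, 5.0, 7.0]
plain = fit_linear(sizes, times)
constrained = fit_nonnegative(sizes, times)
```

On this data the plain fit is statistically fine but yields an intercept of -1, which makes no sense as a cost; the constrained fit trades a little accuracy for coefficients that can actually be used.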

At MLabs, we believe it is important that costing reflects optimizations, ideally without requiring too much manual intervention. The current system does contain significant automation, but it still requires considerable manual intervention and isn't terribly practical, as evidenced by my attempt described above. It's not reasonable to run something this time-consuming as an automatic job every time someone optimizes (or attempts to optimize) a primitive, or indeed does something that might cause a primitive to regress and so require a more pessimistic costing. In an ideal world, this would be part of the CI: specifically, if we have a costing regression, CI would blow up.
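As a rough sketch of what 'CI blows up on a costing regression' could look like, assuming the costing step can emit one number per builtin: the function below compares a candidate set of costs against a stored baseline and reports anything that grew by more than a tolerance. The function name, the tolerance, and the numbers are all hypothetical; nothing like this exists in the repository as far as this thread states.

```python
def costing_regressions(baseline, candidate, tolerance=0.05):
    """Return {builtin: (old, new)} for costs that grew by more than `tolerance`.

    `baseline` and `candidate` map builtin names to a single cost figure.
    Builtins missing from `candidate` are skipped rather than flagged.
    """
    regressions = {}
    for name, old in baseline.items():
        new = candidate.get(name)
        if new is None:
            continue
        if new > old * (1 + tolerance):
            regressions[name] = (old, new)
    return regressions

# Hypothetical figures, for illustration only.
baseline = {"addInteger": 100, "sha2_256": 2000}
candidate = {"addInteger": 101, "sha2_256": 2500}
found = costing_regressions(baseline, candidate)
```

A CI job would exit non-zero whenever the returned dictionary is non-empty. The tolerance is there because benchmark timings are noisy; choosing it well is its own problem.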

Lastly, the 'reference machine' is a serious problem, as it 'ties' costing to a specific combination of hardware and OS. Additionally, it makes it impossible for contributors to do costing themselves, even though enabling that is a stated future goal here. In fact, I already ran afoul of this!

Describe alternatives you've considered

This is a tricky intersection of problems. On the one hand, in order to run the process of costing as part of CI, it would need to be significantly faster than it currently is. However, it's not clear how this would be possible: it would require investigation as to how the longest stage could be optimized. Furthermore, the question of the 'reference machine' needs to be addressed before we can do this.

One easy way to avoid the problem of a reference machine is 'yardstick primitives'. We essentially select some trusted primitives to act as a 'reference measure', or 'definition' of 1 and n, then measure everything by comparison to these. Thus, a cost would not be an absolute value: instead, it would be a function of the costs of one or more 'yardstick primitives'. This would solve the reference machine problem at least.
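A minimal sketch of the yardstick idea, under the idealised assumption that two machines differ only by a uniform speed factor: each builtin's timing is divided by the timing of a chosen yardstick measured on the same machine, so the absolute speed cancels out. The builtin names and timings below are invented for illustration.

```python
def relative_costs(timings, yardstick):
    """Express every timing as a multiple of the yardstick's timing."""
    unit = timings[yardstick]
    return {name: t / unit for name, t in timings.items()}

# Hypothetical timings (arbitrary units) for the same builtins on two machines.
fast_machine = {"yardstick": 2.0, "someBuiltin": 6.0}
slow_machine = {"yardstick": 5.0, "someBuiltin": 15.0}

fast_rel = relative_costs(fast_machine, "yardstick")
slow_rel = relative_costs(slow_machine, "yardstick")
```

Both machines give someBuiltin a relative cost of 3.0 here. Real hardware will not scale every primitive by the same factor (caches, memory bandwidth and so on differ), so in practice the ratios would only agree approximately.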

michaelpj commented 2 years ago

Yes, we would like to make this more automated, but it's a matter of attention. In practice, the builtins don't change very often, so it's not the end of the world that it's quite manual.

In particular, making this easy for external contributors would be nice, but I suspect would be a lot of effort.

effectfully commented 1 year ago

I think this is considered low priority by IOG right now and isn't being worked on actively, so I'm adding such a label. @kwxm please correct me if I'm wrong.

kwxm commented 1 year ago

We do want to do this eventually, partly because the process really will have to be accessible to the whole community as we move towards decentralisation. We have an issue for this in JIRA, but unfortunately we haven't had the time to work on it yet. We're conscious that we need to do something though.

IntersectMBO / plutus

Automated costing #4384
