IntersectMBO / plutus

The Plutus language implementation and tools
Apache License 2.0
1.54k stars 468 forks source link

Why does the `txInfoMint` `Value` contain `0` lovelace? #5039

Open christianschmitz opened 1 year ago

christianschmitz commented 1 year ago

I noticed that when cardano-cli creates the ScriptContext to calculate the execution budget, each Value instance will always contain some lovelace, even the txInfoMint Value.

Of course the lovelace entry in txInfoMint will always be 0, so I was quite surprised to see that the 0 lovelace entry is always prepended to txInfoMint when creating the ScriptContext (was a particularly difficult to debug issue for the Helios execution budget calculator).

Always including this is also seems wasteful, as minting policy contracts will usually do some iterating over all the entries in the txInfoMint map.

Another reason not to include it is that it would be nice for the index of the asset being minted in txInfoMint to correspond directly with the Minting redeemer index.

So my two questions:

michaelpj commented 1 year ago

There are generally no guarantees about the particular representation of Value so long as it is semantically correct. An entry being missing and mapping to zero are semantically the same and so you should support both.

https://github.com/input-output-hk/plutus/issues/4509

christianschmitz commented 1 year ago

Javascript libraries are however dependent on APIs like Blockfrost to submit transactions, and if those APIs decide to use a version of cardano-cli that always includes 0 lovelace in txInfoMint, it changes the execution budget. (Note: the execution budget must be calculated before sending the transaction to the API, so the Javascript library must do exactly what the API does).

So in this case I could say that Blockfrost/cardano-cli must support 'both' ways, and somehow must detect that the minting script was originally executed without 0 lovelace in the txInfoMint value, and afterwards the cardano-node must again support 'both' ways by doing the same detection.

I think it really makes more sense to enforce normalization of any Value in ScriptContext to avoid this issue.

michaelpj commented 1 year ago

I'm sorry, I don't understand your objection. A system that claims to calculate the budget for scripts must behave the same as the Cardano node, otherwise it is wrong. It sounds like you've found a bug in whatever Blockfrost is using.

christianschmitz commented 1 year ago

Blockfrost uses cardano-cli to check the execution budget. During that check cardano-cli prepends 0 lovelace to the minted Value. This is obviously also what cardano-node does, so this isn't a bug.

But doing this does incur unnecessary execution budget. Think of this as low-hanging fruit for optimization.

effectfully commented 1 year ago

@christianschmitz

But doing this does incur unnecessary execution budget.

Do you happen to know how much the additional cost is?

christianschmitz commented 1 year ago

I've seen about a 10% hit on cpu and mem budget for typical minting scripts.

effectfully commented 1 year ago

@christianschmitz

I've seen about a 10% hit on cpu and mem budget for typical minting scripts.

That's a lot of percent...

You write:

if those APIs decide to use a version of cardano-cli that always includes 0 lovelace in txInfoMint

Are there some versions of cardano-cli that include 0 lovelace in txInfoMint and some that don't? Do you happen to know what the current behavior of cardano-cli is?

@michaelpj

An entry being missing and mapping to zero are semantically the same and so you should support both.

This sounds like an argument in favor of normalizing the Value like @christianschmitz suggests. Why not remove some noise and reduce the cost of executing a script at the same time?

michaelpj commented 1 year ago

I am more convinced by the performance argument indeed. I'm surprised at the numbers, though, and I'd like to see some benchmarks.

Perversely, I almost appreciate the current behaviour because it forces people to properly account for the fact that values might not be normalized. If we start normalizing them then we will likely have to commit to always being careful to do so, since people will start to assume it.

effectfully commented 1 year ago

I'm surprised at the numbers, though, and I'd like to see some benchmarks.

@christianschmitz would you be interested in providing us with such benchmarks?

Perversely, I almost appreciate the current behaviour because it forces people to properly account for the fact that values might not be normalized. If we start normalizing them then we will likely have to commit to always being careful to do so, since people will start to assume it.

Also a very good point. But if we were to leave some 0s in there, we could as well only leave one and strip the rest in the name of performance gods.

christianschmitz commented 1 year ago

The following issue was caused by not prepending the 0 lovelace value to txMintedInfo when calculating the script budget using the Helios-SDK: https://github.com/Hyperion-BT/helios/issues/34

The following two files can be used as benchmarks to compare a normalized txMintedInfo to one containing the 0 lovelace value:

The difference is:

effectfully commented 1 year ago

Thank you very much @christianschmitz, this is extremely helpful.

The difference is: cpu: 367239733 - 341066723 = 26173010 (7.5%) mem: 1005266 - 937405 = 67861 (7.2%)

@michaelpj these are great numbers, I think the @christianschmitz's proposal is very worth a consideration.

However, would the sums-of-products thing incorporate these changes? I.e. is it the parsing that is so expensive here? Or is it something else?

christianschmitz commented 1 year ago

txMintedInfo is essentially a list, and in the first case the list contains 2 entries, in the second case only 1 entry.

So I don't think the sums-of-products CIP will influence this.

effectfully commented 1 year ago

@michaelpj have you by any chance spent some more time thinking about this issue? Maybe, you know, in the evening, when you sit with the family, but something deeply unsettling is disturbing your soul and makes you eventually stand up and shout "screw it! What about the zero lovelace issue?".

Anyways, for now I'm going to mark this issue as "low priority", since it's what we've been doing for all optimization tickets.

michaelpj commented 1 year ago

I have not been thinking about it and tbh I don't think it's going to make it to the top of the list soon.

JaredCorduan commented 1 year ago

@christianschmitz Thank you for bring this up. This would actually be a ledger change, as ledger is responsible for populating the script context.

We cannot change the behavior for V1 or V2, but I myself am in favor of no longer supplying the zero ada starting in V3. My only fear is that script authors might continue to make the assumption that ADA is always present. I actually have a PR up now for this change to V3 in the next ledger era, and I plan on making this change.

ch1bo commented 1 year ago

The Hydra team also stumbled over this. We are +1 for making this an empty value instead.

colll78 commented 1 year ago

+1 for making this an empty value.

Right now, the behavior is completely unintuitive, and introduces a ton of issues in development. You would expect to be able to check that nothing is minted with txInfoMint info == mempty. If you still allow prepending the 0 lovelace value to txInfoMint (but do not require it) then it maintains backwards compatibility.

I don't think this is just an optimization issue, it is a developer experience issue.

michaelpj commented 1 year ago

You would expect to be able to check that nothing is minted with txInfoMint info == mempty.

You can do this... if you have a correct equality function!

colll78 commented 1 year ago

With a correct equality function this should return false, because it is not actually the case the the value is empty. You can do this if you have a weird hacky equality function that checks that the normalized value is empty. Introducing arbitrary abstraction like that always leads to problems down the road because the user don't actually understand what is going on and what the equality function is doing.