Validating the next data script

j-mueller commented 5 years ago

This is one of two problems related to contracts that span multiple transactions. By "contract" I mean a set of contract endpoints that spend and produce outputs to some script address. By "span multiple transactions" I mean that the contract results in at least one transaction that consumes outputs from and produces outputs to the contract address (this is not true for the crowdfunding campaign).

The problem

We can think of the contract as a state machine, with states of type S and inputs of type I. The state is kept in the data script, and the input is supplied as the redeemer script. The transition function t :: S -> I -> S is the validator (throwing an error if the transition is not allowed).
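As a minimal sketch of this view, here is a plain-Haskell state machine. The turnstile-style states and inputs are hypothetical stand-ins for S and I, chosen only for illustration; they do not come from any actual contract.

```haskell
-- Hypothetical example types standing in for S and I.
data S = Locked | Unlocked deriving (Eq, Show)
data I = Pay | Push deriving (Eq, Show)

-- The transition function t :: S -> I -> S, doubling as the validator:
-- it throws an error when the transition is not allowed.
t :: S -> I -> S
t Locked   Pay  = Unlocked
t Unlocked Push = Locked
t s        i    = error ("invalid transition: " ++ show s ++ " with " ++ show i)

main :: IO ()
main = do
  print (t Locked Pay)     -- Unlocked
  print (t Unlocked Push)  -- Locked
```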

In order for this to work, we need a transaction tx that spends an output from the script address (this output determines t and the current state s :: S) using a redeemer with the input i :: I. tx also produces a new output to the script address, with a data script s' :: S such that s' = t s i. How can the validator script ensure that the new data script is actually t s i, and not some other state? If it can't, then the contract can be manipulated very easily, by providing a different data script. This is the "validating the next data script" problem.

It is a problem because inside the validator script we only know t s i :: S and hash(s') :: ByteString, so we're either missing the actual s' :: S in non-hashed form, or a way to compute hash(t s i) :: ByteString.

Possible solutions

1. Implement a hashing function inside PlutusTx.
2. Allow the validator script to look at the values of the data scripts of tx.
3. Change the encoding of the state machine so that the validator gets hash(t s i) from the slot leader.

For (1), we need to make sure that we are computing the correct hash. The hash(s') that the validator script can see is really the hash of the serialised PLC term s', including the (program ...) tag. So the thing that's actually necessary in Plutus is a function serialiseS :: S -> ByteString that produces exactly the same output. We could potentially generate serialiseS using Template Haskell, similar to makeLift.
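A rough sketch of what the check in (1) could look like, in plain Haskell rather than PlutusTx. Everything here is an assumption made for illustration: the example S and I types, serialiseS returning a String via show instead of the real serialised PLC term, and toyHash (FNV-1a), which is not the hash function the ledger uses.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

-- Hypothetical example types standing in for S and I.
data S = Collecting Integer | Closed deriving (Eq, Show)
data I = Deposit Integer | Close deriving (Eq, Show)

t :: S -> I -> S
t (Collecting n) (Deposit k) = Collecting (n + k)
t (Collecting _) Close       = Closed
t Closed         _           = error "contract is closed"

-- Stand-in for serialiseS :: S -> ByteString (assumption: show output
-- replaces the serialised PLC term to keep the sketch dependency-free).
serialiseS :: S -> String
serialiseS = show

-- Toy 64-bit FNV-1a hash; NOT the ledger's hash function.
toyHash :: String -> Word64
toyHash = foldl step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- The validator knows s, i and hash(s'); it recomputes hash(t s i).
nextStateMatches :: S -> I -> Word64 -> Bool
nextStateMatches s i hashOfNext = toyHash (serialiseS (t s i)) == hashOfNext

main :: IO ()
main = do
  let honest = toyHash (serialiseS (Collecting 15))
      forged = toyHash (serialiseS (Collecting 16))
  print (nextStateMatches (Collecting 10) (Deposit 5) honest)  -- True
  print (nextStateMatches (Collecting 10) (Deposit 5) forged)  -- False
```

The check only works if serialiseS and the hash agree byte-for-byte with what the ledger hashes, which is the hard part described above.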

For (2) we need to consider that we don't know the type of the data scripts produced by tx in the validator script (they may all be of different types). So we would need something like the Data.Dynamic module.

For (3) we can change the signature of the validator function to v :: (S, I) -> (S, I) -> (). (Note that the data and redeemer scripts now have the same type.) If we assume that the first element of the first argument is the current state, the first element of the second argument is the new state, and the second element of the second argument is the input, then we can transform the old state machine into this format: v (currentState, _) (newState, input) = if t currentState input == newState then checkHashes else error, where checkHashes compares the hashes of the redeemer and the new data script (both hashes are available as part of the tx data supplied by the slot leader). This works already, without requiring any changes to the language. But it means that every (state, input) pair is put on the chain twice.
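The encoding in (3) can be sketched in plain Haskell as follows. This is an illustration under several assumptions, not the actual implementation: the S and I types are hypothetical, TxHashes is a made-up record modelling the two hashes supplied with the tx data, toyHash (FNV-1a) is not the ledger's hash, and v takes the hashes as an explicit argument and returns Bool instead of () so that rejection can be observed without catching an error.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

-- Hypothetical example types standing in for S and I.
data S = Collecting Integer | Closed deriving (Eq, Show)
data I = Deposit Integer | Close deriving (Eq, Show)

t :: S -> I -> S
t (Collecting n) (Deposit k) = Collecting (n + k)
t (Collecting _) Close       = Closed
t Closed         _           = error "contract is closed"

type Hash = Word64

-- Hypothetical model of the hashes available in the tx data.
data TxHashes = TxHashes
  { redeemerHash      :: Hash  -- hash of the redeemer script
  , newDataScriptHash :: Hash  -- hash of the new output's data script
  }

-- Toy 64-bit FNV-1a hash over the show output; NOT the ledger's hash.
toyHash :: String -> Hash
toyHash = foldl step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

hashPair :: (S, I) -> Hash
hashPair = toyHash . show

-- Data script and redeemer share the type (S, I). The redeemer carries the
-- claimed new state, and the hash check ties it to the new data script.
v :: TxHashes -> (S, I) -> (S, I) -> Bool
v hashes (currentState, _) (newState, input) =
  t currentState input == newState && checkHashes
  where checkHashes = redeemerHash hashes == newDataScriptHash hashes

main :: IO ()
main = do
  let current  = (Collecting 10, Deposit 0)  -- input component is ignored
      redeemer = (Collecting 15, Deposit 5)
      honestTx = TxHashes (hashPair redeemer) (hashPair redeemer)
      forgedTx = TxHashes (hashPair redeemer) (hashPair (Collecting 16, Deposit 5))
  print (v honestTx current redeemer)  -- True
  print (v forgedTx current redeemer)  -- False
```

Because the redeemer and the new data script must be the same (S, I) value for their hashes to match, the pair ends up on the chain twice, which is the cost noted above.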

j-mueller commented 5 years ago

Going to close this as (3) works for now, and contract ownership can be implemented using a single-token currency or deserialisation.

michaelpj commented 5 years ago

I'd like to keep this open, I'm not really happy with (3) and I think it's generally going to be important for the validator to have access to the data script of the spending transaction.

effectfully commented 5 years ago

> In order for this to work, we need a transaction tx that spends an output from the script address (this output determines t and the current state s :: S) using a redeemer with the input i :: I. tx also produces a new output to the script address, with a data script s' :: S such that s' = t s i. How can the validator script ensure that the new data script is actually t s i, and not some other state? If it can't, then the contract can be manipulated very easily, by providing a different data script. This is the "validating the next data script" problem.

I've been thinking about this for a while and I still do not understand the current design. We have a "black-box" transaction that outputs something that we need to verify even though we know exactly what it's supposed to be. Why? Can we change something and not produce a data script in the transaction, so that we don't need to verify it and can just compute the next state using the redeemer?

nau commented 5 years ago

We are computing the next state in the validator using the redeemer. But we cannot pass this computed state into the next validator script other than via the data script. And the validator script cannot produce the output data script; it can only verify that the transaction already contains a valid data script (one equal to the computed state).

effectfully commented 5 years ago

@nau, thanks!

> And the validator script can not produce the output data script, it can only verify the transaction already contains a valid (equal to computed) data script.

Is there any way to change the current design, so that it makes sense to allow the validator script to produce the output data script? I understand it's not what we have right now, but can we change that somehow?

nau commented 5 years ago

Won't we get issues with closures then? Imagine a produced data script containing a function that encloses some values from the validator. We would need to store the whole context. Is this feasible in our case?

j-mueller commented 5 years ago

> Is there any way to change the current design, so that it makes sense to allow the validator script to produce the output data script?

Yeah we could change the return value of the validator from Bool to whatever the data script is.

But I agree @nau about closures, we would have to replay all validator scripts from the beginning of the contract whenever we want to know the current state.

effectfully commented 5 years ago

@nau,

> Won't we get issues with closures then? Imagine a produced data script containing a function that encloses some values from the validator. We would need to store the whole context. Is this feasible in our case?

I assumed, our data scripts are always closed right now? So we can just ban data scripts that reference some internal state of the validator. Or we can allow that and automatically let-bind the values (which essentially makes the data script closed), if it's necessary for some reason, and let the user decide whether it's feasible or not in their particular case.

> But I agree @nau about closures, we would have to replay all validator scripts from the beginning of the contract whenever we want to know the current state.

We already have to replay all the events of the blockchain in order to know its current state. As long as computing the state of the validator happens off-chain, it seems to align well with other parts of the system.

But can we allow the validator to somehow publish its data script? Like trigger a separate kind of transaction, I don't know.

I very well understand that I'm probably asking completely nonsensical questions here, I just feel really weird about verifying that what you got is what you already have and so I think it's natural to ask whether there is any way to avoid this by changing the design of the system.

Also, if the data script contains functions, how are we going to check equality of them?

nau commented 5 years ago

What if a data script is hard to calculate but easy to validate? With current design it's the job of a transaction producer to do all the heavy computations, otherwise every node/wallet would have to replicate it.

effectfully commented 5 years ago

> What if a data script is hard to calculate but easy to validate? With current design it's the job of a transaction producer to do all the heavy computations, otherwise every node/wallet would have to replicate it.

Regarding the problem that we discuss here, we always have to calculate the next data script in the validator in order to check that the redeemer's one is correct. There are cases where we don't need state machines and thus don't need to verify that the redeemer contains a valid state and we certainly should handle those cases somehow, but with the current design it seems we can't handle the dual situation where we have a data script that is easy to calculate, but hard to check that what was calculated and put into the redeemer is indeed correct. See:

> Also, if the data script contains functions, how are we going to check equality of them?

mchakravarty commented 5 years ago

@effectfully wrote,

> We already have to replay all the events of the blockchain in order to know its current state. As long as computing the state of the validator happens off-chain, it seems to align well with other parts of the system.
>
> But can we allow the validator to somehow publish its data script? Like trigger a separate kind of transaction, I don't know.

No, we cannot do that. The two things you mention are exactly the two things that Ethereum allows and Bitcoin-style UTxO doesn't: (1) publish the result of computation inside the validator (i.e., change the state of the blockchain as a result of validator computation) and (2) issue new transactions from inside the validator.

Changing this would fundamentally change the blockchain. One of the core concepts of Extended UTxO is that it adds some extra information to outputs and validator invocation, but it leaves the structure and dataflow of UTxO the exact same as before.

Apart from the fact that I think it wouldn't be a good idea, going back on this is not something we could decide unilaterally, as it has a serious impact on Cardano core. It would also have implications for the whole discussion on execution costs, killing the purity we just talked about in Miami.

> I very well understand that I'm probably asking completely nonsensical questions here, I just feel really weird about verifying that what you got is what you already have and so I think it's natural to ask whether there is any way to avoid this by changing the design of the system.

These are perfectly legit questions. However, we do what we do for a very good reason (what I wrote above). Yes, doing the data script validation in the validator is a bit of a hassle, but it is the validator after all and a bit of hassle here is a minor inconvenience compared to impacting the fundamental ledger structure.

effectfully commented 5 years ago

I see. Thanks a lot for the explanations, Manuel.

kwxm commented 5 years ago

See also https://github.com/input-output-hk/plutus/issues/1436, which may avoid this problem entirely.

kwxm commented 5 years ago

We've now implemented #1436, so I'm going to close this (yay!).

IntersectMBO / plutus

Validating the next data script #426