Consider adding "batch" transactions

michaelpj commented 5 years ago

Suppose you have a chain of transactions, A -> B -> C, where each one spends outputs from the previous one. (This is only really a thing you want if you have scripts or multi-currency, since for "normal" transactions you can usually just merge them.)

At the moment, such a sequence will take |sequence| slots to fully validate, if we wait for each transaction to be successfully validated before submitting the next one, which is rubbish.

It would be nice to be able to "batch" such transactions.

Concretely, I propose an new alternative TxBody:

data TxBody = .... | Sequence [TxBody]

What does this mean?

The semantics are simple: it's exactly the same as applying each TxBody in sequence. It validates iff the whole sequence does. If part of it fails to validate then the whole thing is rejected.
It's a new kind of TxBody, which allows us to have witnesses over the whole sequence.
- This would enable other use cases where multiple parties could coordinate to create transaction batches that interact with resources owned by different parties, and only proceed if the whole batch is signed. This allows e.g. atomic swaps.
- The alternative would be data Tx = ... | Sequence [Tx], which would also work fine for batching.

I think this would be relatively straightforward to add. The wallet backend would need to know about them, but I don't expect the frontend would, since users probably don't want to make these very often. Rather, the clients are likely to be "smart contracts"/wallet backend applications.

michaelpj commented 5 years ago

This increasingly seems like something we'll want. In particular, at the moment you end up needing an administrative transaction to "jump start" use of a new currency. If we want people to use tokens heavily for e.g. roles then we want it to be as lightweight as possible, and having to do multiple transactions is slow and/or racey.

cc @polinavino

michaelpj commented 5 years ago

Flipping this around: this says that we can retain the property that dependent transactions can always be merged even in the presence of scripts, which is nice!

In fact, currently you can only merge dependent transactions if you're the author of both, since you need to sign the combined transaction which is different from the constituents. Batch transactions would let you merge transactions without disturbing the signatures.

polinavino commented 4 years ago

I am not totally convinced of the advantage of having batch transactions. I think the type of functionality you're aiming for is better provided by things like Hydra? If i understand correctly.

One thing that does concern me slightly is (due to the addition of the 2-phase validation for running scripts) the possibility of the creator of a block simply omitting a valid transaction that several other transactions depend on.

Here, I mean depend on in a way that does not cause the whole block to be scrapped, but rather phase 2 validation failures that result in more total fees being collected from the transactions in this block. Here, recall that a second-phase failure collects all fee-marked inputs, that could have greater total value than the required script fees that would be collected if all scripts validate.

Could this happen??

michaelpj commented 4 years ago

I think the type of functionality you're aiming for is better provided by things like Hydra or Lightning?

Could you elaborate?

Here, I mean depend on in a way that does not cause the whole block to be scrapped, but rather phase 2 validation failures that result in more total fees being collected from the transactions in this block.

I'm pretty sure this can't happen. The only way that transactions can depend on each other is via transaction outputs. So the creator of a block could either:

Omit a transaction. At worst later transactions will fail in phase 1 due to spending missing outputs.
Tamper with the outputs produced by an earlier transaction. But that would invalidate the signatures, and the txids, and all the stuff we have to prevent this.

(Also, I don't think this is about batch transactions per se, but rather about the process of validation in general?)

polinavino commented 4 years ago

I'm pretty sure this can't happen. The only way that transactions can depend on each other is via transaction outputs. So the creator of a block could either:

Omit a transaction. At worst later transactions will fail in phase 1 due to spending missing outputs.

Tamper with the outputs produced by an earlier transaction. But that would invalidate the signatures, and the txids, and all the stuff we have to prevent this.

(Also, I don't think this is about batch transactions per se, but rather about the process of validation in general?)

Yeah, it is about validation in general. I was picturing a situation where if tx1 is processed, the UTxO will have in it (hash tx1, ix) |-> (addr1, value1, datahash1)

If it does not get processed, there will not be any entry in the UTxO indexed by (hash tx1, ix). If tx2 is being processed and trying to spend this entry, it will be phase 1 validation failure. If that entry is indeed in the UTxO, then the rest of the arguments passed to the validator cannot be effected by whether tx1 was processed or not. So, in short, looks like you are right and this cannot happen.

polinavino commented 4 years ago

I am not totally convinced of the advantage of having batch transactions. I think the type of functionality you're aiming for is better provided by things like Hydra? If i understand correctly.

Also never mind about this - I am convinced now that batching functionality could be useful

jmchapman commented 4 years ago

Alternative proposal sketch:

Instead of adding an additional transaction type to support batching/atomic sequences of transactions, relax the validation rules to allow batching in a single transaction.

For ordinary (non script) transactions, it is possible to batch together several payment into one as exchanges regularly do.

If we were to allow the spending of outputs within the same transaction as they are created then we could support the same 'monoidal' behaviour as before.

@michaelpj pointed out that this has the shortcoming that currently validator scripts can require very specific constraints on a transaction such as there are exactly a particular number of inputs etc. However, I think it's worth considering if such constraints are really needed/are not too intentional.

polinavino commented 4 years ago

There might be some way to spend the outputs generated within the transaction, but it will involve some way of being clever about having references to those outputs inside the transaction itself. The usual way, (hash tx, ix) means that the hash of the transaction has to be calculated and put inside that same transaction - which can't happen.

michaelpj commented 4 years ago

It just seems very hard to have this work nicely. Even very simple things will break. Consider e.g. someone who tries to find all the money in some currency by summing up all the outputs. Except - oops, in the merged trasnsaction it gives a completely different answer.

We would require validator authors to program very defensively against this condition, which they probably wouldn't, and which we could never assume that they had, so merging together random transactions would usually be unsafe.

So I think you'd really need to have the inputs/outputs "for each transaction" segregated and only seen by the validators in that tx. And then probably the same applies for the other tx metadata. And then you've just got batch transactions again.

jfischoff commented 3 years ago

I would really like to have this.

There are many interesting DeFi applications which would benefit from batching. Some, might be impossible to port from Ethereum without batching.

Batching also seems important internally for being able to support rollup like Layer 2s.

Conceptually I could see how validation would work with batching.

Are there any plans to implement this?

yihuang commented 2 years ago

It could be simply a heuristic rule for the mempool and block propose logic, like the honest nodes will propagate the batch together, and include them into the block together. But even if a node breaks them up, they can be still valid, we don't need to change ledger rules for this I guess.

Unless to implement the babel fee, we need a ledger rule change to allow temporary liability in the middle of the batch. Since we are here, I wonder can we implement the flash loan with this temporary liability?

JaredCorduan commented 2 years ago

this would make a great CIP

michaelpj commented 2 years ago

This will overlap with limited liabilities (which also needs a CIP!).

L-as commented 2 years ago

At the moment, such a sequence will take |sequence| slots to fully validate, if we wait for each transaction to be successfully validated before submitting the next one, which is rubbish.

I was under the impression that there is no such limitation? Where is this encoded in the ledger?

E-mail from Duncan (I am not the one replied to) for reference:

> As a follow-up question: if a node's validation context includes the
> pending transactions that it has validated previously, does that mean
> that it can validate a transaction that attempts to consume an output
> from one of those pending transactions?

Yes indeed! Because as I note above, it's a sequence of transactions,
and the validation context is a ledger state, then yes there can be
dependent transactions. This is perfectly legitimate, and the
transaction propagation mechanism is designed to avoid reordering them.
So it's actually a use case that's supported by the design.

> In other words, is it possible for a chain of transactions to be
> added to the same block?

Yes. Now of course in practice the transactions have to be submitted in
order, so this is easier to arrange if it's one agent that is
submitting them all in a sequence.

This can also be noted from the section about colours from https://raw.githubusercontent.com/cardano-foundation/CIPs/ouroboros-leios/CIP-XXXX/leios-design.pdf (https://github.com/cardano-foundation/CIPs/pull/379)

jmhrpr commented 2 years ago

At the moment, such a sequence will take |sequence| slots to fully validate, if we wait for each transaction to be successfully validated before submitting the next one, which is rubbish.

I was under the impression that there is no such limitation? Where is this encoded in the ledger?

@L-as I read it as if you are submitting a sequence of transactions and want to see they are each valid before submitting the next then it will take a ~few slots. With them batched it would be a single submission. (but not sure)

What you describe about having transactions which depend on transactions which come before it in the same block is possible and already seen on-chain.

michaelpj commented 2 years ago

I was under the impression that there is no such limitation? Where is this encoded in the ledger?

Indeed the key part was "if we wait for each transaction to be successfully validated before submitting the next one", i.e. if you want to know that transaction T1 was definitely added to the chain before submitting T2.

L-as commented 2 years ago

I don't see the use case for that scenario, though I definitely do want batching.

dcoutts commented 2 years ago

@michaelpj but why would you want to wait to know that T1 was definitely added before submitting T2? Given that T2 depends on T1, we don't need to know that T1 already made it into a block. If T1 were invalid, T2 will also be rejected. I don't see a problem.

Yes, batching would give atomic behaviour, which may be useful, but it's simply not the case that a sequence of dependent txs needs N blocks to incorporate onto the chain. The system is designed to allow submitting dependent transactions all in one go and have that work.

michaelpj commented 2 years ago

I'm no longer sure that non-atomic batches are that useful outside of limited liabilities (which changes things).

I do think it's quite non-obvious that the mempool has this property, and it's unclear if it's guaranteed at all. So perhaps we just need better documentation.

L-as commented 2 years ago

Non atomic batches can implement atomic batches potentially. Will put down my thoughts later.

L-as commented 2 years ago

Reason to have at least basic batching: Some times a batch of transactions can be profitable even if the individual steps aren't all profitable. Example: UTXO locked with script that allows you (and only you) to relock the funds with a script that allows anyone to consume it. We'd like to take the funds and use it on something, which is technically possible in two transactions, but what if upstream drops the second transaction and only uses the first? We obviously want to prevent this, and a way to sign the entire batch such that the entire batch has to go through at once would be immensely useful. It's possible that such a batch should have the same limits as a single transaction, since it acts like a single atomic transaction. The same reasoning as for transaction applies here.

michaelpj commented 1 year ago

Here's a summary of my current thinking on what we might want to do:

Atomic batches

An atomic batch is a batch transaction that cannot be split apart, and all of the transactions must apply in order for any of them to be applied.

Atomic batches are desirable because:

They restore the monoidal property of transaction bodies: you can now always merge two transaction bodies by making them into a batch.
They can let you avoid scripts, e.g. you can implement a simple atomic swap with just atomic batches and no scripts.
They can allow you to take a series of actions with no possibility of someone else intervening in the middle. As a somewhat contrived example, you could withdraw money from a DeFi contract in a way that would render you vulnerable to liquidation, use it for something, and then put it back where it was, all in one atomic move.

One concern is to ensure atomicity in the face of interference: we don't want other users to be able to disassemble batches, and resubmit only parts of them, as that would violate atomicity. This probably means: a) we want the batch to be signed b) the signatures must be resistant to stripping c) it must not be possible to extract a part of the batch that is a valid transaction by itself

The only sensible way of doing this that I can see is for an atomic batch to be a new kind of transaction body, which is a sequence of transaction bodies. Then the witnesses (including signatures) are shared across the batch and naturally ensure integrity of the entire batch. Since the signatures are for the entire transaction body (the batch) they can't be reused for a subset of them.

Non-atomic batches

A non-atomic batch is a batch that can be broken apart.

Non-atomic batches are useful primarily in cases where you want multiple parties to collaborate in creating batches without two way communication (to e.g. agree on signing the whole thing).

The only real usecase that I know of for non-atomic batches is from the limited liabilities paper, where:

Liabilities are allowed, but only inside batches
Other parties can an incentive to "complete" a partial batch, allowing for a unidirectional matching system

Other than that, so long as nodes typically respect them they allow you to explicitly state the hint that you can implicitly give today by just submitting your transactions to your local node in that order. But if we had atomic batches then you could just use an atomic batch. In the context of Leios possibly this hint becomes more valuable if it can replace the use of colours with some thing more semantic - but again you could just use an atomic batch.

Since the pieces of a non-atomic batch should be valid independently, we must not sign them together. So a non-atomic batch would have to be a new kind of transaction that just consisted of a sequence of transactions.

L-as commented 1 year ago

Non-atomic batches are AFAICT not a suitable replacement to colouring. The important thing to note, is that if you can have liabilities inside them, then inherently the batch must be cancellable. This is problematic. Colours have no such behaviour, and a sequence of same-colour transactions can be potentially infinite with no issue.

Another thing I'm wondering about, is how are fees handled for atomic batches? I have over the last year spent considerable brain cycles on protocol interoperability, but have still not found a solution. Are atomic batches really true monoids? What if it doesn't fit in a block? Why would an atomic batch have a different size limit than a standalone transaction? In the first place, it's not clear to me what the exact reasoning is for the limits. The core issue of protocol interoperability is that there is no true monoid. Even if two protocols work separately, it is not always possible to combine them due to limits. If the limit is practically unreachable, that's not a worry, but this is far from the case currently. If any of you want to discuss protocol interoperability I'm up for that (in real-life or digitally).

michaelpj commented 1 year ago

Colours have no such behaviour, and a sequence of same-colour transactions can be potentially infinite with no issue.

Why are these properties desirable, though? I'm wary of adding new desiderata based on the coincidental properties of the colour system. The stated reason for wanting colours is essentially to allow people to retain the current behaviour where they can submit a sequence of dependent transactions and have them processed in order. Batching seems like a more declarative way to get that behaviour without needing colours.

Are atomic batches really true monoids? What if it doesn't fit in a block? Why would an atomic batch have a different size limit than a standalone transaction?

They are as monoidal as pre-Alonzo transactions used to be, or Bitcoin transactions are. In essentially all cases we have size limits to deal with, so it's never going to be possible to get a true monoid. But we can get a monoid-up-to-limits, which is something. It's pretty convenient for building transactions and it does allow pasting things together in many cases.

It would be cool if we got a true monoid. That would require the individual bits of the batch to have independent limits, though :thinking:

L-as commented 1 year ago

Because often you have an infinite chain of dependent transactions working on the same protocol.

I'm still thinking of ways to get something that is mostly a monoid.

michaelpj commented 1 year ago

Okay, this is a new goal: support infinite sequences of dependent transactions, emitted at a steady rate (is that right?).

That doesn't seem to me like something that fits into the metaphor of "batches" and is something quite different.

IntersectMBO / cardano-ledger

Consider adding "batch" transactions #768