Open nothingmuch opened 1 month ago
It seems to me that Payment Cascade is a special case among the scenarios. Communication from A -> B -> C -> B -> A is serial and introduces no additional rounds of communication. For Bob and Carol to add inputs, you do note that the requirements on the PSBT Carol receives would need to be relaxed, since Bob cannot add new signed inputs without invalidating Alice's signatures.
However, a narrow case of Payment Cascade is possible with a mere 'soft fork' (additional requirement) to the payjoin protocol, if you will. Bob may allocate some of his output paid from Alice in the Original PSBT to Carol, without adding any inputs, and pass the Original PSBT to Carol who may add inputs and alter Bob's payment output such that a Payjoin PSBT would satisfy Alice's payjoin checklist according to BIP 77.
Other than Payment Cascade, all of the scenarios appear to require more rounds of communication.
I must note that concrete privacy seems to be left out from this writing, on purpose. Without further client-side transaction construction preferences, the transactions produced by such a protocol would be analogous to blockchain.info's SharedCoin, which batched multiple transaction intents, without any mitigation against CoinJoin Sudoku analysis. In other words, these SharedCoin-style batches' privacy properties are likely vulnerable to simple subset-sum analysis because they don't produce transactions with any ambiguity guarantees. Without further cooperation, savings and thus scaling benefits are minimal, and an inexpensive algorithm can link input and output clusters and payment behavior. Discussing a brittle protocol as a starting point demonstrates the potential and limitations of the most basic multiparty transaction construction.
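The subset-sum linkage mentioned above can be illustrated with a toy brute-force search. This is only a sketch: the amounts are invented, fees are ignored, and the function names are hypothetical. Real analysis would have to tolerate fees and would need something better than exponential enumeration.

```python
from itertools import combinations

def subset_sums(values):
    """Map each achievable total to the index subsets that produce it."""
    sums = {}
    for r in range(1, len(values) + 1):
        for combo in combinations(range(len(values)), r):
            sums.setdefault(sum(values[i] for i in combo), []).append(combo)
    return sums

def link_clusters(inputs, outputs):
    """Return (input_subset, output_subset) index pairs with equal totals,
    skipping the trivial pairing of all inputs with all outputs."""
    in_sums = subset_sums(inputs)
    out_sums = subset_sums(outputs)
    links = []
    for total, in_sets in in_sums.items():
        for in_set in in_sets:
            for out_set in out_sums.get(total, []):
                if len(in_set) < len(inputs) or len(out_set) < len(outputs):
                    links.append((in_set, out_set))
    return links

# Hypothetical amounts in satoshis (fees ignored for simplicity).
inputs = [50_000, 30_000, 120_000]
outputs = [80_000, 75_000, 45_000]
links = link_clusters(inputs, outputs)
for in_set, out_set in links:
    print("inputs", in_set, "fund outputs", out_set)
```

With these amounts the only non-trivial matches split the transaction into two independent sub-transactions, which is exactly the kind of unambiguous partition an observer can exploit.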
@nothingmuch proposes that the Payjoin Directory Server support multiparty cooperation by storing incrementally augmented PSBTs in subdirectories. Each party may propose an updated PSBT in a new subdirectory with their selected Input and Output details.
Since subdirectories are pairwise and reading them is authorized by possession of a public Session key, "anyone who learns the session key could add superfluous TxOuts without even contributing inputs," introducing a Denial of Service attack vector. That is, a malicious client could propose inputs and outputs with no intention of completing a transaction, only of disrupting cooperation. I believe this statement depends on the ability to append to subdirectory contents. He suggests mitigating this by requiring input ownership proofs (BIP 322 signed messages proving control of an input) to authorize input registration, and homomorphic value commitments, which I understand as a limit on the amount of output value that may be registered based on proven input value.
Appending existing PSBTs to existing subdirectories would be more complex yet more efficient (in terms of storage, I presume) than having each updated PSBT state in its own subdirectory.
Input providers only proceed with signed PSBTs containing their requested outputs.
I imagine a mechanism that also allows for proofs that requested outputs are present, rather than plain output checking, so that Alice can send funds to Bob without knowing exactly which output(s) Bob ends up with (or chooses to forward to others). Just asking the counterparty you are paying whether they got what they wanted is sufficient.
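For contrast, the plain output check that such proofs would replace can be sketched in a few lines. This is a minimal sketch, not part of any payjoin specification: the `(script, amount)` tuple representation and the function name are assumptions made for illustration.

```python
from collections import Counter

def requested_outputs_present(requested, tx_outputs):
    """Return True if every requested output appears in the transaction
    at least as many times as this participant requested it."""
    need = Counter(requested)
    have = Counter(tx_outputs)
    return all(have[output] >= count for output, count in need.items())

# Hypothetical outputs as (script_hex, amount_sats) tuples.
mine = [("0014aa", 80_000), ("0014bb", 45_000)]
proposed_tx = [("0014cc", 75_000), ("0014aa", 80_000), ("0014bb", 45_000)]

if requested_outputs_present(mine, proposed_tx):
    print("safe to sign with all of my inputs")
```

Note that this check is purely local. If two participants request an identical output, each of them sees "their" copy even when only one is present, so on its own it does not catch the duplicate-output problem raised in the proposal.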
@nothingmuch proposes a Message Transcript, a log of all idempotent operations, to ensure that all parties agree on the set of intermediate messages before signing, as I understand it. Such a transcript verifiably specifies exactly which PSBT gets produced.
All transaction participants will need to agree on the following global transaction parameters
By this, the author notes that this protocol is not robust: it is sensitive to unexpected inputs. It may fail due to unmitigated Denial of Service or liveness problems. The protocol defined here may not recover from network disruptions or malformed input.
Just some initial thoughts:
This can be addressed by using BIP 322 ownership proofs to authenticate inputs (though not as strongly as BIP 77's original PSBTs)
In addition to BIP 322 ownership proofs, I believe there is also a need to prove that the UTXOs are actually on chain at a sufficient depth, to guard against re-orgs. An SPV proof should work fine here. A more far-out idea is to use STARKs. There are a number of projects currently being developed to solve this exact problem.
Safety can be enforced by each participant only signing with all their inputs if all their requested outputs are present.
I believe signers also need to keep state about previous payjoin rounds and which inputs they contributed. This is only an edge case if we support retrying. For example, if a peer withheld their signature, causing the round to fail, they could broadcast that transaction later, after a subsequent payjoin round succeeds, thus paying the same outputs twice. This is preventable, however, if all signers reuse the same inputs, or if there is at least one conflicting input.
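The conflict rule described above can be sketched as a small piece of signer-side state. This is a hypothetical illustration: the class, its method names, and the string outpoint format are assumptions, and it covers only the "at least one conflicting input" condition, not fee or RBF policy.

```python
class SignerState:
    """Tracks transactions this signer has signed that may still be
    completed and broadcast by a peer who withheld their signature."""

    def __init__(self):
        self.pending = []  # one frozenset of outpoints per signed transaction

    def record_signed(self, tx_inputs):
        """Remember the full input set of a transaction we signed."""
        self.pending.append(frozenset(tx_inputs))

    def safe_to_sign(self, tx_inputs):
        """A new transaction is safe only if it conflicts with (shares an
        input with) every earlier transaction we signed, so that at most
        one of them can ever confirm."""
        new_inputs = frozenset(tx_inputs)
        return all(new_inputs & old_inputs for old_inputs in self.pending)

state = SignerState()
state.record_signed({"txid_a:0", "txid_b:1"})   # round 1 failed after we signed

print(state.safe_to_sign({"txid_a:0", "txid_c:0"}))  # shares txid_a:0 with round 1
print(state.safe_to_sign({"txid_c:0", "txid_d:1"}))  # no shared input with round 1
```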
Does this protocol consider output splitting in the knapsack mixing sense?
In addition to BIP 322 ownership proofs, I believe there is also a need to prove that the UTXOs are actually on chain at a sufficient depth, to guard against re-orgs. An SPV proof should work fine here. A more far-out idea is to use STARKs. There are a number of projects currently being developed to solve this exact problem.
rejecting confirmed spent coins is more important IMO, there's good reasons to support double spending unconfirmed inputs (e.g. RBF)
that said there's definitely good reasons to only want older coins as counterparties... knowing the confirming block or having a full SPV/merkle inclusion proof is still very meaningful as DoS protection because confirming PoW is costly
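For reference, checking a merkle inclusion proof of the kind mentioned here is cheap. The following is a minimal sketch using Bitcoin's double-SHA256 tree rule (the last node is duplicated on odd-sized levels); the function names are hypothetical, and a real SPV check would additionally validate the block header's proof of work and its depth in the best chain.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_branch(txids, index):
    """Build the merkle root and the sibling branch for txids[index].
    txids are 32-byte hashes in internal byte order."""
    level = list(txids)
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # duplicate last node on odd levels
        branch.append(level[index ^ 1])  # sibling of the current node
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch

def verify_inclusion(txid, index, branch, root):
    """Recompute the root from a txid and its branch and compare."""
    node = txid
    for sibling in branch:
        if index % 2 == 0:
            node = dsha256(node + sibling)
        else:
            node = dsha256(sibling + node)
        index //= 2
    return node == root

# Stand-in txids; real ones would come from a block.
txids = [dsha256(bytes([i])) for i in range(5)]
root, branch = merkle_root_and_branch(txids, 3)
print(verify_inclusion(txids[3], 3, branch, root))
```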
I believe signers also need to keep state about previous payjoin rounds and which inputs they contributed. This is only an edge case if we support retrying.
retrying should be supported, but i think that kind of double spending / equivocation should be at least allowable by opt in (or even by default depending on circumstances), especially with full rbf making it much simpler to replace txs (respecting BIP 125 rules is possible, but messy in multiparty setting especially if more than one input was previously spent by a soon to be replaced tx, and those inputs can/should share the replacement fee obligation for the replaced tx).
that said, for that to be meaningful it requires DoS protection on output addition and tx space usage, which requires anonymous credentials, etc, definitely out of scope for this "brittle" strawman version
Does this protocol consider output splitting in the knapsack mixing sense?
for an initial version i think it's orthogonal and entirely up to the client
assuming it's extended with some of the coalition formation stuff (discussed with Dan, not yet written up in this context) it may make sense to have some constraints on when a quorum is formed that takes into account subset sum density as a constraint, but enforcing splitting directly would require some knowledge of user clusters so even then it's not clear to me how to make that an assurance rather than just signalling a preference (input set sumset sum density can be enforced which seems sufficient)
Multiparty cut through scenarios
The following are cut-through scenarios which are not supported in BIP 77, but might improve block space utilization and privacy.
In particular, privacy from counterparties could be obtained in a multiparty setting, since parties to a payment could avoid learning each others' inputs and outputs in principle, just not in the 2 party case which leaves no room for guessing.
Payment Cascade
Alice initiates a payjoin to Bob, who then initiates a payjoin to Carol.
In the current protocol, Bob couldn't take Alice's original PSBT, modify it to a payjoin PSBT, and then send that as his original PSBT to Carol, since he can't sign for Alice and original PSBTs must be fully signed.
Suppose it were not problematic to accept unsigned original PSBTs. In that case there wouldn't be a problem for him to wait for Carol to respond with her payjoin PSBT, fill in the signatures for his own inputs and pass that to Alice as his own payjoin PSBT.
Payment Cycle
If Carol additionally wants to pay Alice in the same transaction, initiating a payjoin to Alice, an additional restriction is that Alice is not allowed to modify her own change output(s) from the first original payjoin, since for all she is supposed to know those might be Carol's. This presents an inefficiency and a privacy leak for Alice with respect to 3rd parties (not Carol).
Suppose this restriction were also lifted. Alice could reply to Carol with a payjoin PSBT with all her inputs signed, Carol would amend it with her signatures, and finally Bob's signatures would complete the transaction, which he could still send as a payjoin PSBT to Alice to complete the interaction.
Note that this is a strawman: all parties learn that there was a payment cycle, and likely each other's output mappings, on top of the hypothetical relaxations to the protocol exposing clients to more privacy leaks.
Fan-in & Fan-out
On the other hand, if Alice and Bob both initiate a payment to Carol and she merges the two original PSBTs, she can't sign Alice's or Bob's inputs and therefore can't respond to either with a payjoin PSBT that is only missing signatures from one or the other. Relaxing the specification would not suffice, since an additional round of communication would be required.
The same basic argument precludes Alice initiating two concurrent payjoins to Bob and Carol and merging those transactions. In this case Alice could construct a payjoin with Bob, paying Carol, and send the final output as an original PSBT to Carol, but if Carol wishes to modify it then there is no way to ask Bob to sign again.
Towards a multiparty protocol
A (brittle) multiparty protocol can be built on top of a slightly modified directory.
Multiparty sessions can be created if either the receiver or the sender in a payjoin shares the ephemeral private key with their direct counterparties. In the receiver's case this could be in response to an original PSBT, in the BIP 21 URI, using BIP 353 etc.
All parties who learn the key could then read from and write to the subdirectory, as well as invite others. This implies that such multiparty transactions only happen between economically proximate entities, which limits the quality/robustness of privacy, but is still an improvement.
To avoid introducing a concept of appending to subdirectories, after a subdirectory has been written to, its ID and the written value could be hashed to derive a new subdirectory key, but it would be more efficient to allow appending values.
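The hash-chaining idea can be sketched as follows. This is a minimal illustration that treats subdirectory IDs as opaque bytes; the function names, the use of SHA-256, and the length-prefixed encoding are assumptions, and how a derived value would map onto the directory's actual key scheme is left open.

```python
import hashlib

def next_subdirectory_id(current_id: bytes, written_value: bytes) -> bytes:
    """Derive the next subdirectory ID from the current ID and the value
    written to it. The ID is length-prefixed so that different
    (id, value) pairs never produce the same hash input."""
    h = hashlib.sha256()
    h.update(len(current_id).to_bytes(4, "big"))
    h.update(current_id)
    h.update(written_value)
    return h.digest()

def walk_chain(root_id: bytes, messages):
    """Return the sequence of subdirectory IDs visited when each message
    is written to the subdirectory derived from the previous write."""
    ids = [root_id]
    for message in messages:
        ids.append(next_subdirectory_id(ids[-1], message))
    return ids

# Hypothetical session: two parties each post one message.
ids = walk_chain(b"session-root", [b"alice: txin", b"bob: txout"])
for subdirectory_id in ids[1:]:
    print(subdirectory_id.hex())
```

Any reader who knows the root ID and has seen every message so far can compute where to look next, which is what replaces an explicit append operation. The cost is one subdirectory per message, which is the inefficiency the text notes.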
Using such a subdirectory, parties could then add unsigned TxIns and TxOuts in any order. Once the transaction is finalized, signatures can be added. By phasing TxIns and TxOuts, silent payments can be supported at the cost of an additional round of communication.
A multiparty protocol presents more possibilities for DoS, and liveness is not guaranteed: anyone who learns the session key could add superfluous TxOuts without even contributing inputs. This can be addressed by using BIP 322 ownership proofs to authenticate inputs (though not as strongly as BIP 77's original PSBTs), and homomorphic value commitments to enforce output integrity. Note that this is independent of DoS protection for the directory itself, which could be addressed by rate limiting using anonymous credentials.
Safety can be enforced by each participant only signing with all their inputs if all their requested outputs are present. To ensure duplicate TxOuts are properly accounted for and not accidentally spent as fees (e.g. when two users donate the same amount to the same static address), the message transcript hash can be the random seed in a shuffling of the outputs. A random permutation of 25 outputs has 83 bits of entropy, so even smaller transactions would provide reasonable assurances that all parties agree on the set of messages. Note that the transcript must either consist of the encrypted payloads, or the cleartext payloads must contain idempotence IDs and the shuffling must commit to those.
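The shuffling idea and the entropy figure can be sketched as follows. This is only an illustration under stated assumptions: the transcript encoding (sorted, de-duplicated, length-prefixed messages hashed with SHA-256), the `(script, amount)` output tuples, and the use of Python's `random.Random` as the seeded shuffle are all hypothetical choices. A real protocol would need to specify the shuffle algorithm exactly so that every implementation derives the same order.

```python
import hashlib
import math
import random

def transcript_hash(messages):
    """Hash the *set* of messages: duplicates (idempotent re-sends) and
    arrival order do not change the result."""
    h = hashlib.sha256()
    for message in sorted(set(messages)):
        h.update(len(message).to_bytes(4, "big"))
        h.update(message)
    return h.digest()

def shuffled_outputs(outputs, messages):
    """Order outputs by a shuffle seeded with the transcript hash.
    Outputs are sorted first so all parties start from the same list;
    duplicate outputs are kept, not merged."""
    seed = int.from_bytes(transcript_hash(messages), "big")
    ordered = sorted(outputs)
    random.Random(seed).shuffle(ordered)
    return ordered

def permutation_entropy_bits(n):
    """log2(n!), the entropy of a uniformly random permutation of n items."""
    return math.log2(math.factorial(n))

# Two users donate the same amount to the same static address.
outputs = [("0014dd", 10_000), ("0014dd", 10_000), ("0014ee", 55_000)]
messages = [b"id1:add-output", b"id2:add-output", b"id3:add-output"]

print(shuffled_outputs(outputs, messages))
print(round(permutation_entropy_bits(25), 2))
```

Parties who saw different message sets will, with high probability, derive different output orders and therefore different transactions, so their signatures cannot be combined.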
Parties will need to agree a priori on a reasonable (minimum) feerate, values for nLockTime and version, any constraints on nSequence, and any restriction of allowed input types (e.g. to ensure a stable txid, legacy inputs might be disallowed). Given that duplicate outputs can be detected, whether or not they should be merged can also be specified. The specific parameters governing such rules should be committed to by the session key.
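One way to make such parameters committable is to serialize them canonically and hash the result. The sketch below is hypothetical throughout: the field names, the JSON encoding, and the idea of folding the digest into session key derivation are assumptions, since the text does not specify how the session key commits to the parameters.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class SessionParams:
    """Global transaction parameters agreed on before the session starts."""
    min_feerate_sat_per_kvb: int
    nlocktime: int
    tx_version: int
    max_nsequence: int
    allowed_input_types: tuple
    merge_duplicate_outputs: bool

def params_commitment(params: SessionParams) -> bytes:
    """SHA-256 over a canonical JSON encoding (sorted keys, no whitespace),
    so every party derives the same digest from the same parameters."""
    encoded = json.dumps(asdict(params), sort_keys=True,
                         separators=(",", ":")).encode()
    return hashlib.sha256(encoded).digest()

params = SessionParams(
    min_feerate_sat_per_kvb=2_000,
    nlocktime=0,
    tx_version=2,
    max_nsequence=0xFFFFFFFD,
    allowed_input_types=("p2wpkh", "p2tr"),  # legacy inputs disallowed
    merge_duplicate_outputs=False,
)
print(params_commitment(params).hex())
```

A party that was handed different parameters would compute a different digest and, if the digest feeds into the session key, would end up in a different session rather than silently disagreeing about the rules.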