sipa / bips

Bitcoin Improvement Proposals

Rewrite leaf versions rationale #185

Closed sipa closed 4 years ago

sipa commented 4 years ago

Instead of using seemingly arbitrary numbers (0xc0) for leaf versions, redefine the leaf version to be just the 5 relevant middle bits of the control block's first byte. Use xor rather than masking, so as not to rule out, even in theory, leaf versions that don't set the top 2 bits.

EDIT: after significant revisions by @ajtowns, it no longer actually changes the leaf version definition, instead just improving the rationale.
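
The difference between xor and masking in the original proposal can be illustrated with a minimal sketch. The exact bit layout here is an assumption read off the description above (top 2 bits, 5 middle bits, 1 low bit), and the function names `leaf_version_masked` and `leaf_version_xor` are hypothetical, not taken from the BIP text.

```python
def leaf_version_masked(c0: int) -> int:
    # Masking keeps only the 5 middle bits, so the top 2 bits are discarded:
    # bytes that differ only in those bits collapse to the same value.
    return (c0 & 0x3E) >> 1


def leaf_version_xor(c0: int) -> int:
    # Xor with 0xc0 flips the top 2 bits instead of discarding them.
    # If both were set, the result is in 0..31; otherwise it is 32 or more,
    # so such bytes stay distinguishable. The low bit is cleared first.
    return ((c0 & 0xFE) ^ 0xC0) >> 1


for c0 in (0xC0, 0xC2, 0x42):
    print(hex(c0), leaf_version_masked(c0), leaf_version_xor(c0))
# 0xc0 0 0
# 0xc2 1 1
# 0x42 1 65
```

Under these assumptions the two approaches agree whenever the top 2 bits are set, and only xor leaves room for values that do not set them.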

ajtowns commented 4 years ago

Maybe it would be better to separate the "0xc0" value and the "0" value and call them different things, something like:

  • Let i = c[0] & 0xfe and call it the script rules identifier.

How is the script rules identifier chosen? We can support up to 128 different sets of script rules with the 7-bit script rules identifier, so we associate leaf version v (in the range 0 to 127) with script rules identifier i = ((v*2) + 192) % 256. We use this offset because some types of static analysis may benefit from the ability to analyse script spends without access to the output being spent. We can achieve that by looking at the first byte of the last witness element after removing the annex if present -- a value of 0x02 or 0x03 implies a P2WPKH or P2WSH spend, a value matching one of the valid P2WSH opcodes implies a P2WSH spend, and a value in the range 0xc0 to 0xff implies spending via a leaf version between 0 and 31. With this arrangement it is also possible to distinguish leaf versions 83 (VERNOTIF,ELSE), 95 (CAT,SUBSTR), 96 (LEFT,RIGHT), 98 (AND,OR), 107 (DIV,MOD), 108 (LSHIFT,RSHIFT), and 125, 126, and 127 (undefined) from P2WSH and P2WPKH spends. Use of other leaf versions potentially conflicts with valid P2WSH spends, so they should be avoided if possible. In addition, leaf version 72, whose script rules identifier is 0x50, would make detecting the presence of an annex ambiguous and should not be used.

That way the spec can talk about the "script rules identifier" everywhere, with "leaf version" only appearing in rationale sections, which might make sense. A patch that makes the other related changes from leaf version to script rules identifier is up for consideration at https://github.com/ajtowns/bips/commits/202001-scriptruleid (I probably went a bit overboard in the explanation, so feel free to aggressively rewrite if you like).
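
The mapping proposed above, i = c[0] & 0xfe together with i = ((v*2) + 192) % 256, can be checked with a short sketch. The helper names (`script_rules_id`, `id_from_leaf_version`, `leaf_version_from_id`) are hypothetical and do not come from the linked patch; only the two formulas are from the comment.

```python
def script_rules_id(c0: int) -> int:
    # i = c[0] & 0xfe: clear the lowest bit of the control block's first byte.
    return c0 & 0xFE


def id_from_leaf_version(v: int) -> int:
    # i = ((v*2) + 192) % 256 for leaf version v in 0..127.
    if not 0 <= v <= 127:
        raise ValueError("leaf version must be in the range 0..127")
    return (v * 2 + 192) % 256


def leaf_version_from_id(i: int) -> int:
    # Inverse of the mapping above; i is assumed to be even.
    return ((i - 192) % 256) // 2


# Leaf versions singled out in the proposed rationale.
for v in (0, 31, 72, 83, 95, 96, 98, 107, 108, 125, 126, 127):
    print(v, hex(id_from_leaf_version(v)))
# 0 0xc0
# 31 0xfe
# 72 0x50
# 83 0x66
# 95 0x7e
# 96 0x80
# 98 0x84
# 107 0x96
# 108 0x98
# 125 0xba
# 126 0xbc
# 127 0xbe
```

Leaf versions 0 to 31 land in 0xc0 to 0xfe, and leaf version 72 lands on 0x50, the value the comment flags as ambiguous with the annex.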

sipa commented 4 years ago

@ajtowns Updated to your branch, and changed PR summary/description.