buidl-bitcoin / buidl-python

python3 bitcoin library with no dependencies and extensive test coverage
https://pypi.org/project/buidl/
MIT License

tap_script.raw_serialize() not serializing data correctly #159

Closed katsucodes247 closed 6 months ago

katsucodes247 commented 6 months ago

Hi, I've been trying to create a taproot script-path spend transaction but have hit a wall. I can't say with certainty whether the issue is in the library or in how I use it.

from buidl.taproot import TapLeaf, TapBranch, TapScript
from buidl.script import Script

# script: 2 OP_ADD OP_4 OP_EQUAL
# script = TapScript([2, 147, 4, 135])
script = Script([2, 147, 4, 135])
leaf = TapLeaf(script)
>>> print(leaf)
OP_[2] OP_ADD OP_[4] OP_EQUAL

Then I serialize the tapleaf in order to include it in tx_in.witness.items, but I get unexpected serialized data, which also fails to decode correctly with bitcoin-cli's decodescript.

>>> leaf.tap_script.raw_serialize()
b'\x02\x93\x04\x87'
>>> leaf.tap_script.raw_serialize().hex()
'02930487'

When I decode 02930487 using bitcoin-cli I don't get the expected result (which is OP_2 OP_ADD OP_4 OP_EQUAL):

$ ./bitcoin-cli decodescript 02930487
{
  "asm": "1171 OP_EQUAL",
  "desc": "raw(02930487)#gr8elmet",
  "type": "nonstandard",
  "p2sh": "2N5EYA8QZ6RQoa5Ethst2x9hMK4Vpr5AjYV",
  "segwit": {
    "asm": "0 8a75c8eaf8c3f312b430ae4d34f030cbe9d3788b2ff6b1d55968e2affe051be4",
    "desc": "addr(bcrt1q3f6u36hcc0e39dps4exnfupse05ax7yt9lmtr42edr32lls9r0jqencf26)#y974g9ej",
    "hex": "00208a75c8eaf8c3f312b430ae4d34f030cbe9d3788b2ff6b1d55968e2affe051be4",
    "address": "bcrt1q3f6u36hcc0e39dps4exnfupse05ax7yt9lmtr42edr32lls9r0jqencf26",
    "type": "witness_v0_scripthash",
    "p2sh-segwit": "2N8kDTb6sHti3vEHYtG9nDYuVRhWHmXcqx6"
  }
}
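To see why the bytes 02930487 come out as "1171 OP_EQUAL", it helps to walk them by hand: 0x02 is a data push of the next two bytes, 0x93 0x04 read little-endian is 0x0493 = 1171, and 0x87 is OP_EQUAL. The following is a minimal sketch of that walk in plain Python. The `disassemble` helper and its opcode table are illustrative assumptions, not part of buidl or bitcoin-cli, and it does not reproduce bitcoin-cli's exact asm formatting or sign handling.

```python
# Illustrative sketch only: a tiny Script disassembler covering the opcodes in this issue.
OPCODE_NAMES = {0x93: "OP_ADD", 0x87: "OP_EQUAL"}
# OP_1..OP_16 occupy 0x51..0x60
OPCODE_NAMES.update({0x50 + n: f"OP_{n}" for n in range(1, 17)})


def disassemble(raw: bytes) -> str:
    """Render raw script bytes as a space-separated string (hypothetical helper)."""
    parts = []
    i = 0
    while i < len(raw):
        op = raw[i]
        i += 1
        if 0x01 <= op <= 0x4B:
            # Bytes 0x01..0x4b mean "push the next `op` bytes as data"
            data = raw[i:i + op]
            i += op
            # Simplification: show the pushed data as an unsigned little-endian number
            parts.append(str(int.from_bytes(data, "little")))
        else:
            parts.append(OPCODE_NAMES.get(op, f"OP_UNKNOWN_{op:#04x}"))
    return " ".join(parts)


print(disassemble(bytes.fromhex("02930487")))  # 1171 OP_EQUAL
print(disassemble(bytes.fromhex("52935487")))  # OP_2 OP_ADD OP_4 OP_EQUAL
```

Under this reading, the leading 0x02 swallows the OP_ADD byte and the 0x04 byte as push data, which matches the decodescript output above.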

I've tried to work out whether the problem is in the way I use the library, but haven't been successful so far. Any pointers would be greatly appreciated.

katsucodes247 commented 6 months ago

Another potentially interesting observation: when I parse a hex string, the script parses it correctly, but when serializing, it produces a different hex string.

>>> WitnessScript.parse(BytesIO(bytes.fromhex("52935487")))
OP_ADD OP_4 OP_EQUAL 
>>> WitnessScript.parse(BytesIO(bytes.fromhex("52935487"))).is_witness_script()
False
>>> WitnessScript.parse(BytesIO(bytes.fromhex("52935487"))).raw_serialize().hex()
'935487'
>>> WitnessScript.parse(BytesIO(bytes.fromhex("52935487"))).serialize().hex()
'03935487'
>>> Script.parse(BytesIO(bytes.fromhex("52935487")))
OP_ADD OP_4 OP_EQUAL 
>>> Script.parse(BytesIO(bytes.fromhex("52935487"))).serialize().hex()
'03935487'
>>> Script.parse(BytesIO(bytes.fromhex("52935487"))).raw_serialize().hex()
'935487'
>>> TapScript.parse(BytesIO(bytes.fromhex("52935487")))
OP_ADD OP_4 OP_EQUAL 
>>> TapScript.parse(BytesIO(bytes.fromhex("52935487"))).serialize().hex()
'03935487'
>>> TapScript.parse(BytesIO(bytes.fromhex("52935487"))).raw_serialize().hex()
'935487'
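The outputs above are consistent with serialize() being raw_serialize() preceded by a length prefix: '935487' is three bytes, and '03935487' is the same three bytes with 0x03 in front. Here is a minimal sketch of that relationship using Bitcoin's CompactSize encoding. The function names are hypothetical and this is not buidl's code, only an illustration of the pattern visible in the session.

```python
def encode_compact_size(n: int) -> bytes:
    """Encode a non-negative length as a Bitcoin CompactSize integer."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


def with_length_prefix(raw: bytes) -> bytes:
    """Prepend the CompactSize length to the raw script bytes (hypothetical helper)."""
    return encode_compact_size(len(raw)) + raw


raw = bytes.fromhex("935487")
print(with_length_prefix(raw).hex())  # 03935487
```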

katsucodes247 commented 6 months ago

I've found an issue with the OP_1 to OP_16 range.

| word          | int   | hex       |
|---------------|-------|-----------|
| OP_1NEGATE    | 79    | 0x4f      |
| OP_1, OP_TRUE | 81    | 0x51      |
| OP_2-OP_16    | 82-96 | 0x52-0x60 |
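The table above can be sketched as a small mapping from an integer to its dedicated opcode byte. The helper name `small_int_opcode` is an assumption made for illustration and is not a buidl function. It covers only the rows listed in the table.

```python
def small_int_opcode(n: int) -> int:
    """Return the opcode byte for OP_1NEGATE or OP_1..OP_16 (hypothetical helper)."""
    if n == -1:
        return 0x4F  # OP_1NEGATE
    if 1 <= n <= 16:
        return 0x50 + n  # OP_1 is 0x51, OP_16 is 0x60
    raise ValueError(f"no single-byte small-integer opcode for {n}")


print(small_int_opcode(2))       # 82
print(hex(small_int_opcode(4)))  # 0x54
```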

And this is how buidl does it:

>>> from buidl.op import *
>>> encode_num(2)
b'\x02'
>>> encode_num(2).hex()
'02'
>>> from buidl.op import *
>>> stack = []
>>> op_2(stack)
True
>>> stack
[b'\x02']

katsucodes247 commented 6 months ago

Closing this because I shouldn't have passed in 2 and 4 as ints. I was confused because script = Script([2, 147, 4, 135]) evaluated to OP_[2] OP_ADD OP_[4] OP_EQUAL and I thought that 2 had been converted to OP_2, but it turns out OP_2 !== OP_[2]: the int 2 is serialized as the byte 0x02, whereas OP_2 is 0x52 (82).

>>> Script.parse_hex('52935487')
OP_2 OP_ADD OP_4 OP_EQUAL 
>>> Script.parse_hex('52935487').commands
[82, 147, 84, 135]
>>> Script([2, 147, 4, 135])
OP_[2] OP_ADD OP_[4] OP_EQUAL
>>> Script([2, 147, 4, 135]).commands
[2, 147, 4, 135]
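
The difference between the two command lists can be checked without the library at all. Since every entry here is a single-byte opcode, a rough stand-in for raw serialization is to pack the ints directly into bytes. This is a simplification for opcode-only lists, not buidl's serializer, and `opcodes_to_hex` is a hypothetical helper.

```python
def opcodes_to_hex(commands: list[int]) -> str:
    """Pack a list of single-byte opcodes into a hex string (opcode-only lists)."""
    return bytes(commands).hex()


mistaken = [2, 147, 4, 135]    # 2 and 4 passed as plain ints
intended = [82, 147, 84, 135]  # OP_2 OP_ADD OP_4 OP_EQUAL

print(opcodes_to_hex(mistaken))  # 02930487
print(opcodes_to_hex(intended))  # 52935487
```

The first hex string is the one that decodescript rendered as "1171 OP_EQUAL"; the second is the script that was actually intended.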