jimmysong / programmingbitcoin

Repository for the book

Ch. 13 - Unnecessary complexity in parsing and serializing witness data in class Tx #288

Open salmonberry7 opened 6 days ago

In the Tx class method parse_segwit, which follows the BIP141 spec, the code:

for tx_in in inputs:
    num_items = read_varint(s)
    items = []
    for _ in range(num_items):
        item_len = read_varint(s)
        if item_len == 0:
            items.append(0)
        else:
            items.append(s.read(item_len))
    tx_in.witness = items

can be simplified to:

for tx_in in inputs:
    # read the witness field for input tx_in
    num_items = read_varint(s)
    items = []
    # each element in 'items' will be a byte sequence, possibly of length zero
    for _ in range(num_items):
        items.append(s.read(read_varint(s)))
    # 'items' itself may have length zero
    tx_in.witness = items

so that the witness is simply a list of byte sequences. There is no need for some elements of this list to be the integer zero; an empty byte sequence (as returned by s.read(0)) can be used instead.
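To illustrate the parsing side, here is a minimal, self-contained sketch. The read_varint below is my own stand-in written to the Bitcoin varint format, not necessarily identical to the book's helper, and parse_witness is a hypothetical name for the inner loop body:

```python
from io import BytesIO


def read_varint(s):
    """Read a Bitcoin variable-length integer from a stream (stand-in helper)."""
    i = s.read(1)[0]
    if i == 0xfd:
        # 0xfd prefix: the next 2 bytes hold the value
        return int.from_bytes(s.read(2), 'little')
    elif i == 0xfe:
        # 0xfe prefix: the next 4 bytes hold the value
        return int.from_bytes(s.read(4), 'little')
    elif i == 0xff:
        # 0xff prefix: the next 8 bytes hold the value
        return int.from_bytes(s.read(8), 'little')
    # anything below 0xfd is the value itself
    return i


def parse_witness(s):
    """Parse one witness field (hypothetical helper) into a list of bytes objects."""
    num_items = read_varint(s)
    # every item is a byte sequence; a zero length simply yields b''
    return [s.read(read_varint(s)) for _ in range(num_items)]


# three items: an empty one, a 2-byte one and a 1-byte one
raw = bytes.fromhex('03' '00' '02abcd' '01ff')
witness = parse_witness(BytesIO(raw))
print(witness)  # [b'', b'\xab\xcd', b'\xff']
```

Every element is a bytes object, so downstream code does not need to special-case an integer 0.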

The Tx class method serialize_segwit, which also follows the BIP141 spec, could then be simplified correspondingly from:

for tx_in in self.tx_ins:
    result += int_to_little_endian(len(tx_in.witness), 1)
    for item in tx_in.witness:
        if type(item) == int:
            result += int_to_little_endian(item, 1)
        else:
            result += encode_varint(len(item)) + item

to:

for tx_in in self.tx_ins:
    result += encode_varint(len(tx_in.witness))
    for item in tx_in.witness:
        result += encode_varint(len(item)) + item

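It is not necessary for some items within the witness to be integers and others to be byte sequences. In addition, encode_varint should be used throughout rather than int_to_little_endian, in accordance with the BIP141 spec: a one-byte little-endian encoding coincides with the varint encoding only for values below 0xfd (253), so it cannot represent larger item counts or item lengths.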
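As a rough check of the serializing side, here is a self-contained sketch. The encode_varint below is again my own stand-in written to the Bitcoin varint format, and serialize_witness is a hypothetical name for the inner loop body. It includes a 300-byte item, whose length does not fit in a single byte:

```python
def encode_varint(i):
    """Encode an integer as a Bitcoin variable-length integer (stand-in helper)."""
    if i < 0xfd:
        return bytes([i])
    elif i < 0x10000:
        return b'\xfd' + i.to_bytes(2, 'little')
    elif i < 0x100000000:
        return b'\xfe' + i.to_bytes(4, 'little')
    elif i < 0x10000000000000000:
        return b'\xff' + i.to_bytes(8, 'little')
    raise ValueError('integer too large: {}'.format(i))


def serialize_witness(witness):
    """Serialize one witness field (hypothetical helper) from a list of bytes objects."""
    result = encode_varint(len(witness))
    for item in witness:
        # an empty item serializes as the single byte 0x00
        result += encode_varint(len(item)) + item
    return result


witness = [b'', bytes.fromhex('abcd'), b'\x01' * 300]
serialized = serialize_witness(witness)
print(serialized[:8].hex())  # 030002abcdfd2c01
```

The empty item comes out as the single byte 00, the same output the integer-zero branch produced, and the 300-byte item gets the three-byte length prefix fd2c01, which a one-byte encoding could not express.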