1200wd / bitcoinlib

Bitcoin and other Cryptocurrencies Library for Python. Includes a fully functional wallet, Mnemonic key generation and management and connection with various service providers to receive and send blockchain and transaction information.
http://bitcoinlib.readthedocs.io/
GNU General Public License v3.0

Error in transaction parsing: Incorrect pubkeyhash length #369

Open pputnik opened 7 months ago

pputnik commented 7 months ago

Way to reproduce:


from bitcoinlib.services.bitcoind import BitcoindClient
bdc = BitcoindClient.from_config('/user/bitcoin.conf', strict=False)

txid = "2b3733a39b87cef1f7c2ca8988583ec0c1a886651b040e3dbaf4c26be81d0f75"
t = bdc.gettransaction(txid)
print(t.as_json())

The error:

Traceback (most recent call last):
  File "/user/test.py", line 11, in <module>
    print(t.as_json())
          ^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/transactions.py", line 1791, in as_json
    adict = self.as_dict()
            ^^^^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/transactions.py", line 1758, in as_dict
    outputs.append(o.as_dict())
                   ^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/transactions.py", line 1351, in as_dict
    'address': self.address,
               ^^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/transactions.py", line 1254, in address
    address_obj = self.address_obj
                  ^^^^^^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/transactions.py", line 1245, in address_obj
    self._address_obj = Address(hashed_data=self.public_hash, script_type=self.script_type,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/keys.py", line 626, in __init__
    self.address = pubkeyhash_to_addr(self.hash_bytes, prefix=self.prefix, encoding=self.encoding,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/encoding.py", line 650, in pubkeyhash_to_addr
    return pubkeyhash_to_addr_bech32(pubkeyhash, prefix, witver)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/user/.pyenv/versions/3.11.4/lib/python3.11/site-packages/bitcoinlib/encoding.py", line 706, in pubkeyhash_to_addr_bech32
    raise EncodingError("Incorrect pubkeyhash length")
bitcoinlib.encoding.EncodingError: Incorrect pubkeyhash length

The only reason I can think of is that it's a block reward but, unlike what I've seen before, it has many recipients rather than one.

Could you help, please?

pputnik commented 7 months ago

P.S. This was with version 0.6.12. Version 0.6.14 raises "unexpected keyword argument 'strict'", even though that argument was needed because of https://github.com/1200wd/bitcoinlib/issues/338

Without the "strict" argument I get the same error as above.

arshbot commented 6 months ago

Confirmed, this is an inelegant way to fail for a nonstandard output. @pputnik this happens because output 195 in that tx appears to be custom and doesn't have an address in the traditional sense.

@mccwdev happy to create a PR to handle this properly. I suggest supporting script_type=nonstandard. This has the side effect of letting malformed p2tr, etc. strings pass through without raising errors; they would instead be misidentified as type unknown or nonstandard.

Happy to consider an alternative approach.

pputnik commented 6 months ago

@arshbot could you please explain why it "doesn't have an address in the traditional sense"? I see the block reward is passed to 196 addresses. Does the error appear because the block reward is passed not to one address but to many?

Thank you.

arshbot commented 6 months ago

@pputnik when you "send" bitcoin, you're really just putting a new lock (scriptpubkey) on a subset of the coins you own. We use the same kinds of locks all the time, and most of them use a pubkey in some form, fashion, or placement. These patterns of locking and pubkey storage often map to an address type (p2pkh, p2sh, p2tr, etc.).

But not always. Addresses are probably poorly named for what they actually are (but it's a great name for UX). They're basically part template, describing where to find the corresponding signature when the coins are spent, and part fingerprint, correlating to some sensitive signing material. Sometimes you aren't using a typical way of storing the pubkey. Sometimes you aren't using a typical way to correlate signing material.
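To illustrate the idea of lock patterns mapping to address types, here is a minimal sketch of template matching on a raw scriptpubkey. It is not bitcoinlib's code: `classify_script` is a hypothetical helper, it only knows a handful of well-known templates, and anything else falls through to "nonstandard".

```python
def classify_script(script: bytes) -> str:
    """Match a raw scriptpubkey against a few well-known templates.

    Hypothetical helper for illustration only; real libraries check more
    patterns and more edge cases than this.
    """
    n = len(script)
    # OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    if n == 25 and script[:3] == b"\x76\xa9\x14" and script[23:] == b"\x88\xac":
        return "p2pkh"
    # OP_HASH160 <20 bytes> OP_EQUAL
    if n == 23 and script[:2] == b"\xa9\x14" and script[22:] == b"\x87":
        return "p2sh"
    # OP_0 <20 bytes>
    if n == 22 and script[:2] == b"\x00\x14":
        return "p2wpkh"
    # OP_0 <32 bytes>
    if n == 34 and script[:2] == b"\x00\x20":
        return "p2wsh"
    # OP_1 <32 bytes>
    if n == 34 and script[:2] == b"\x51\x20":
        return "p2tr"
    # OP_RETURN ... (data carrier, has no address)
    if n >= 1 and script[0] == 0x6A:
        return "nulldata"
    return "nonstandard"


# A p2pkh lock around a dummy all-zero 20-byte hash.
p2pkh_script = bytes.fromhex("76a914" + "00" * 20 + "88ac")

# The output script from this issue: a bare 36-byte push and nothing else.
odd_script = bytes.fromhex(
    "24" "7aa02f855906258a82089dc02b13d600ce8636d2d34ec5804556859a98c7dbef02000100"
)

print(classify_script(p2pkh_script))  # p2pkh
print(classify_script(odd_script))    # nonstandard
```

Only the first five templates correspond to an address format; "nulldata" and "nonstandard" scripts have nothing that an address could be derived from.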

"I see the block reward is passed to 196 addresses. Does the error appear because the block reward is passed not to one address but to many?"

Not at all, that's irrelevant. It's because the output script

OP_PUSHBYTES_36 7aa02f855906258a82089dc02b13d600ce8636d2d34ec5804556859a98c7dbef02000100

does not match any specific pattern. I mean, look at this: it's not even a lock. I don't know what this bit of data is, but I think anyone could spend this UTXO. This is further confirmed by the input script in the corresponding spending tx

OP_PUSHBYTES_1 01

Because there's no real lock in the output script, all the input script has to do is push a 1 (i.e. true) so that the script evaluates as success and the UTXO is spendable. Honestly totally goofy; I don't know why someone did this. The value of the UTXO (and of most of the inputs in the spending tx) is also a flat 0, which you don't see often.
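The "no real lock" point can be sketched with a toy script evaluator. This is a deliberately simplified model, not a consensus-accurate interpreter: the hypothetical `run_pushes` only understands direct data pushes (opcodes 0x01 to 0x4b), which is all these two scripts contain, and the spend is treated as successful if the top stack item is truthy after both scripts have run.

```python
def run_pushes(script: bytes, stack: list) -> None:
    """Execute a script made only of direct pushes (opcodes 0x01..0x4b)."""
    i = 0
    while i < len(script):
        opcode = script[i]
        if not 1 <= opcode <= 75:
            raise ValueError(f"unsupported opcode 0x{opcode:02x} in this toy model")
        data = script[i + 1 : i + 1 + opcode]
        if len(data) != opcode:
            raise ValueError("truncated push")
        stack.append(data)
        i += 1 + opcode


def cast_to_bool(item: bytes) -> bool:
    """Script truthiness: any non-zero byte, except a trailing 0x80 (negative zero)."""
    for index, byte in enumerate(item):
        if byte != 0:
            if index == len(item) - 1 and byte == 0x80:
                return False
            return True
    return False


def spend_succeeds(script_sig: bytes, script_pubkey: bytes) -> bool:
    """Run the input script, then the output script, and check the top of the stack."""
    stack: list = []
    run_pushes(script_sig, stack)
    run_pushes(script_pubkey, stack)
    return bool(stack) and cast_to_bool(stack[-1])


# OP_PUSHBYTES_36 <data>: the output script from this issue.
script_pubkey = bytes.fromhex(
    "24" "7aa02f855906258a82089dc02b13d600ce8636d2d34ec5804556859a98c7dbef02000100"
)
# OP_PUSHBYTES_1 01: the input script of the spending tx.
script_sig = bytes.fromhex("0101")

print(spend_succeeds(script_sig, script_pubkey))
```

In this model the 36 data bytes end up on top of the stack and contain non-zero bytes, so the result is truthy no matter what the input script pushed. No signature is checked anywhere.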

pputnik commented 6 months ago

Thank you for the explanation.

Whilst the library author is away, could you please share your fix, either in the form of a PR or not?

Thank you very much.

arshbot commented 6 months ago

"could you please share your fix, either in the form of a PR or not?"

PR is probably best, but I won't go to the effort unless the maintainer ACKs the approach.

The problem is two-fold:

mccwdev commented 6 months ago

@arshbot, thanks for your clear explanation and for analysing the problem.

I noticed that if you disable caching, the transaction is parsed correctly (the incorrect output is ignored). But trying to access the address results in an exception.

The problem is indeed that the Output, Key and Address objects do not support unknown scripts, and they should probably ignore them if a strict=False option is set. If you have some ideas and would like to fix these issues with a PR, you are more than welcome.

The reason for the strict option, which defaults to True, is to avoid creation of incorrect transactions or output scripts that can result in loss of funds.
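Until unknown scripts are handled inside the library, a caller-side workaround could look like the sketch below. It is only an assumption about how one might guard the address lookup, not a fix in bitcoinlib itself: `safe_address` and `FakeOutput` are hypothetical names, and `FakeOutput` merely stands in for a parsed output whose `address` property raises.

```python
from typing import Optional, Tuple, Type


def safe_address(
    output,
    expected_errors: Tuple[Type[BaseException], ...] = (Exception,),
) -> Optional[str]:
    """Return output.address, or None if deriving the address raises.

    Hypothetical helper: pass a narrower exception tuple when the failure
    type is known, so unrelated bugs are not silently swallowed.
    """
    try:
        return output.address
    except expected_errors:
        return None


class FakeOutput:
    """Stand-in for a parsed transaction output. NOT a bitcoinlib class."""

    def __init__(self, address: Optional[str] = None, fail: bool = False):
        self._address = address
        self._fail = fail

    @property
    def address(self) -> Optional[str]:
        if self._fail:
            # Mimics the failure seen in the traceback above.
            raise ValueError("Incorrect pubkeyhash length")
        return self._address


outputs = [FakeOutput("example-address"), FakeOutput(fail=True)]
print([safe_address(o) for o in outputs])
```

With bitcoinlib one would presumably loop over the outputs of the parsed transaction and pass `expected_errors=(EncodingError,)`, using the `EncodingError` class from `bitcoinlib.encoding` that appears in the traceback. That usage is an assumption and has not been tested against any particular version.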

pputnik commented 6 months ago

@arshbot, please ^_^