blockchain-etl / bitcoin-etl

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
https://twitter.com/BlockchainETL
MIT License
410 stars 123 forks source link

About nonstandard output #43

Open SuperLSH opened 4 years ago

SuperLSH commented 4 years ago

@medvedev1088 Hi, I found that there ara two kinds of output in transactions.json. One is nonstandard, and the other is pubkeyhash. I thought the nonstandard output is the op_return output, but i found outputs of many (not all ) coinbase txs also are nonstandard. And address of miner is like“nonstandard3318537dfb3135df9f3d950dbdf8a7ae68dd7c7d”. Here is an example: transaction: { "hash": "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b", "block_number": 0, "block_hash": "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f", "is_coinbase": true, "outputs": [{"index": 0, "script_asm": "04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f OP_CHECKSIG", "script_hex": "4104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac", "required_signatures": null, "type": "nonstandard", "addresses": ["nonstandard3318537dfb3135df9f3d950dbdf8a7ae68dd7c7d"], "value": 5000000000}], "input_count": 0, "output_count": 1, "input_value": 0, "output_value": 5000000000, "fee": 0} The address above is 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa according to [https://www.blockchain.com/btc/tx/4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b]

I'm confused. Perhaps you know why? Thank you very much!!!!

medvedev1088 commented 4 years ago

That's interesting. What version of bitcoin-cli are you using? bitcoin-etl uses the types and addresses returned by the JSON RPC api so I'd check what the api returns for this tx.

SuperLSH commented 4 years ago

$ bitcoin-cli -version Bitcoin Core RPC client version v0.19.99.0-41fa292

SuperLSH commented 4 years ago

[sihao@localhost ~]$ bitcoin-cli getrawtransaction 792c4a613e51966fc1c852cd8d94fd10a05efa55fffdd6a41281c8d139540b23 1 { "txid": "792c4a613e51966fc1c852cd8d94fd10a05efa55fffdd6a41281c8d139540b23", "hash": "792c4a613e51966fc1c852cd8d94fd10a05efa55fffdd6a41281c8d139540b23", "version": 1, "size": 134, "vsize": 134, "weight": 536, "locktime": 0, "vin": [ { "coinbase": "0415112a1c010f", "sequence": 4294967295 } ], "vout": [ { "value": 50.00000000, "n": 0, "scriptPubKey": { "asm": "042f7edc4ff409281bdbd3cdca6ae365e10d0473b527025473fafccb1bd87f43855ddcde9b662fae2ca306b694a74923ae7055e8e60d10e8af68c705e923f86372 OP_CHECKSIG", "hex": "41042f7edc4ff409281bdbd3cdca6ae365e10d0473b527025473fafccb1bd87f43855ddcde9b662fae2ca306b694a74923ae7055e8e60d10e8af68c705e923f86372ac", "type": "pubkey" } } ], "hex": "01000000010000000000000000000000000000000000000000000000000000000000000000ffffffff070415112a1c010fffffffff0100f2052a010000004341042f7edc4ff409281bdbd3cdca6ae365e10d0473b527025473fafccb1bd87f43855ddcde9b662fae2ca306b694a74923ae7055e8e60d10e8af68c705e923f86372ac00000000", "blockhash": "00000000237ce1b692f5ef9f1f9ac64d62d5e6a24dcfec08633fa2d339434e2a", "confirmations": 578264, "time": 1270917916, "blocktime": 1270917916 } It turns out that scripts of the "pubkey" type are parsed to nonstandard. If I change the bitcoin-cli version to 0.19.0, according to your experience, will the result be correct?

@medvedev1088 I'm sorry to trouble you so much. Looking forward to your reply.

medvedev1088 commented 4 years ago

That's unlikely. We don't parse the scripts in bitcoin-etl, we only use the data that bitcoin-cli provides. I don't see the address 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa anywhere in the JSON output.

SuperLSH commented 4 years ago

Thank you!!!!!!! I see.

tokusyu commented 4 years ago

@medvedev1088 Do you have a plan parsing the script to extract bitcoin address from the public key? For me, both are fine, but what I only need is preserving "type" value even if "address" is not in response. I'm planning to support another coin that has multiple "type" values without "address" in their responses.

medvedev1088 commented 4 years ago

@tokusyu preserving the "type" field sounds like a good idea. Parsing scripts can also be done in this tool, if can find a good Python library or wrapper that already implemented script parsing.

jingli85 commented 3 years ago

@SuperLSH @medvedev1088 @allenday I looked into a similar case mentioned here.

Transaction Hash: 03453deedd0d3051a7752e2b04c58a907c5ff5cb59a7e1fbabba0bf479fa1b66. The first output of this transaction has the following script: 020e46e79a2a8d12b9b5d12c7a91adb4e454edfae43c0a0cb805427d2ac7613fd9 OP_CHECKSIG

Here are the result from several Bitcoin Blockchain Explorers:

Both Blockchain.com and BlockChair.com correctly parse the first output as P2PKH with address 1P3rU1Nk1pmc2BiWC8dEy9bZa1ZbMp5jfg and value 0.00098800 BTC.

However, the Google BigQuery as well as BTC.com are unable to parse it correctly. BTC.com shows “unable to decode output address”. Google BigQuery parse it to nonstandardf4acdc49b8af461468a5e65549db65f9b294c9fe. GBQ uses this bitcoin-etl tool to etl from bitcoin core.

I suspect that the reason why cli sets the address to nonstandard is that the script for this transaction does not conform to the P2PKH standard format. For P2PKH script, the spec should be

The script should start with OP_DUP OP_HASH160 and has OP_EQUALVERIFY before OP_CHECKSIG. For example, OP_DUP OP_HASH160 000012a6ddb2584c883aeab296d6a08419e0cc71 OP_EQUALVERIFY OP_CHECKSIG.

However, the script in this cases does not contain OP_DUP OP_HASH160 nor OP_EQUALVERIFY, i.e., 020e46e79a2a8d12b9b5d12c7a91adb4e454edfae43c0a0cb805427d2ac7613fd9 OP_CHECKSIG. The cli therefore does not know how to handle such script and set the address to nonstandard.

Any plan and timeline to enhance and solve this issue? Thanks.

neelriyer commented 3 years ago

Hey @medvedev1088 , can I get some help on this issue?

I'm thinking about using a library like btcpy to implement script parsing. Is this a good approach?

Here's my intial code. Let me know what you think.

medvedev1088 commented 3 years ago

Sorry quite busy nowadays.

Will try to look into it as soon as I have time.

neelriyer commented 3 years ago

@jingli85, @SuperLSH, @tokusyu Are you still working on this? What kind of progress did you make?

tokusyu commented 3 years ago

@spiyer99 unfortunately no progress. using a library like btcpy is interesting, we need to consider coins other than bitcoin that have their prefix code eg. https://bitcoin.stackexchange.com/questions/62781/litecoin-constants-and-prefixes.

neelriyer commented 3 years ago

@tokusyu, @medvedev1088 after looking at the script asm and script hex for the nonstandard addresses I’m seeing a lot of them are prefixed with OP_RETURN. According to this source (​​https://en.bitcoin.it/wiki/OP_RETURN) these would correspond to invalid transactions. Does that mean we can safely disregard all nonstandard addresses?

akhorshidi commented 2 years ago

I'm interested in implementing such feature. So, if it's desired, I can make a PR. I was thinking of something similar to this I've posted here.