ZuInnoTe / hadoopcryptoledger

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Apache License 2.0

Bitcoin: Support rev*.dat files #83

Open jornfranke opened 3 years ago

jornfranke commented 3 years ago

At the moment, we process only blk*.dat files, which contain all the necessary information. Additionally, we could process rev*.dat files (see https://bitcoin.stackexchange.com/questions/57978/file-format-rev-dat for what they are about). From that page:

- 4 bytes: network magic (0xf9,0xbe,0xb4,0xd9)
- 4 bytes: size of the CBlockUndo record (LE32)
- data_size bytes: CBlockUndo record
- 32 bytes: double-SHA256 of the serialized CBlockUndo record
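The framing above could be read roughly as follows. This is only a sketch: `read_undo_records` and `MAINNET_MAGIC` are names made up for illustration, the reader is assumed to stop at the first header that does not carry the magic bytes (end of file or padding), and the checksum is returned unverified because the exact hash input should be confirmed against the Bitcoin Core source first.

```python
import io
import struct

# Network magic as quoted above (mainnet).
MAINNET_MAGIC = bytes([0xF9, 0xBE, 0xB4, 0xD9])


def read_undo_records(stream, magic=MAINNET_MAGIC):
    """Yield (payload, checksum) for each framed record in a rev*.dat stream."""
    while True:
        header = stream.read(8)
        # Assumption: anything that is not a full header with the expected
        # magic marks the end of the usable data.
        if len(header) < 8 or header[:4] != magic:
            return
        (size,) = struct.unpack("<I", header[4:8])  # LE32 record size
        payload = stream.read(size)                 # serialized CBlockUndo
        checksum = stream.read(32)                  # 32-byte hash, not verified here
        if len(payload) != size or len(checksum) != 32:
            raise ValueError("truncated undo record")
        yield payload, checksum


# Synthetic two-record stream, only to exercise the framing logic.
sample = b"".join(
    MAINNET_MAGIC + struct.pack("<I", len(p)) + p + bytes(32)
    for p in (b"\x00", b"abc")
)
records = list(read_undo_records(io.BytesIO(sample)))
```

For a real file, the stream would be something like `open(path, "rb")`, and each payload would then be handed to a CBlockUndo parser.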

A CBlockUndo record consists of a serialized vector of CTxUndo records, one for each transaction in the block excluding the coinbase transaction. Vector serialization first writes a CompactSize-encoded length of the number of records (the transaction count - 1, in this case), and then serializes all the records themselves sequentially.
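A minimal sketch of decoding the CompactSize length prefix that starts each serialized vector. The function name and the `(value, new_offset)` return convention are illustrative choices, not part of any existing API; the encoding itself (one byte below 0xFD, otherwise a marker byte followed by a 2, 4 or 8 byte little-endian integer) is the standard Bitcoin CompactSize.

```python
import struct


def read_compact_size(data, offset=0):
    """Decode a CompactSize at data[offset]; return (value, new_offset)."""
    first = data[offset]
    if first < 0xFD:
        # Values below 0xFD are stored directly in a single byte.
        return first, offset + 1
    # Marker byte selects the width of the little-endian integer that follows.
    fmt, width = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[first]
    end = offset + 1 + width
    if end > len(data):
        raise ValueError("truncated CompactSize")
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, end
```

For a CBlockUndo payload, the first call would give the number of CTxUndo records (transaction count - 1) and the offset at which the first record starts.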

A CTxUndo record consists of a serialized vector of CTxInUndo records, one for each input in the transaction.

A CTxInUndo record consists of:

- varint: 2*height (+1 if it was a coinbase output): the height of the block that created the spent UTXO
- varint: creating transaction's version [only when height > 0]
- CompressedScript: spent UTXO's scriptPubKey
- CompressedAmount: spent UTXO's nValue
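The first field of a CTxInUndo could be decoded along these lines. Note that this VarInt is not the CompactSize used for vector lengths: it is Bitcoin Core's base-128 encoding, most significant group first, where every continuation byte adds one to the accumulated value. The function names are hypothetical, and the sketch covers only the height/coinbase code, not the CompressedScript that follows.

```python
def read_varint(data, offset=0):
    """Decode a Bitcoin Core VarInt at data[offset]; return (value, new_offset)."""
    n = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated VarInt")
        byte = data[offset]
        offset += 1
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:
            n += 1  # continuation bit set: add one, then read the next byte
        else:
            return n, offset


def split_height_code(code):
    """Split the 2*height (+1 if coinbase) code into (height, is_coinbase)."""
    return code >> 1, bool(code & 1)
```

A parser would call `read_varint` once, pass the result to `split_height_code`, and read the second varint (the transaction version) only when the resulting height is greater than zero.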

Until Bitcoin Core 0.14.x, the height is zero for all but the last output of a given transaction being spent. In Bitcoin Core 0.15 (to be released soon), it will be present for every spend.

For more information about the detailed encodings, see the comments in the Bitcoin Core source code: CompactSize (https://github.com/bitcoin/bitcoin/blob/v0.14.2/src/serialize.h#L202), VarInt (https://github.com/bitcoin/bitcoin/blob/v0.14.2/src/serialize.h#L277), CompressedScript (https://github.com/bitcoin/bitcoin/blob/v0.14.2/src/compressor.h#L17), and CompressedAmount (https://github.com/bitcoin/bitcoin/blob/v0.14.2/src/compressor.cpp#L133).
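As an illustration of the CompressedAmount encoding referenced above, here is a Python transcription of the decompression logic as I understand it from compressor.cpp. Treat it as a sketch to be checked against the linked source; `decompress_amount` is an illustrative name, and the input is assumed to have been read with the VarInt encoding beforehand.

```python
def decompress_amount(x):
    """Turn a compressed amount back into a value in satoshis."""
    if x == 0:
        return 0
    x -= 1
    # The lowest decimal digit holds the exponent e (number of trailing zeros,
    # capped at 9).
    e = x % 10
    x //= 10
    if e < 9:
        # The last non-zero digit d (1..9) is stored base 9, the rest follows.
        d = (x % 9) + 1
        x //= 9
        n = x * 10 + d
    else:
        n = x + 1
    # Re-attach the trailing zeros.
    return n * 10 ** e
```

With this scheme, round amounts such as 1 BTC (100000000 satoshis) compress to a single-digit code, which is why the undo data stays small.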