Open whyrusleeping opened 7 years ago
also cc @kumavis
Once all the above is done, building ipfs with ethereum support would still require running a custom binary. I'm really wanting to add support for plugins to ipfs, and that would nicely solve the issue. You could just build this package as a plugin, put it in the ipfs/plugins directory, and bam! you have the ability to traverse ethereum dags
yeah i agree plugins would be awesome - look forward to that
I only did block, transaction, and transaction trie parsing. Working on support for state trie processing will be nice.
Currently the eth-ipfs bridge only serves blocks, transactions (no tries), and state and storage tries; it serves no tx receipts and no tx receipt tries. These are missing because parity either does not index them by hash (tx receipts) or does not store them at all (tx trie, tx receipt trie).
`gx` tools to update dependencies.
`ethapi`'s `BlockByHeader`. We need to work with `eth-block` as a block header. Added those notes.
`ethapi` to give us from `geth` all the data we need. One repo with the extracting CLI, and another one with a patch to `geth`, will be developed in parallel.
`eth-block` to be a block header `refactor/block-header`.
`feat/state-trie`.
`go-ipld-eth-dump-star`, which will be a CLI app where you provide a block hash, and the program recursively retrieves and adds to IPFS the block header, ommer list, state trie elements, etc., until all 8 IPLD types are covered. This should be the cornerstone of the ambitious project of having the whole eth blockchain on IPFS.
`go-ethereum`'s `ethapi` package, to retrieve whatever we need.
[0x90] eth-block
  ommers: eth-block-list
  txs: eth-tx-trie
  receipts: eth-tx-receipt-trie
  state: eth-state-trie
[0x91] eth-tx (local data only)
[0x92] eth-tx-receipt (local data only)
[0x93] eth-account-snapshot (local data only)
[0x94] eth-block-list (rlp array)
[0x95] eth-tx-trie (merkle trie)
[0x96] eth-tx-receipt-trie (merkle trie)
  leaves: links to eth-tx-receipt
[0x97] eth-state-trie (secure merkle trie)
  leaves: links to eth-account-snapshot
[0x98] eth-storage-trie (secure merkle trie)
  leaves: links to raw binary
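For quick reference, the listing above could be written down as a small lookup table. This is only a sketch: `codecNames` and `codecName` are hypothetical names, and the numbers are copied from this comment as-is. Other comments in this thread number some of the types differently, so check the multicodec table before relying on any of them.

```go
package main

import "fmt"

// codecNames mirrors the listing in the comment above (a snapshot of the
// proposal at that point in the thread, not an authoritative table).
var codecNames = map[uint64]string{
	0x90: "eth-block",
	0x91: "eth-tx",
	0x92: "eth-tx-receipt",
	0x93: "eth-account-snapshot",
	0x94: "eth-block-list",
	0x95: "eth-tx-trie",
	0x96: "eth-tx-receipt-trie",
	0x97: "eth-state-trie",
	0x98: "eth-storage-trie",
}

// codecName returns the name for a codec code, or a marker for unknown codes.
func codecName(code uint64) string {
	if name, ok := codecNames[code]; ok {
		return name
	}
	return fmt.Sprintf("unknown (0x%x)", code)
}

func main() {
	for code := uint64(0x90); code <= 0x98; code++ {
		fmt.Printf("0x%x %s\n", code, codecName(code))
	}
}
```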
Paging @Kubuxu @Stebalien as per @whyrusleeping advice.
And linking to our evil world domination plan repo https://github.com/MetaMask/eth-ipfs-browser-client/issues/1
Then in an init function, it needs to register itself in the decoders map.
FYI, we've stopped doing this. Instead, just call `Register(codec, decoder)` at some point before trying to decode an eth block (this makes it easier to register eth decoders from plugins).
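To illustrate the explicit-registration pattern, here is a minimal, self-contained sketch. The `Block`, `Node`, `DecodeFunc`, `Register`, and `Decode` definitions below are simplified stand-ins written for this example, not the real go-ipld-format types or signatures; only the idea of calling `Register(codec, decoder)` before decoding comes from the comment above.

```go
package main

import "fmt"

// Block and Node are simplified stand-ins (assumptions for this sketch).
type Block struct {
	Codec uint64
	Data  []byte
}

type Node struct {
	Kind string
	Size int
}

// DecodeFunc turns a raw block into a node.
type DecodeFunc func(Block) (Node, error)

// decoders is the registry, keyed by codec code.
var decoders = map[uint64]DecodeFunc{}

// Register associates a decoder with a codec. It must be called before
// any block of that codec is decoded, for example from a plugin's setup.
func Register(codec uint64, decoder DecodeFunc) {
	decoders[codec] = decoder
}

// Decode dispatches to whichever decoder was registered for the block's codec.
func Decode(b Block) (Node, error) {
	dec, ok := decoders[b.Codec]
	if !ok {
		return Node{}, fmt.Errorf("no decoder registered for codec 0x%x", b.Codec)
	}
	return dec(b)
}

// decodeEthBlock is a placeholder decoder; a real one would parse the RLP.
func decodeEthBlock(b Block) (Node, error) {
	return Node{Kind: "eth-block", Size: len(b.Data)}, nil
}

func main() {
	Register(0x90, decodeEthBlock) // explicit call, no init() side effects
	n, err := Decode(Block{Codec: 0x90, Data: []byte{0xc0}})
	fmt.Println(n, err)
}
```

Registering explicitly, rather than from an `init` function, lets the caller (or a plugin loader) decide when and whether the decoders are installed.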
@Stebalien It's already done by Why. See https://github.com/ipfs/go-ipld-eth/blob/master/plugin/eth.go#L36-L38
Ah, ok. Just wanted to make sure there wasn't out-of-date information floating around.
@Stebalien please, if you have some time, take a look at the PR that attempts to document each public function, so that `golint` stops nagging me 😉. Most of the comments are ~stolen~ borrowed from the go-ipld-format interface.
I am building from this first PR.
Moving forward with this. A big refactor was already done here in #5. Some time will be spent on working on an importer. We can use the material in the plugin's directory README to make a blog post in the future.
Got some interesting data on my first attempt to import
The import performance should improve with a truckload of cheap machines (or maybe research `amazon's lambda`s?) and a shared stack. Redis comes to mind.
herman
[07:15]
Finally. One answer to a question
[07:15]
2017/08/07 07:11:10 From the stack: 0xc7041743ad5152d8d13815ca6be379ff3b4c994069cc419867ab0d890d460b5f
2017/08/07 07:11:10 z45oqTS7yKVxeLJE8H1Q5o8nTusiARceKKt7hMkbED8PDeaCHQ2
2017/08/07 07:11:10 This is a leaf
2017/08/07 07:11:10 Node imported. Count = 12352
2017/08/07 07:11:10 From the stack: 0x84269463e5e9ecf08491d8745b98cec308498076c2cacbbe1c6e7adbe5d00438
2017/08/07 07:11:10 z45oqTS3UJgmLqbXEdANJGbbHKTHJcdZhvhkkrsoD6XL2A4dftb
2017/08/07 07:11:10 Adding 0xef6d2178835239b85ea68f9b3c2201ee49daf3744ebeb48901cc9374d9b97b9d (idx: b) to the stack
2017/08/07 07:11:10 Adding 0x06214d858b09063e9efe886d4f634348a7845a729807472bec1dbb26c40ac136 (idx: 5) to the stack
2017/08/07 07:11:10 Node imported. Count = 12353
2017/08/07 07:11:10 From the stack: 0x06214d858b09063e9efe886d4f634348a7845a729807472bec1dbb26c40ac136
2017/08/07 07:11:10 z45oqTRtzNeddu43X6Xvt8SBFmtVxukPrPZeBe4tiGNZYCeHf7K
2017/08/07 07:11:11 This is a leaf
2017/08/07 07:11:11 Node imported. Count = 12354
2017/08/07 07:11:11 From the stack: 0xef6d2178835239b85ea68f9b3c2201ee49daf3744ebeb48901cc9374d9b97b9d
2017/08/07 07:11:11 z45oqTSAh4htdRWX3DNXP1Ze2sQJ55UrukYNYoSMVitNXeY4P9n
2017/08/07 07:11:11 This is a leaf
2017/08/07 07:11:11 Node imported. Count = 12355
2017/08/07 07:11:11 From the stack: 0x1a202509db353cf86ea03dc0a9864a2c40af91e8bd28c1dc8ac56818824ed638
2017/08/07 07:11:11 z45oqTRvLRn9vW9u7VeEE7rqtrr4jz8ks5zzDSggNkF8BGgW9My
2017/08/07 07:11:11 This is a leaf
2017/08/07 07:11:11 Node imported. Count = 12356
Stack Empty. We are done here :D
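The log above follows a last-in, first-out traversal: pop a hash from the stack, fetch the node, push its children, count it as imported. A minimal sketch of that loop follows. The names `toyTrie`, `toyFetch`, and `importTrie` are hypothetical, the trie is made-up data, and the `seen` set is an addition for this example (the log does not show whether the original importer deduplicates).

```go
package main

import "fmt"

// toyTrie maps a node hash to the hashes of its children. It is made-up
// data standing in for whatever store serves the real trie nodes.
var toyTrie = map[string][]string{
	"root": {"a", "b"},
	"a":    {"c"},
	"b":    {"c"}, // "c" is shared, like a repeated trie node
	"c":    nil,   // leaf
}

// toyFetch looks a node up in toyTrie and returns its child hashes.
func toyFetch(hash string) ([]string, error) {
	children, ok := toyTrie[hash]
	if !ok {
		return nil, fmt.Errorf("node %q not found", hash)
	}
	return children, nil
}

// importTrie walks the trie from root using an explicit stack and returns
// how many distinct nodes it visited.
func importTrie(root string, fetch func(string) ([]string, error)) (int, error) {
	stack := []string{root}
	seen := map[string]bool{}
	count := 0
	for len(stack) > 0 {
		// Pop the most recently added hash.
		hash := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[hash] {
			continue // already imported via another parent
		}
		seen[hash] = true
		children, err := fetch(hash)
		if err != nil {
			return count, fmt.Errorf("fetching %s: %w", hash, err)
		}
		// A leaf has no children, so nothing is pushed for it.
		stack = append(stack, children...)
		count++
	}
	return count, nil
}

func main() {
	count, err := importTrie("root", toyFetch)
	if err != nil {
		fmt.Println("import failed:", err)
		return
	}
	fmt.Println("Stack empty. Nodes imported:", count)
}
```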
[07:15]
Genesis Block has `8,892` accounts (see
https://github.com/ethereum/pyethsaletool/blob/master/genesis_block.json)
[07:16]
And `12,356` state trie nodes (it took from `06:27:09` to `07:11:11` to traverse them all,
tunneling to the source: me in Chile, `mantis` in `Azure East 2`)
[07:17]
Etherscan says that the latest block (`#4127835`) has `5,270,884` accounts
https://etherscan.io/accounts
herman
[07:31]
So, a naive download at this rate to the latest block should take `435` hours ->
https://www.wolframalpha.com/input/?i=(06:27:09+to+07:11:11)+*+(5270327%2F8892)
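The same back-of-the-envelope estimate can be reproduced in a few lines. This sketch assumes, as the naive estimate does, that traversal time scales linearly with the number of accounts; `estimate` is a hypothetical helper, and the inputs are the timestamps and account counts from the WolframAlpha query above.

```go
package main

import (
	"fmt"
	"time"
)

// estimate scales the elapsed time between start and end (HH:MM:SS, same day)
// by the ratio targetAccounts/sampleAccounts.
func estimate(start, end string, sampleAccounts, targetAccounts float64) (time.Duration, error) {
	const layout = "15:04:05"
	t0, err := time.Parse(layout, start)
	if err != nil {
		return 0, fmt.Errorf("parsing start: %w", err)
	}
	t1, err := time.Parse(layout, end)
	if err != nil {
		return 0, fmt.Errorf("parsing end: %w", err)
	}
	elapsed := t1.Sub(t0) // 44m2s for the genesis state trie run
	return time.Duration(float64(elapsed) * targetAccounts / sampleAccounts), nil
}

func main() {
	d, err := estimate("06:27:09", "07:11:11", 8892, 5270327)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%.0f hours\n", d.Hours()) // prints "435 hours"
}
```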
[07:34]
Now.
1) We will do the retrieval from a machine local to the parity server.
2) As you get more blocks, the odds of "repeating" trie nodes increase
(that's the whole point of using a state trie).
3) We have to figure out a way to parallelize this process
(as stated above, several machines or lambdas, plus a common stack in redis, for example), or
4) We can do the initial job by plugging directly into an inactive levelDB for earlier blocks,
and then use the API for the latest blocks.
[07:35]
Anyways. We will figure out something, as always. At last we have numbers to start with!
There should also be a major perf improvement if you switch your IPFS node from flatfs to badger (still WIP). Right now, what you can do is:
- `--dht=none` option for the daemon for the initial add
- `NoSync` option in the config
OK. #7 is the second (and hopefully last) heavy overhaul. Now we can talk about organic growth, continuous improvements, and the like.
Current focus is making a fast and decent importer (https://github.com/hermanjunge/go-ipld-eth-import, to be given away to `ipfs` someday) for the `eth-state-trie` elements.
`[0x96]` - `eth-state-trie`. Support input for RLP encoded state trie elements.
`go-ipld-eth-import`.
`[0x97]` - `eth-account-snapshot`
`[0x95]` - `eth-tx-receipt`: `eth-tx-receipt-trie` (`[0x96]`) leaves, and the `eth-tx-receipt` objects.
The rest of the IPLD ETH Types:
`[0x91]` - `eth-block-list`
`[0x98]` - `eth-storage-trie`
This one PR moves the needle to the right.
We have a pretty decent doc to make a huge blog post on this! Pinging @whyrusleeping as you requested this.
`DevP2P` grid, taking the latest updates in the blockchain and importing them into the local IPFS node.
`[0x95]` - `eth-tx-receipt`: `eth-tx-receipt-trie` (`[0x96]`) leaves, and the `eth-tx-receipt` objects.
The rest of the IPLD ETH Types:
`[0x91]` - `eth-block-list`
`[0x98]` - `eth-storage-trie`
`<block-cid>/address/<eth-address>/balance` to be `<block-cid>/root/<keccak256(eth-address)>/balance`.
`<block-cid>/txs/<tx-id>/nonce` to be `<block-cid>/tx/<rlp(tx-id)>/nonce`
This is a write up on IPFS/notes I made the other day.
@dryajov
[ ] The rest of the IPLD ETH Types:
`[0x91]` - `eth-block-list`
`[0x98]` - `eth-storage-trie`
[ ] Fix `eth-account-snapshot` to reference:
`[0x98]` - Its storage root (an `eth-storage-trie`)
`[0x55]` - A raw keccak256 hash referencing the EVM Code.
[ ] `[0x95]` - `eth-tx-receipt`: `eth-tx-receipt-trie` (`[0x96]`) leaves, and the `eth-tx-receipt` objects.
~### EVM Code codec~
~The following PRs should be approved to include the `0x99` codec here. Please follow them closely, as they involve a practical discussion of whether it makes sense to add a new codec, or if we stick to `0x55` (raw data), as the EVM code has no structure.~
~ https://github.com/multiformats/multicodec/pull/61~ ~ https://github.com/ipfs/go-cid/pull/37~
@hermanjunge How did you fetch the state trie rlp data that's in the test data directory? I see that you mentioned using the parity ipfs api - how did you determine the cid to pass in? Did you use this tool? If so, did you generate the eth-state-trie cid from a block hash, a state root hash, or something different?
Did you use this tool?
That's correct, https://github.com/kumavis/eth-ipld-cli
If so, did you generate the eth-state-trie cid from a block hash, a state root hash, or something different?
The root of the state trie can be obtained from the block header. Successive trie hashes are obtained when you retrieve this first element from a database (e.g. the ipfs-parity API) and then continue traversing. To know the traversal path, you need to hash (keccak-256) the ethereum address. There is a section documenting an example of performing this operation manually with the ipfs client and the plugin in this repository. You can even find code to create the hash in that section.
Hope this answers your question.
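As a small illustration of how the hashed address becomes a traversal path: in a hex-nibble trie, each step down consumes one hex digit of the hashed key, which then selects one of the 16 children of a branch node. The sketch below only does the digit splitting. `nibblePath` is a hypothetical helper, and since keccak-256 is not in Go's standard library, it assumes you already have the hash as a hex string (the input used in `main` is a placeholder, not a real address hash).

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// nibblePath splits a hex-encoded hash into its 4-bit nibbles, in order.
// Each nibble (0..15) is the child index to follow at one trie level.
func nibblePath(hashHex string) ([]byte, error) {
	raw, err := hex.DecodeString(strings.TrimPrefix(hashHex, "0x"))
	if err != nil {
		return nil, fmt.Errorf("invalid hex: %w", err)
	}
	path := make([]byte, 0, len(raw)*2)
	for _, b := range raw {
		path = append(path, b>>4, b&0x0f) // high nibble first, then low
	}
	return path, nil
}

func main() {
	// Placeholder input for illustration only.
	path, err := nibblePath("0xab12")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, n := range path {
		fmt.Printf("%x ", n)
	}
	fmt.Println()
}
```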
Thanks for the quick reply, @hermanjunge! That example is really helpful, and I'm super excited to see where this project goes/potentially contribute.
Quick follow up - the example works for fetching the state root of the genesis block (and for traversing to accounts from there). Do you know whether it's possible to perform similar operations on subsequent blocks?
For example, with the genesis block, I know that the cid for the header is `z43AaGF73rnZ14vjAkMQ8xoNfBShmq8qaiqFuELAx1vxSTzfGY2` and the cid for the root is `z45oqTS97WG4WsMjquajJ8PB9Ubt3ks7rGmo14P5XWjnPL7LHDM`, and I can traverse downward to learn information about accounts from there.
However, for block 5,000,000, it appears that the cid for the header is `z43AaGF1A8G45wosbcDDkCMWyNt5FfWc1UMM3EzrdS9ZTGN419B` and the cid for the root is `z45oqTS15RnXKjQMUS4gtmpJJzeuKeYLE2yw1pdi98NUxCH6YZi`, but the parity ipfs api call of `http://localhost:5001/api/v0/block/get?arg=z45oqTS15RnXKjQMUS4gtmpJJzeuKeYLE2yw1pdi98NUxCH6YZi` yields an error of `State root not found` (at least for me). Any idea what might be happening here?
It is highly probable that your parity client has pruned that state from the database, or has not yet obtained that element during synchronization. You may want to try with a later block and state trie.
I checked with my running server and it failed for block `5,000,000`. However, for a recent block (`5,614,095`), I got success. Here,
https://etherscan.io/block/5614095
gave me its hash 0x536c2a4cf78f03268dc7f2bac2e5ce541d13fad0179891c47cd6825cedcb5829
eth-ipld cid 0x536c2a4cf78f03268dc7f2bac2e5ce541d13fad0179891c47cd6825cedcb5829
# gives "ethBlock": "z43AaGExLSyBxdzcVdwGtC4X3Ydf2ftKYzQsyr1W6MioA3cZT4c",
curl --output - http://localhost:5001/api/v0/block/get?arg=z43AaGExLSyBxdzcVdwGtC4X3Ydf2ftKYzQsyr1W6MioA3cZT4c | eth-ipld block
# and I am able to get the stateRoot
# "stateRoot": "0x9c3e5ae1dcdbfcde4d804a4e54e793c8ac6328151d2dcf95438df04d98fe9703",
# which I convert to cid
# eth-ipld cid 0x9c3e5ae1dcdbfcde4d804a4e54e793c8ac6328151d2dcf95438df04d98fe9703
# then
curl --output - http://localhost:5001/api/v0/block/get?arg=z45oqTS56MVrDkBQoQFs5mcxLHty2msTu2D3cJfdZfFsirxKnaN | eth-ipld rlp
# gives
[
"0c416261069b8763ea27d6eafb97351e511c951ac8b6eeed5ecd02a59a85e080",
"dd5f3eaef4a1aa058a7da097c28be246d812d0c921ee2eb4ced6a8088e34e723",
...
"8771e32a2fabd77211b95bd12e1b67db6755601ae046da94a07dc983c7300bd4",
""
]
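For readers curious about what the hash-to-cid conversion involves, here is a sketch of how such a CID could be assembled by hand. It assumes the usual CIDv1 layout (version, codec varint, then a multihash with the keccak-256 code `0x1b` and a 32-byte length) and multibase base58btc with a `z` prefix. The helper names are hypothetical, and this is not the code of the `eth-ipld` tool. The thread reports `z43AaGExLSyBxdzcVdwGtC4X3Ydf2ftKYzQsyr1W6MioA3cZT4c` for the block hash used below, which is a useful value to compare against.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"math/big"
	"strings"
)

const base58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// appendUvarint appends x in unsigned varint encoding (7 bits per byte).
func appendUvarint(buf []byte, x uint64) []byte {
	for x >= 0x80 {
		buf = append(buf, byte(x)|0x80)
		x >>= 7
	}
	return append(buf, byte(x))
}

// cidBytes builds <version 1><codec><0x1b keccak-256><length 32><digest>.
func cidBytes(codec uint64, digestHex string) ([]byte, error) {
	digest, err := hex.DecodeString(strings.TrimPrefix(digestHex, "0x"))
	if err != nil {
		return nil, fmt.Errorf("invalid hex digest: %w", err)
	}
	if len(digest) != 32 {
		return nil, fmt.Errorf("expected a 32-byte digest, got %d bytes", len(digest))
	}
	buf := []byte{0x01}             // CID version 1
	buf = appendUvarint(buf, codec) // e.g. 0x90 for eth-block
	buf = appendUvarint(buf, 0x1b)  // multihash code for keccak-256
	buf = appendUvarint(buf, 32)    // digest length
	return append(buf, digest...), nil
}

// base58Encode encodes data with the Bitcoin base58 alphabet.
func base58Encode(data []byte) string {
	n := new(big.Int).SetBytes(data)
	radix := big.NewInt(58)
	mod := new(big.Int)
	var out []byte
	for n.Sign() > 0 {
		n.DivMod(n, radix, mod)
		out = append(out, base58Alphabet[mod.Int64()])
	}
	for _, b := range data { // each leading zero byte becomes a '1'
		if b != 0 {
			break
		}
		out = append(out, '1')
	}
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

func main() {
	raw, err := cidBytes(0x90, "0x536c2a4cf78f03268dc7f2bac2e5ce541d13fad0179891c47cd6825cedcb5829")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("z" + base58Encode(raw)) // "z" marks base58btc in multibase
}
```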
Hi Herman!
I was able to run:
curl --output - http://localhost:5001/api/v0/block/get?arg=z45oqTS56MVrDkBQoQFs5mcxLHty2msTu2D3cJfdZfFsirxKnaN | eth-ipld node
and get this result:
{ "type": "branch", "children": { "0": "0c416261069b8763ea27d6eafb97351e511c951ac8b6eeed5ecd02a59a85e080", "1": "dd5f3eaef4a1aa058a7da097c28be246d812d0c921ee2eb4ced6a8088e34e723", "2": "ecab2131db994982a26b540241fc4e7710b2aa1301383794a2ba12ff3200d5f4", "3": "cc486e899be905efdfbea3cd6b66d16e06e6c71759859d957eab69979eff875f", "4": "2834b562daf7e045516c2c85bd60a42ff4bcbc729efb240a6245e99a2c126f5f", "5": "d1f0e775c71bc99cc1db69ac4275283693bf7d701b8ea7db02c72e0a46b97405", "6": "8c5b2b89a8eb9488507c057a43a068cbda6ec937ecca32b29823e78d67dbe977", "7": "57d2db06cb923f043d1e16b170cb14a35b2efc96f272fd29c39e546d44b881ac", "8": "ebbbb6b1320a5ff55b2d8c35e52b810cd1eb4f90f0ef8f87d6fbaf0580ad950e", "9": "e659edc1f12beb959cf1405d6a4d8e669c77b109cfe4f4a56c7b748094c878ac", "a": "9c2bea51084f610a779574f0cf23f4f8e406766fe1d845078ce95e370b06aa02", "b": "01d04ace33310b608c1c751b6775c1ab91041efab45a5a91c817c29165c50bee", "c": "cf7d5c9c8b86721a18700afef55d316aed43635d955a631f490729237a27f168", "d": "fcf1c49c0585961b3e03f0fcecdb3f2cc23f3a9c796973f7b27a160841937500", "e": "71338d7803e20cf47d411e9ba0a6594492d62c656c5a53b16f825b934c6bcdce", "f": "8771e32a2fabd77211b95bd12e1b67db6755601ae046da94a07dc983c7300bd4" }, "value": "0x" }
But when I ran it with `eth-ipld block` I get this error:
Error: wrong number of fields in data
    at Object.exports.defineProperties (/usr/local/lib/node_modules/eth-ipld/node_modules/ethereumjs-util/dist/index.js:698:15)
    at new module.exports (/usr/local/lib/node_modules/eth-ipld/node_modules/ethereumjs-block/header.js:79:9)
    at getStdin.buffer.then (/usr/local/lib/node_modules/eth-ipld/commands/block.js:31:20)
    at process._tickCallback (internal/process/next_tick.js:109:7)
Could you explain what's going on? Thanks!
`eth-ipld block` processes the RLP of a block header
Yes. your line (with the typo fixed):
curl --output - http://localhost:5001/api/v0/block/get?arg=z45oqTS56MVrDkBQoQFs5mcxLHty2msTu2D3cJfdZfFsirxKnaN | eth-ipld block
returns:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   532    0   532    0     0   6897      0 --:--:-- --:--:-- --:--:--  7000
Error: wrong number of fields in data
    at Object.exports.defineProperties (/usr/local/lib/node_modules/eth-ipld/node_modules/ethereumjs-util/dist/index.js:698:15)
    at new module.exports (/usr/local/lib/node_modules/eth-ipld/node_modules/ethereumjs-block/header.js:79:9)
    at getStdin.buffer.then (/usr/local/lib/node_modules/eth-ipld/commands/block.js:31:20)
    at process._tickCallback (internal/process/next_tick.js:109:7)
Not:
[
"0c416261069b8763ea27d6eafb97351e511c951ac8b6eeed5ecd02a59a85e080",
"dd5f3eaef4a1aa058a7da097c28be246d812d0c921ee2eb4ced6a8088e34e723",
...
"8771e32a2fabd77211b95bd12e1b67db6755601ae046da94a07dc983c7300bd4",
""
]
Are we using different versions of `eth-ipld block`?
Thanks again.
You are right @AFDudley, I checked my `./bash_history`. Bad copy-pasta. I meant `eth-ipld rlp`. Apologies. Typo corrected above.
Thanks, I was able to replicate that with a more recent block.
To get this integrated more officially into go-ipfs, we will need to first clean the code here up a little (make sure it complies with golint and vetting tools) and then make it conform to the newer go-ipld-format plugin semantics (it needs a DecodeBlock method matching this: https://github.com/ipfs/go-ipld-format/blob/master/coding.go#L13). Then in an init function, it needs to register itself in the decoders map.
Then, any build of go-ipfs that imports this package will automatically be able to handle ethereum types.
There are also a few changes from https://github.com/ipfs/go-ipfs/compare/feat/zcash that we will need to get merged (primarily the changes to the `ipfs dag put` command that allow hex input).
And finally, this package doesn't implement handling for all the different ethereum object types. I only did block, transaction, and transaction trie parsing. Working on support for state trie processing will be nice.
cc @hermanjunge