dennis-tra / nebula

🌌 A network agnostic DHT crawler, monitor, and measurement tool that exposes timely information about DHT networks.
Apache License 2.0

Support crawler substrate on polkadot-v0.9.41 #40

Closed duythien closed 9 months ago

duythien commented 1 year ago

Hello

I am currently working on a Substrate node, which is available here: https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Fdev.gsviec.com#/explorer/node


I tried adding the following code:

BootstrapPeersCustom = []string{
 "/ip4/18.223.218.133/tcp/30333/p2p/12D3KooWSjZH2ETsHsPsfqFuCjbtsEDFrT9L8VeG647iTyMJ3aiC",
}

This is the result after running the crawl:

{"PeerID":"12D3KooWSjZH2ETsHsPsfqFuCjbtsEDFrT9L8VeG647iTyMJ3aiC","Maddrs":["/ip4/18.223.218.133/tcp/30333"],"Protocols":["/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/transactions/1","/ipfs/ping/1.0.0","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/kad","/sup/kad","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/sync/2","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/sync/warp","/sup/sync/warp","/sup/transactions/1","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/grandpa/1","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/state/2","/sup/block-announces/1","/paritytech/grandpa/1","/ipfs/id/1.0.0","/ipfs/id/push/1.0.0","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/block-announces/1","/sup/sync/2","/sup/state/2","/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/light/2","/sup/light/2"],"AgentVersion":"Substrate Node/v4.0.0-dev-a2bd00b67e2 (Node-Thien)","ConnectDuration":"1.376036919s","CrawlDuration":"1.903481279s","VisitStartedAt":"2023-09-05T09:49:30.632198132Z","VisitEndedAt":"2023-09-05T09:49:32.535679338Z","ConnectErrorStr":"","CrawlErrorStr":"unknown","IsExposed":null}

But we have more peers than that. Is this result correct?

Note the log output:

time="2023-09-05T09:49:30Z" level=info msg="Starting Nebula crawler..."
time="2023-09-05T09:49:30Z" level=info msg="Initializing JSON client" out=a
time="2023-09-05T09:49:30Z" level=info msg="Starting to crawl the network"
time="2023-09-05T09:49:30Z" level=info msg="Initializing crawl..."
time="2023-09-05T09:49:32Z" level=info msg="Handled crawl result from worker crawler-14" crawled=1 crawlerID=crawler-14 error="getting closest peer with CPL 8: protocols not supported: [/sub/kad]" inCrawlQueue=0 isDialable=false remoteID=12D3KooWSjZH2ETs
time="2023-09-05T09:49:32Z" level=info msg="Persisted result from worker crawler-14" duration="563.683µs" persisted=1 persisterID=persister-07 remoteID=12D3KooWSjZH2ETs success=true
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-01
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-03
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-05
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-07
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-09
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-11
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-13
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-15
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-17
time="2023-09-05T09:49:32Z" level=info msg="Waiting for persister to stop" persisterID=persister-19
time="2023-09-05T09:49:32Z" level=info msg="Persisting crawl result..."
time="2023-09-05T09:49:32Z" level=info msg="Persisting crawl properties..."
time="2023-09-05T09:49:32Z" level=info msg="Logging crawl results..."
time="2023-09-05T09:49:32Z" level=info
time="2023-09-05T09:49:32Z" level=info
time="2023-09-05T09:49:32Z" level=info msg=Agent count=1 value="Substrate Node/v4.0.0-dev-a2bd00b67e2 (Node-Thien)"
time="2023-09-05T09:49:32Z" level=info
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/state/2
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/paritytech/grandpa/1
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/ipfs/id/push/1.0.0
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/kad
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/kad
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/sync/2
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/sync/warp
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/sync/warp
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/transactions/1
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/block-announces/1
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/sync/2
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/block-announces/1
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/state/2
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/transactions/1
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/ipfs/ping/1.0.0
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/grandpa/1
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/ipfs/id/1.0.0
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/light/2
time="2023-09-05T09:49:32Z" level=info msg=Protocol count=1 value=/sup/light/2
time="2023-09-05T09:49:32Z" level=info
time="2023-09-05T09:49:32Z" level=info msg="Finished crawl" crawlDuration=1.910313629s crawledPeers=1 dialablePeers=1 undialablePeers=0
dennis-tra commented 1 year ago

Since it's a different network, you also need to set the correct DHT protocol ID. From your log output, I assume the correct protocol ID is /sup/kad (kad is often short for Kademlia). By running:

nebula crawl --bootstrap-peers /ip4/18.223.218.133/tcp/30333/p2p/12D3KooWSjZH2ETsHsPsfqFuCjbtsEDFrT9L8VeG647iTyMJ3aiC --protocols /sup/kad --json-out .

I get the following output:

INFO[0000] Starting Nebula crawler...
INFO[0000] Initializing JSON client                      out=.
INFO[0000] Starting to crawl the network
INFO[0000] Initializing crawl...
INFO[0000] Persisted result from worker crawler-02       duration="588.084µs" persisted=1 persisterID=persister-03 remoteID=12D3KooWSjZH2ETs success=true
INFO[0000] Handled crawl result from worker crawler-02   crawled=1 crawlerID=crawler-02 inCrawlQueue=2 isDialable=true remoteID=12D3KooWSjZH2ETs
INFO[0001] Handled crawl result from worker crawler-07   crawled=2 crawlerID=crawler-07 inCrawlQueue=1 isDialable=true remoteID=12D3KooWJDkRvwdD
INFO[0001] Persisted result from worker crawler-07       duration="165.25µs" persisted=1 persisterID=persister-05 remoteID=12D3KooWJDkRvwdD success=true
INFO[0002] Persisted result from worker crawler-03       duration="185.833µs" persisted=1 persisterID=persister-07 remoteID=12D3KooWMYjPxNHc success=true
INFO[0002] Handled crawl result from worker crawler-03   crawled=3 crawlerID=crawler-03 inCrawlQueue=15 isDialable=true remoteID=12D3KooWMYjPxNHc
INFO[0002] Handled crawl result from worker crawler-01   crawled=4 crawlerID=crawler-01 dialErr=no_public_ip inCrawlQueue=14 isDialable=false remoteID=12D3KooWCFhiL2KT
INFO[0002] Handled crawl result from worker crawler-15   crawled=5 crawlerID=crawler-15 dialErr=no_route_to_host inCrawlQueue=13 isDialable=false remoteID=12D3KooWNXxUpvXR
INFO[0002] Handled crawl result from worker crawler-10   crawled=6 crawlerID=crawler-10 dialErr=no_route_to_host inCrawlQueue=12 isDialable=false remoteID=12D3KooWHifNgw5A
INFO[0002] Persisted result from worker crawler-01       duration="115.75µs" persisted=1 persisterID=persister-09 remoteID=12D3KooWCFhiL2KT success=true
INFO[0002] Persisted result from worker crawler-10       duration="136.667µs" persisted=1 persisterID=persister-13 remoteID=12D3KooWHifNgw5A success=true
INFO[0002] Persisted result from worker crawler-15       duration="141.708µs" persisted=1 persisterID=persister-11 remoteID=12D3KooWNXxUpvXR success=true
INFO[0002] Handled crawl result from worker crawler-33   crawled=7 crawlerID=crawler-33 dialErr=no_route_to_host inCrawlQueue=11 isDialable=false remoteID=12D3KooWRR9xXohx
INFO[0002] Persisted result from worker crawler-33       duration="38.458µs" persisted=1 persisterID=persister-15 remoteID=12D3KooWRR9xXohx success=true
INFO[0002] Handled crawl result from worker crawler-13   crawled=8 crawlerID=crawler-13 dialErr=no_route_to_host inCrawlQueue=10 isDialable=false remoteID=12D3KooWT1XhyySJ
INFO[0002] Persisted result from worker crawler-13       duration="29.792µs" persisted=1 persisterID=persister-17 remoteID=12D3KooWT1XhyySJ success=true
INFO[0002] Handled crawl result from worker crawler-28   crawled=9 crawlerID=crawler-28 dialErr=connection_refused inCrawlQueue=9 isDialable=false remoteID=12D3KooWP2Fw6jjM
INFO[0002] Persisted result from worker crawler-28       duration="113.334µs" persisted=1 persisterID=persister-19 remoteID=12D3KooWP2Fw6jjM success=true
INFO[0002] Handled crawl result from worker crawler-12   crawled=10 crawlerID=crawler-12 dialErr=connection_refused inCrawlQueue=8 isDialable=false remoteID=12D3KooWESEcXEaR
INFO[0002] Persisted result from worker crawler-12       duration="34.291µs" persisted=1 persisterID=persister-01 remoteID=12D3KooWESEcXEaR success=true
INFO[0002] Handled crawl result from worker crawler-30   crawled=11 crawlerID=crawler-30 dialErr=connection_refused inCrawlQueue=7 isDialable=false remoteID=12D3KooWKVFgCLPx
INFO[0002] Handled crawl result from worker crawler-11   crawled=12 crawlerID=crawler-11 dialErr=connection_refused inCrawlQueue=6 isDialable=false remoteID=12D3KooWN5q4Dop3
INFO[0002] Persisted result from worker crawler-11       duration="39.042µs" persisted=2 persisterID=persister-05 remoteID=12D3KooWN5q4Dop3 success=true
INFO[0002] Persisted result from worker crawler-30       duration="62.5µs" persisted=2 persisterID=persister-03 remoteID=12D3KooWKVFgCLPx success=true
INFO[0002] Handled crawl result from worker crawler-08   crawled=13 crawlerID=crawler-08 dialErr=connection_refused inCrawlQueue=5 isDialable=false remoteID=12D3KooWKjcA9F57
INFO[0002] Persisted result from worker crawler-04       duration="23.916µs" persisted=2 persisterID=persister-09 remoteID=12D3KooWDrjnr2Lo success=true
INFO[0002] Handled crawl result from worker crawler-04   crawled=14 crawlerID=crawler-04 dialErr=connection_refused inCrawlQueue=4 isDialable=false remoteID=12D3KooWDrjnr2Lo
INFO[0002] Persisted result from worker crawler-08       duration="80.833µs" persisted=2 persisterID=persister-07 remoteID=12D3KooWKjcA9F57 success=true
INFO[0007] Handled crawl result from worker crawler-05   crawled=15 crawlerID=crawler-05 dialErr=io_timeout inCrawlQueue=3 isDialable=false remoteID=12D3KooWNxgS64xm
INFO[0007] Persisted result from worker crawler-05       duration="139.417µs" persisted=2 persisterID=persister-13 remoteID=12D3KooWNxgS64xm success=true
INFO[0007] Handled crawl result from worker crawler-26   crawled=16 crawlerID=crawler-26 dialErr=io_timeout inCrawlQueue=2 isDialable=false remoteID=12D3KooWK9FB4RRB
INFO[0007] Persisted result from worker crawler-26       duration="81.417µs" persisted=2 persisterID=persister-11 remoteID=12D3KooWK9FB4RRB success=true
INFO[0007] Handled crawl result from worker crawler-09   crawled=17 crawlerID=crawler-09 dialErr=io_timeout inCrawlQueue=1 isDialable=false remoteID=12D3KooWEaPTniz8
INFO[0007] Persisted result from worker crawler-09       duration="50.25µs" persisted=2 persisterID=persister-15 remoteID=12D3KooWEaPTniz8 success=true
INFO[0017] Handled crawl result from worker crawler-14   crawled=18 crawlerID=crawler-14 dialErr=io_timeout inCrawlQueue=0 isDialable=false remoteID=12D3KooWQdyP9b93
INFO[0017] Persisted result from worker crawler-14       duration="144.209µs" persisted=2 persisterID=persister-17 remoteID=12D3KooWQdyP9b93 success=true
INFO[0017] Waiting for persister to stop                 persisterID=persister-01
INFO[0017] Waiting for persister to stop                 persisterID=persister-03
INFO[0017] Waiting for persister to stop                 persisterID=persister-05
INFO[0017] Waiting for persister to stop                 persisterID=persister-07
INFO[0017] Waiting for persister to stop                 persisterID=persister-09
INFO[0017] Waiting for persister to stop                 persisterID=persister-11
INFO[0017] Waiting for persister to stop                 persisterID=persister-13
INFO[0017] Waiting for persister to stop                 persisterID=persister-15
INFO[0017] Waiting for persister to stop                 persisterID=persister-17
INFO[0017] Waiting for persister to stop                 persisterID=persister-19
INFO[0017] Persisting crawl result...
INFO[0017] Persisting crawl properties...
INFO[0017] Logging crawl results...
INFO[0017]
INFO[0017] Dial Error                                    count=4 value=no_route_to_host
INFO[0017] Dial Error                                    count=6 value=connection_refused
INFO[0017] Dial Error                                    count=4 value=io_timeout
INFO[0017] Dial Error                                    count=1 value=no_public_ip
INFO[0017]
INFO[0017] Agent                                         count=1 value="Substrate Node/v4.0.0-dev-a2bd00b67e2 (Node-Thien)"
INFO[0017] Agent                                         count=1 value="Substrate Node/v4.0.0-dev-4a055dcec9b (secondValidator)"
INFO[0017] Agent                                         count=1 value="Substrate Node/v4.0.0-dev-unknown (VetoNode)"
INFO[0017] Agent                                         count=15 value=
INFO[0017]
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/kad
INFO[0017] Protocol                                      count=3 value=/sup/light/2
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/state/2
INFO[0017] Protocol                                      count=3 value=/sup/sync/2
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/block-announces/1
INFO[0017] Protocol                                      count=3 value=/sup/sync/warp
INFO[0017] Protocol                                      count=3 value=/sup/state/2
INFO[0017] Protocol                                      count=3 value=/sup/block-announces/1
INFO[0017] Protocol                                      count=3 value=/ipfs/ping/1.0.0
INFO[0017] Protocol                                      count=3 value=/paritytech/grandpa/1
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/grandpa/1
INFO[0017] Protocol                                      count=3 value=/sup/kad
INFO[0017] Protocol                                      count=3 value=/ipfs/id/push/1.0.0
INFO[0017] Protocol                                      count=3 value=/ipfs/id/1.0.0
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/sync/2
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/transactions/1
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/light/2
INFO[0017] Protocol                                      count=3 value=/sup/transactions/1
INFO[0017] Protocol                                      count=3 value=/1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/sync/warp
INFO[0017]
INFO[0017] Finished crawl                                crawlDuration=17.06933025s crawledPeers=18 dialablePeers=3 undialablePeers=15

This returns more nodes, but still not very many. How many did you expect?

Edit: I just saw that there's also /1ddce31fac84071085205b3a01888178bba5b08cb30a668a732c9d13939081c5/kad. This gives the same result as the log output above, so I guess it's not structurally different.

dennis-tra commented 1 year ago

Out of curiosity, what's your project about and what information do you want to get out of Nebula? :)

duythien commented 1 year ago

@dennis-tra it seems to look good for now. Let me test something new.