CAFECA-IO / BAIFA-web-crawling

BAIFA periodically crawls data from iSunCloud

Design a crawling strategy for the iSunCloud chain, including the data synchronization mechanism #2

Closed gibbs-shih closed 11 months ago

gibbs-shih commented 11 months ago

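Periodically crawl every block on the iSunCloud chain, together with the transactions it contains, to keep the data synchronized, and store the raw data in the database.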

gibbs-shih commented 11 months ago
  1. Apply for an INFURA account

  2. Study JSON-RPC and test it with Postman

    • eth_blockNumber
    • eth_getBlockByNumber
    • eth_getTransactionByHash
    • eth_getTransactionReceipt
[Screenshot: 2023-12-13]
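The four methods listed above share the same JSON-RPC 2.0 request shape, so the Postman tests can be reproduced in a few lines of Python. This is only a minimal sketch using the standard library: `RPC_URL`, `build_rpc_payload`, `call_rpc`, and `hex_to_int` are illustrative names that do not come from this repository, and the endpoint is a placeholder that has to be replaced with the real node URL.

```python
import json
from urllib import request

# Placeholder endpoint (assumption): replace with the actual node URL.
RPC_URL = "https://example.invalid/rpc"


def build_rpc_payload(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }


def call_rpc(url, method, params=None):
    """POST one JSON-RPC request and return its `result` (needs network access)."""
    body = json.dumps(build_rpc_payload(method, params)).encode("utf-8")
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req, timeout=10) as resp:
        reply = json.load(resp)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def hex_to_int(quantity):
    """Decode a hex-encoded QUANTITY such as "0x4b7" into an int."""
    return int(quantity, 16)
```

With a reachable endpoint, `hex_to_int(call_rpc(RPC_URL, "eth_blockNumber"))` would give the latest block number as an integer, and `call_rpc(RPC_URL, "eth_getBlockByNumber", ["latest", False])` would return the latest block with transaction hashes only.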

Time spent: 2 hrs

gibbs-shih commented 11 months ago
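  1. Look for sequence diagrams related to blockchains and crawlers

  2. Sequence diagram for the iSunCloud chain [Screenshot: 2023-12-15]

Execution plan:

  1. Use the result returned by eth_blockNumber to determine the current latest block

    [Screenshot: 2023-12-13]
  2. Use eth_getBlockByNumber to fetch each block's result, starting from the latest block, along with all the transactions in that result (see step 3)

    • The crawl is split into two phases: [Screenshot: 2023-12-13]
      -- Crawl backwards from the latest block until reaching the block fetched in the previous crawl
      -- If the previous crawl did not finish, continue it in the same newest-to-oldest block order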

Data format example (eth_getBlockByNumber result):

number: QUANTITY - the block number. null when it is a pending block.
hash: DATA, 32 Bytes - hash of the block. null when it is a pending block.
parentHash: DATA, 32 Bytes - hash of the parent block.
nonce: DATA, 8 Bytes - hash of the generated proof-of-work. null when it is a pending block.
sha3Uncles: DATA, 32 Bytes - SHA3 of the uncles data in the block.
logsBloom: DATA, 256 Bytes - the bloom filter for the logs of the block. null when it is a pending block.
transactionsRoot: DATA, 32 Bytes - the root of the transaction trie of the block.
stateRoot: DATA, 32 Bytes - the root of the final state trie of the block.
receiptsRoot: DATA, 32 Bytes - the root of the receipts trie of the block.
miner: DATA, 20 Bytes - the address of the beneficiary to whom the mining rewards were given.
difficulty: QUANTITY - integer of the difficulty for this block.
totalDifficulty: QUANTITY - integer of the total difficulty of the chain until this block.
extraData: DATA - the "extra data" field of this block.
size: QUANTITY - integer of the size of this block in bytes.
gasLimit: QUANTITY - the maximum gas allowed in this block.
gasUsed: QUANTITY - the total used gas by all transactions in this block.
timestamp: QUANTITY - the unix timestamp for when the block was collated.
transactions: Array - Array of transaction objects, or 32 Bytes transaction hashes depending on the last given parameter.
uncles: Array - Array of uncle hashes. 
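
Since every QUANTITY field above arrives as a hex string, a decoded copy of the block is often easier to query than the raw result. The following is a rough sketch under that assumption: `parse_quantity`, `normalize_block`, and `QUANTITY_FIELDS` are hypothetical names, and the raw JSON can still be stored unchanged next to the decoded row.

```python
# Fields documented above as QUANTITY (hex-encoded integers).
QUANTITY_FIELDS = (
    "number", "difficulty", "totalDifficulty",
    "size", "gasLimit", "gasUsed", "timestamp",
)


def parse_quantity(value):
    """Decode a hex QUANTITY; pending blocks may carry null (None)."""
    return None if value is None else int(value, 16)


def normalize_block(block):
    """Return a copy of a block result with QUANTITY fields decoded.

    `transactions` is reduced to a list of hashes, whether the node
    returned full transaction objects or only 32-byte hashes.
    """
    row = dict(block)
    for field in QUANTITY_FIELDS:
        if field in row:
            row[field] = parse_quantity(row[field])
    row["transactions"] = [
        tx if isinstance(tx, str) else tx["hash"]
        for tx in block.get("transactions", [])
    ]
    return row
```

For example, a block whose `number` is `"0x1b4"` is decoded to `436`, while DATA fields such as `hash` and `miner` are left as hex strings.
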
  3. Use eth_getTransactionByHash and eth_getTransactionReceipt to fetch all the transactions in each block [Screenshot: 2023-12-13]

Data format example (eth_getTransactionByHash result):

blockHash: DATA, 32 Bytes - hash of the block where this transaction was in. null when it is pending.
blockNumber: QUANTITY - block number where this transaction was in. null when it is pending.
from: DATA, 20 Bytes - address of the sender.
gas: QUANTITY - gas provided by the sender.
gasPrice: QUANTITY - gas price provided by the sender in Wei.
hash: DATA, 32 Bytes - hash of the transaction.
input: DATA - the data sent along with the transaction.
nonce: QUANTITY - the number of transactions made by the sender prior to this one.
to: DATA, 20 Bytes - address of the receiver. null when it is a contract creation transaction.
transactionIndex: QUANTITY - integer of the transaction's index position in the block. null when it is pending.
value: QUANTITY - value transferred in Wei.
v: QUANTITY - ECDSA recovery id
r: QUANTITY - ECDSA signature r
s: QUANTITY - ECDSA signature s

Data format example (eth_getTransactionReceipt result):

transactionHash: DATA, 32 Bytes - hash of the transaction.
transactionIndex: QUANTITY - integer of the transaction's index position in the block.
blockHash: DATA, 32 Bytes - hash of the block where this transaction was in.
blockNumber: QUANTITY - block number where this transaction was in.
from: DATA, 20 Bytes - address of the sender.
to: DATA, 20 Bytes - address of the receiver. null when it is a contract creation transaction.
cumulativeGasUsed: QUANTITY - The total amount of gas used when this transaction was executed in the block.
effectiveGasPrice: QUANTITY - The sum of the base fee and tip paid per unit of gas.
gasUsed: QUANTITY - The amount of gas used by this specific transaction alone.
contractAddress: DATA, 20 Bytes - The contract address created, if the transaction was a contract creation, otherwise null.
logs: Array - Array of log objects, which this transaction generated.
logsBloom: DATA, 256 Bytes - Bloom filter for light clients to quickly retrieve related logs.
type: QUANTITY - integer of the transaction type, 0x0 for legacy transactions, 0x1 for access list types, 0x2 for dynamic fees.

It also returns either:
root: DATA, 32 Bytes - post-transaction state root (pre-Byzantium)
status: QUANTITY - either 1 (success) or 0 (failure)
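
Because each transaction is described by two responses, the crawler has to pair the eth_getTransactionByHash result with its receipt before storing it. One possible way to do that is sketched below; the record layout and the names `receipt_outcome` and `merge_tx_and_receipt` are assumptions made for illustration, not the schema used by this project.

```python
def receipt_outcome(receipt):
    """Classify a receipt using `status`, falling back to `root`.

    Receipts carry either `status` (1 = success, 0 = failure) or,
    before Byzantium, the post-transaction state `root`.
    """
    status = receipt.get("status")
    if status is not None:
        return "success" if int(status, 16) == 1 else "failure"
    if receipt.get("root") is not None:
        return "pre-byzantium"
    return "unknown"


def merge_tx_and_receipt(tx, receipt):
    """Combine a transaction and its receipt into one raw record."""
    if tx["hash"] != receipt["transactionHash"]:
        raise ValueError("receipt does not belong to this transaction")
    return {
        "hash": tx["hash"],
        "transaction": tx,      # raw eth_getTransactionByHash result
        "receipt": receipt,     # raw eth_getTransactionReceipt result
        "outcome": receipt_outcome(receipt),
        # `to` is null when the transaction creates a contract.
        "is_contract_creation": tx.get("to") is None,
    }
```

Keeping both raw objects inside the record matches the goal of storing raw data, while `outcome` and `is_contract_creation` are derived conveniences that can be dropped if not needed.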

Time spent: 3 hrs

gibbs-shih commented 11 months ago

Supplementary sequence diagrams and logic

  1. Two-phase crawling logic [Screenshots: 2023-12-14]

step1

step2
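
The diagrams are not reproduced here, so the following is only one reading of the two-phase logic described in the execution plan: first crawl from the latest block down to where the previous crawl started, then resume any range the previous crawl left unfinished, always newest to oldest. The function names and the `(start, stop)` range representation are hypothetical, and the sketch assumes a first run goes all the way down to block 0.

```python
def plan_crawl(latest, last_top=None, unfinished=None):
    """Return descending (start, stop) block ranges, both inclusive.

    latest:     block number reported by eth_blockNumber
    last_top:   highest block the previous crawl started from,
                or None on the very first run
    unfinished: (start, stop) range the previous crawl did not
                complete, or None
    """
    ranges = []
    floor = 0 if last_top is None else last_top + 1
    # Phase 1: new blocks, from the latest one back to the previous start.
    if latest >= floor:
        ranges.append((latest, floor))
    # Phase 2: whatever the previous crawl left unfinished.
    if unfinished is not None:
        ranges.append(unfinished)
    return ranges


def blocks_in(ranges):
    """Yield block numbers newest to oldest across all ranges."""
    for start, stop in ranges:
        yield from range(start, stop - 1, -1)
```

For instance, if the previous crawl started at block 100 and stopped with blocks 95 to 90 still missing, `plan_crawl(110, 100, (95, 90))` returns `[(110, 101), (95, 90)]`.
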

  2. Ensure that every transaction in the block has been visited [Screenshot: 2023-12-15]

step1

step2

step3
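
A simple way to check that every transaction in a block has been visited is to compare the hashes listed in the block's `transactions` array with the hashes already stored. This is a sketch of that idea only, not the three steps shown in the diagrams; `missing_transactions`, `block_is_complete`, and the `stored_hashes` set are illustrative assumptions.

```python
def missing_transactions(block, stored_hashes):
    """List the block's transaction hashes that are not stored yet.

    `block["transactions"]` may hold hashes or full transaction
    objects, depending on how eth_getBlockByNumber was called.
    """
    expected = [
        tx if isinstance(tx, str) else tx["hash"]
        for tx in block["transactions"]
    ]
    return [h for h in expected if h not in stored_hashes]


def block_is_complete(block, stored_hashes):
    """True when every transaction in the block has been stored."""
    return not missing_transactions(block, stored_hashes)
```

A block would only be marked as synchronized once `block_is_complete` returns `True`; otherwise the hashes from `missing_transactions` can be fetched again.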

Time spent: 4 hrs