I think saving the container image and pulling it makes sense, instead of regenerating it every time, especially if regenerating takes significantly longer than pulling.
@zzma It actually doesn't take that long (less than 30s). The script basically pulls an existing mysql 5.7 image and runs my python script to create the tables. I think we won't have to deal with this much once we have a database running, so I will leave it as is for now.
If it helps, we should also create an index on the `node_id` fields if we do a lot of lookups by `node_id`.
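For illustration, a secondary index along these lines would turn `node_id` lookups into index scans; the table and index names here are placeholders, since the actual schema lives in the setup script:

```sql
-- Hypothetical index on one of the tables; identifiers are illustrative.
CREATE INDEX idx_devp2p_hello_node_id ON devp2p_hello (node_id);

-- A lookup like this can then use the index instead of a full table scan:
SELECT * FROM devp2p_hello WHERE node_id = '<hex node id>';
```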
Some thoughts on the tables:
- Keep the `neighbors` and `node_id_hash` tables as is. `neighbors` uses all fields as a composite primary key and counts the number of occurrences for each entry on an exact match. `node_id_hash` uses `node_id` as the primary key.
- Combine the `devp2p_hello` and `eth_status` tables into one. If any of the fields (except total difficulty and best hash) changes, we consider it a new record.
- Add an `id` field as the primary key.
- Rename the `first_ts` and `last_ts` fields to `first_received_at` and `last_received_at`.
- Create an index on the `node_id` field. For `node_id`, I'm considering prefix indexing.
According to https://dev.mysql.com/doc/refman/5.5/en/column-indexes.html, a fulltext index is only supported by MyISAM. We could index all 128 characters of `node_id`, but I'm not sure that would be necessary. Should I consider only the first 16 characters or something?
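A minimal sketch of the prefix-index option, assuming `node_id` is stored as a 128-character hex string (table and index names are again placeholders):

```sql
-- Index only the first 16 characters of the hex-encoded id (64 bits).
-- Prefix collisions are possible but should be rare, and MySQL re-checks
-- the full node_id value on the rows the prefix matches, so equality
-- lookups stay correct.
CREATE INDEX idx_node_id_prefix ON devp2p_hello (node_id(16));
```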
I will create a separate PR for the data insert and search, but here is my current plan:

- Insert: use `nodeId` as the key. For each `nodeId`, get the entry with MAX(id) and update the `last_received_at` and counter fields.
- This should cover a node that replies `too many peers`, then receives new information later, then replies `too many peers` again on the next scan.
- Search: for each `node_id`, return the entry with MAX(id) (see the SQL sketch after this list).
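A rough sketch of what that could look like in SQL; the table and column names (`devp2p_hello`, `counter`) are assumptions based on the discussion above, not the actual schema:

```sql
-- Search: the latest record per node, i.e. the row with MAX(id)
-- for each node_id.
SELECT t.*
FROM devp2p_hello AS t
JOIN (
    SELECT node_id, MAX(id) AS max_id
    FROM devp2p_hello
    GROUP BY node_id
) AS latest
  ON t.node_id = latest.node_id AND t.id = latest.max_id;

-- Insert path: bump the counter and timestamp on the newest row only.
-- (`?` is a bind parameter supplied by the client.)
UPDATE devp2p_hello
SET counter = counter + 1,
    last_received_at = NOW()
WHERE node_id = ?
ORDER BY id DESC
LIMIT 1;
```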
Rename `node_id_hash` to `node_properties` and move the counter fields to it.
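In MySQL 5.7 syntax, that rename and column move could look something like this; the counter column names are made up for the example:

```sql
RENAME TABLE node_id_hash TO node_properties;

-- Move the per-node counters onto the renamed table; the column names
-- here are illustrative, not the ones in the actual schema.
ALTER TABLE node_properties
    ADD COLUMN hello_counter  INT UNSIGNED NOT NULL DEFAULT 0,
    ADD COLUMN status_counter INT UNSIGNED NOT NULL DEFAULT 0;
```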
@SimonSK I think renaming `node_id_hash` to `node_meta_info` would be more appropriate.
This creates a mysql docker container that hosts our node database. The script assumes docker (and docker-compose), python2, and pip2 are installed. I prefer isolating the database in a container, as I don't feel comfortable messing with the mysql server already hosted on the machine. The actual database directory is stored directly on the host (or on the NFS, if that's what we set it to) and attached to the container.
How to use:

1. Edit `config.py` and `ethnodes/.env`.
2. Run `sudo ./setup.sh`.
On second thought, maybe I should just save the container image, push it to a private repo, and pull it from there.
The database contains 4 different tables: `neighbors`, `node_id_hash`, `devp2p_hello`, and `eth_status` (a rough schema sketch for two of them follows the list).

- `neighbors`: node addresses from the UDP `NEIGHBORS` replies go here.
- `node_id_hash`: mapping between `nodeid` and its sha3 (keccak256). I haven't decided if I want to keep looking into the XOR distance, but I would like to keep this information for now.
- `devp2p_hello`: DEVp2p Hello contents go here. Only the remote ephemeral port, counters, and timestamp fields get overwritten. If any of the other fields changes, a new row is added.
- `eth_status`: Ethereum Status contents go here. Only the dao-fork check, counter, and timestamp fields get overwritten. If any of the other fields changes, a new row is added.
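For reference, a rough sketch of what the first two tables could look like; the column names and types are guesses, not the actual DDL from the setup script:

```sql
CREATE TABLE neighbors (
    node_id  CHAR(128) NOT NULL,              -- hex-encoded node id
    ip       VARCHAR(39) NOT NULL,            -- IPv4 or IPv6 address
    udp_port SMALLINT UNSIGNED NOT NULL,
    tcp_port SMALLINT UNSIGNED NOT NULL,
    counter  INT UNSIGNED NOT NULL DEFAULT 1, -- occurrences of this exact tuple
    -- every address field participates in the composite primary key,
    -- so an exact match bumps counter instead of adding a row
    PRIMARY KEY (node_id, ip, udp_port, tcp_port)
);

CREATE TABLE node_id_hash (
    node_id      CHAR(128) NOT NULL, -- hex-encoded node id (primary key)
    node_id_sha3 CHAR(64)  NOT NULL, -- keccak256 of the id, kept for XOR distance
    PRIMARY KEY (node_id)
);
```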