Epic: Implement `GENESISID`

lrettig commented 1 year ago

Based on https://github.com/spacemeshos/SMIPS/issues/75, replaces spacemeshos/pm#117 for genesis

[ ] Initial setup/prerequisites, config, devops
- [x] Add separate genesis section to config and ensure that it is immutable
- [x] Add one new field to new genesis config: genesis-extradata (string with a max length of 255 chars.). Default value should be mainnet.
- [x] Remove p2p.network-id from config (it's superseded by the above and no longer necessary).
- [ ] Update devops tooling to update this config value, same as how genesis-time is updated, when launching a new network (https://github.com/spacemeshos/go-spacecraft/issues/38)
- [ ] Update versioning scheme. main.go should receive and set a 20 byte hash (byte array) value called cmd.GenesisID. (See LDFLAGS in Makefile.)
[x] When the node starts, it should read genesis-extradata from config, zero-pad the string and turn it into a fixed-length byte array. It should then calculate the ID as follows:GENESISID := hash(genesis-time || genesis-extradata)
[x] Set goldenATXIDto GENESISID (node.go :: initServices) and remove golden-atx from the config file (note: in future we may want to do this the other way around, and include the golden ATX ID in the GENESISID)
[ ] Signer/VRF
- [ ] Smesher ID: this is based on nodeID and must also include GENESISID. For this we need POPS-VRF (#168 #172), but that task is distinct and for the purposes of this task we may assume it's already in place.
- [ ] ED signer should also be bound to GENESISID. Modify signing/signer.go to concatenate GENESISID onto m in Sign and Verify. (note: VRF signer does not need to be updated since it should already factor in the smesher ID nonce as part of POPS-VRF)
[ ] Data structures/p2p messages: make sure that each of these is signed using ED signer and/or uses VRF signer that includes nonce (as described above):
- [ ] ATX
- [ ] Block (not needed? since these are not signed. see https://github.com/spacemeshos/SMIPS/issues/75#issuecomment-1173123977)
- [ ] Proposals/Ballots
- [ ] Hare messages
- [ ] Weak coin messages
- [ ] Beacon messages
- [ ] Hare certificate
[ ] Transactions
- [ ] The default verify() method should use the modified ED signer (which includes GENESISID, as described above) (note: in future we may want to expose an unmodified signer to template code, but this isn't necessary for genesis)
- [ ] Smapp (and other clients) should include GENESISID when signing a new tx, so the verify() sig check passes (https://github.com/spacemeshos/smapp/issues/949)
[ ] P2P: Remove cfg.NetworkID, replace with GENESISID in handshake. (Note: a malicious peer could still send invalid data after a handshake, so all of the other checks here are still required.)
[ ] Tests: test that each of [block, ballot, transaction, proposal, ATX, hare message, hare certificate, transaction] referencing/tied to an invalid GENESISID isn't gossipped, and that the peer that sent it is banned (gossip, sync)

Notes/Things to look for:

Any p2p message handlers which don't check GENESISID - typically implicitly, via a signature check. Note that p2p IDs are not tied to GENESISID, so there is no implicit check at the p2p layer. Checks must happen in higher level handlers or data structures and their signatures. To check: weak coin, beacon.
We haven't implemented peer banning yet, have we? If not, need to open a separate issue that makes sure that banning happens on bad GENESISID.

dshulyak commented 1 year ago

Is there any other chain that is doing the same for protocol objects (not transactions)?
Why do we need to store it in the database? How verification will look like if there will be more than one record in db? Compute hash for every record, e.g if there are 10 records we will be computing hashes until the first one is valid?
It is super important for transactions to include chain specific data for replay protection. how do we address that?

lrettig commented 1 year ago

Is there any other chain that is doing the same for protocol objects (not transactions)?

Good question. I think we could do more homework on this. I suspect not. I suspect other chains make do with: 1. versioning in P2P handshake, 2. blacklisting/banning peers that send bad data, and 3. protocol objects that fail validation (without an explicit GENESISID) since they'd point to another object that's unknown.

In the case of a blockchain protocol like Eth, each block must point to the previous block, and each block contains txs that contain chainids, etc., so there's no need to tag blocks explicitly with a chainid or genesisid. This is different for Spacemesh. We do get some of this "default protection" for objects like blocks, proposals, ATXs, etc. (see notes above) because they link to other objects that should be valid and in-context, but I'm not sure we have any other way to assert that, say, a smesher or nodeID wasn't reused on another chain (or fork).

Geth does some ad hoc versioning of, e.g., blocks: here's an example where a legacy sync will reject post-merge blocks.

And here's an example of a post-1559 chain rejecting pre-1559 blocks. These both resemble PROTOCOLID more than they resemble GENESISID.

The logic Eth uses at the P2P layer is laid out in EIP-2124.

Why do we need to store it in the database? How verification will look like if there will be more than one record in db? Compute hash for every record, e.g if there are 10 records we will be computing hashes until the first one is valid?

This is not an issue for GENESISID, as this never changes. It might be an issue for PROTOCOLID and VMID, but let's save the discussion for later when we implement these, or move it to https://github.com/spacemeshos/SMIPS/issues/75. Note: ideally we'd use comparison logic like that laid out here. Doing this once on P2P handshake is feasible, but I agree that doing it every time an object is received is probably too costly.

It is super important for transactions to include chain specific data for replay protection. how do we address that?

It'll be handled inside the principal account templates. The default (suggested, enshrined) template will handle it as part of the signature check as specified above; clients like smapp should include GENESISID when signing, perhaps as part of the SVM codec.

pigmej commented 1 year ago

@lrettig @dshulyak I wonder if we can treat it as finished or we want to clarify some items here still?

lrettig commented 1 year ago

My open questions are here: https://community.spacemesh.io/t/where-to-bind-genesisid/284/2 but from my perspective this is basically done.

dshulyak commented 1 year ago

just to clarify, i think this part should not be implemented for genesis, maybe it will make more sense later. which i still not certain because protocolid/vmid is something that can be stored in code.

Add a new database table to store GENESISID. Note that, for now, this table will only store a single row/single value that never changes, but in future the same table will be used to store multiple valid combinations of GENESISID, VMID and PROTOCOLID. The schema is simply:

lrettig commented 1 year ago

this part should not be implemented for genesis

Dumb question: if it's not stored in the database, where is it stored?

dshulyak commented 1 year ago

Dumb question: if it's not stored in the database, where is it stored?

the only reason to persist it is to protect administrator from changing this accidentally again, is it the case? are we going to have other immutable configuration parameters, such as genesis time, golden atx, hrp that can't be overwritten? if thats the case we can just persist whole immutable config part in a file and check that none of the those parameters were overwritten

i initially thought that we can set all genesis parameters at compile time (one example of this approach is https://github.com/filecoin-project/lotus/tree/master/build) , but i guess genesis time may not be convenient to set this way

lrettig commented 1 year ago

I understand now, thanks. Yes, that's the plan, see https://github.com/spacemeshos/go-spacemesh/issues/3348. Ideally we would let a developer run the same go-spacemesh code on multiple networks without recompiling, but I'm not totally opposed to this idea.

dshulyak commented 1 year ago

I think this can't be implemented the way it is written. I initially missed that this spec suggests to append genesisid to the messages that are signed, not only hashes. It basically will require full buf allocation and copy every time when message is signed and verified. If message is 3kb - it will need 3kb + 32 for buf with genesis. There should be another way to do binding, that doesn't carry such huge cost.

One option could be is to use ed25519ctx extention. Here https://github.com/oasisprotocol/curve25519-voi/blob/master/primitives/ed25519/ed25519.go#L130-L136 . i don't know if spacemesh lib for ed25519 supports that though.

Also this is not something that can be completely optimized in the future. Whatever is done at genesis will have to be remain supported for old data in the chain.

dshulyak commented 1 year ago

this part also surprises me

Smesher ID: this is based on nodeID. NodeID must include GENESISID. In cmd/node/node.go :: Start(), set nodeID to the edPubkey.Bytes() hashed against GENESISID.

it means that every time when public key is recovered from signature we will have to additionaly compute node id using genesis id. what purpose does it serve, if message is signed in a separate domain?

given that we are using separate domain for signatures.. why do we need to add genesisid for hashes?

dshulyak commented 1 year ago

what objects are actually prehashed before they are signed?

i think this only true for transactions. ballots/activations are not prehashed

another option could be to prehash every message that is signed (without allocating another huge memory buffer).

fun Sign(pk PrivateKey, msg []byte) []byte {
    hasher := hash.New()
    hasher.Write(GENESISID)
    hasher.Write(msg)
    return ed25519.Sign(pk, hasher.Sum(nil))
}

pigmej commented 1 year ago

@lrettig @dshulyak @WilfredTA given the pending discussions I removed the split, let's first please finalize the design and then split to smaller work items.

dshulyak commented 1 year ago

just to elaborate on my comments above.

the proposed design doesn't take into account efficiency completely. some of what is proposed won't be possible to optimize away after genesis (such decisions like adding genesis id to payload before signing it).

also lets consider how this will look in practice for verifying ballot:

compute id after receiving bytes from network (genesis id added to the hash)
recover public key from signature (add genesis id to the payload)
compute node id for internal logic (genesis id added again)

doesn't it seem redundant? i can't believe that all of it is really necessary

dshulyak commented 1 year ago

i propose the following changes:

introduce SafeEncode function it will return two byte slices (one for signing, and another for gossip). First buffer will be passed to signers, second for gossip. I can provide save implementation ideas if necessary.
Introduce SafeHash function. It will be used only in vm. Other hashes already committed to genesis using SafeEncode.
Drop commitment for NodeID. It seems it is necessary only for Post. Can we use another approach, like include genesisid as a prefix for hash function used for post generation?

lrettig commented 1 year ago

@dshulyak see Where to bind GENESISID. For now, we only need to bind the GENESISID in two places: PoST data and transactions. For PoST data, see https://github.com/spacemeshos/go-spacemesh/issues/3327. For transactions, we should include GENESISID in the hash that's signed. I don't think GENESISID needs to be hashed in anywhere else.

(As Tal pointed out in future when we introduce multiple protocol versions, we will need to include these in ATXs, ballots, etc.)

Note that this hash function can be made very efficient if necessary by compiling GENESISID into the hash preimage.

Regarding your ballot example:

compute id after receiving bytes from network (genesis id added to the hash)

recover public key from signature (add genesis id to the payload)

compute node id for internal logic (genesis id added again)

Unless I'm misunderstanding, none of these steps are necessary. The GENESISID check will be implicit in checking the ballot eligibility.

pigmej commented 1 year ago

Drop commitment for NodeID. It seems it is necessary only for Post. Can we use another approach, like include genesisid as a prefix for hash function used for post generation?

@dshulyak speaking about that we also have https://github.com/spacemeshos/go-spacemesh/issues/3327 that needs to be tackled :)

dshulyak commented 1 year ago

i read that research topic, it has 3 conclusions:

post needs to be committed to a particular network
transactions need to have a replay protection.
broadcasted messages have to be bound to a particular network both for additional safety and efficiency for efficiency to avoid downloading total chain of objects to realize that it is committed to a different network (when we will verify initial post)

post needs to be committed to a particular network

what about requirement to bind node id to genesis id? there is no conclusion in the research topic about it. post is likely already bound to a particular node id, see https://github.com/spacemeshos/go-spacemesh/issues/3327#issuecomment-1207783388 . so we are missing commitment of the post to genesisid.

do you want to merge both together? can we do it in a way that doesn't require computing node id from public key by hashing it together with genesisid when every message is received?

based on my reading of the code we don't need to bind node id to genesis id, we need to:

add genesis id when id is passed to the initializer https://github.com/spacemeshos/go-spacemesh/blob/develop/activation/post.go#L263
add genesis id for verifier https://github.com/spacemeshos/go-spacemesh/blob/develop/activation/validation.go#L80

this will ensure that post is generated by the node id for a particular genesis hash

broadcasted messages have to be bound a particular network both for additional safety and efficiency

spec implies that hashes are somehow needed for this, but they aren't. they are simply redundant (e.g don't do anything). maybe there was an assumption that we are signing hashes, and then we don't need to change code for signers.

what sounds reasonable for me is either:

to have a generic type. This type will be signed, but broadcasted only a part from a Message. This is also can be done in a memory efficient way (e.g don't add genesisid in the signer itself).

type BoundMessage[M Message] struct {
     Genesis [32]byte
     Message M
}

or

prehash every object before signing (this idea came up in the research forum as well). in this case hashing is not redundant

lrettig commented 1 year ago

@dshulyak you refer several times to the cost of hashing, e.g.,

can we do it in a way that doesn't require computing node id from public key by hashing it together with genesisid when every message is received?

but as discussed in the research forum thread, hashing can be made effectively zero cost by including the genesisid in the hash preimage. There are a few approaches here; one of them may require running a build script to compile in the change in a network-specific (genesisid-specific) fashion.

lrettig commented 1 year ago

I updated this epic and opened some related issues based on our conversation with @tal-m this week. Here are some of the changes I made:

no need to change a hash function (we're including genesisID in the signer)
no need to change the VRF signer (this will be handled as part of the POPS-VRF task)
updated and clarified https://github.com/spacemeshos/go-spacemesh/issues/3348 (in scope for this task)
opened #168 (not in scope for this task)
opened and linked issues to add genesis ID support to go-spacecraft and smapp
no need to add genesisID to the database (we'll use an immutable genesis config instead)
combine golden ATX ID and genesisID (for now)
we no longer rely on the fact that data structures (such as ATX, proposal) are "downstream" of valid smesherIDs/ATXs that are committed to genesisID; rather than performing expensive, recursive checks on these, we ensure that they're signed using genesisID
no need to change ATX generation to explicitly include genesisID; this should not be necessary after https://github.com/spacemeshos/go-spacemesh/issues/3327 (out of scope for this task)

@dshulyak @noamnelke @selfdual-brain please let me know if I've missed or misremembered anything. Back to you @WilfredTA

dshulyak commented 1 year ago

The default verify() method should use the modified ED signer (which includes GENESISID, as described above) (note: in future we may want to expose an unmodified signer to template code, but this isn't necessary for genesis)

this is totally different from what we agreed on the call. we are signing the hash of the transaction. signer should prefix the input with genesisid when preparing this hash.

lrettig commented 1 year ago

The spec says:

The default verify() method should use the modified ED signer

ED signer should also be bound to GENESISID. Modify signing/signer.go to concatenate GENESISID onto m in Sign and Verify.

You proposed:

signer should prefix the input with genesisid when preparing this hash

What's the difference?

dshulyak commented 1 year ago

I thought that we agreed to prefix encoded tx that is passed to the hash function (https://github.com/spacemeshos/go-spacemesh/blob/develop/genvm/sdk/wallet/tx.go#L48-L49)

it will be

hash.Write(genesisid)
hash.Write(tx) // where tx is a concatenation of fields

and edsigner still receives just a hash

WilfredTA commented 1 year ago

What is the difference in practice between signing the concatenation of Sign(GenesisID + Hash(tx)) versus Sign(Hash(GenesisID + tx))?

It depends maybe on if there is a case where we want to recover GenesisID from the signature

lrettig commented 1 year ago

I don't have a strong preference on this, it feels like a question of semantics to me (does default verify() implementation use the modified ED signer or the unmodified signer?) that templates can override in future anyway. Let me know if I'm misunderstanding something.

Will give it some more thought.

dshulyak commented 1 year ago

in this case this is simpler. we are already hashing tx. adding genesisid into the hash won't require additional buffer or ed25519 library that supports streaming input for it to be efficient. but from my pow this is better simply because it is simpler.

another thing is that we probably should be using ed25519ph (that uses HashEddsa as in https://www.rfc-editor.org/rfc/rfc8032). It skips internal round of hashing, and requires input to be already prehashed. and this case prehashed tx would be required.

pigmej commented 1 year ago

Was implemented in the issues

spacemeshos / pm

Epic: Implement `GENESISID` #135