p4lang / p4runtime

Specification documents for the P4Runtime control-plane API
Apache License 2.0

logic behind large election ID #393

Closed adibrastegarnia closed 1 year ago

adibrastegarnia commented 2 years ago

It looks like the election ID is 128 bits, which seems huge for this purpose. I don't think the election ID changes frequently enough to require such a large range. Even if we want to be conservative, uint64 is more than enough. What is the logic behind it?

antoninbas commented 2 years ago

@adibrastegarnia That's a good question. It's been a while since we made that decision, and I think it came from Google, so I imagine that they have a leader election mechanism internally that may produce 128-bit ids. I am not an expert on consensus protocols, but I imagine that using 128 bits lets us have a 64-bit term id (or round id) plus a 64-bit node id. Note that if your consensus protocol only requires 64 bits of space, you should feel free to always set the high 64 bits to 0. A P4Runtime server implementation should not care about how you generate these ids. It only cares about their numerical value and their ordering. So if you only need 64 bits of space in total, you can organize the ids however you want:

<32bits of 0s><32bits of term id><32bits of 0s><32bits of node id>

<64bits of 0s><32bits of term id><32bits of node id>

<64bits of 0s><64bits of id>

<64bits of id><64bits of 0s>
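To make the ordering point concrete, here is a minimal Python sketch of an election id held as two uint64 halves. The `ElectionId` class and its helper names (`from_term_and_node`, `from_uint64`, `as_int`) are hypothetical and only illustrate the idea; they are not part of any P4Runtime library. The assumption is that ids are compared by their 128-bit numerical value, which is the same as comparing the (high, low) pairs lexicographically.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1  # largest value that fits in a uint64


@dataclass(frozen=True, order=True)
class ElectionId:
    """Hypothetical 128-bit election id stored as two uint64 halves.

    order=True compares (high, low) field by field, which matches the
    numerical ordering of the combined 128-bit value.
    """
    high: int
    low: int

    def __post_init__(self):
        for half in (self.high, self.low):
            if not 0 <= half <= MASK64:
                raise ValueError("each half must fit in 64 bits")

    @classmethod
    def from_term_and_node(cls, term_id, node_id):
        # One possible layout: 64-bit term id on top, 64-bit node id below.
        return cls(high=term_id, low=node_id)

    @classmethod
    def from_uint64(cls, value):
        # Clients that only need 64 bits can leave the high half at 0.
        return cls(high=0, low=value)

    def as_int(self):
        # Combined 128-bit numerical value.
        return (self.high << 64) | self.low
```

With this layout, a higher term always wins regardless of the node id: `ElectionId.from_term_and_node(2, 1) > ElectionId.from_term_and_node(1, 99)` evaluates to `True`, and an id built with `from_uint64` compares exactly like the plain 64-bit integer it wraps.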

The added complexity on the client and server sides that comes from using two uint64 values is quite small, IMO. It's not as if the election id value needs to traverse the whole stack. It should be handled locally, on the southbound side of the client and the northbound side of the server. In this server implementation, for example, the election id is only used in that one file, to enforce arbitration.
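As a rough illustration of how contained that logic can be, the following Python sketch keeps all election id handling inside one small class. The `Arbiter` class and its method names are hypothetical and are not taken from the server implementation mentioned above. It assumes that the connected client with the numerically highest id is treated as primary, and it ignores details such as roles, multiple devices, and duplicate ids.

```python
class Arbiter:
    """Hypothetical helper that tracks clients and picks the primary one."""

    def __init__(self):
        # Maps a client name to its election id as a (high, low) tuple.
        self._clients = {}

    def connect(self, name, high, low):
        # Tuples compare lexicographically, so (high, low) orders the same
        # way as the combined 128-bit value.
        self._clients[name] = (high, low)

    def disconnect(self, name):
        self._clients.pop(name, None)

    def primary(self):
        # The client with the highest election id, or None if no client
        # is connected. Ties are not handled in this sketch.
        if not self._clients:
            return None
        return max(self._clients, key=self._clients.get)
```

Nothing outside this class needs to know that the id is made of two 64-bit values; callers only ask which client is currently primary.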

kuujo commented 2 years ago

Yeah, this makes sense to me. I suspect something like Google's Chubby lock service could generate very large epochs like this, since it's so widely used and may not be isolated to a single controller.