Closed dpc closed 1 year ago
@elsirion
I haven't really thought through all the details, and I hope exactly that looking at the code will make the vision and discussion more precise.
I agree that in many cases importing multiple versions of module codebase is way better idea than cramming it together glued with bunch of if statements. But then - won't majority of cases be trivial? E.g. no changes?
One thing that is not clarified (in my head and in the code) is if module will have its own counter for consensus versions and then there be a global one for core Fedimint protocol. I guess it has to be both, where core protocol consensus works like like a major version for each protocol instance.
In that case it should be
fn version(core: CoreConsensusVersions) -> [ModuleConsensusVersion]
or
fn version(core: CoreConsensusVersions) -> ModuleConsensusVersion
if we want to enforce that any-time module does any change in its own consensus, it needs to make a new codebase.
?
I don't think the module consensus version has to be a function of the FM consensus version. Either the module needs to be adapted to work with a newer FM version (likely) then it can change its own version or not, then there is no sense in changing the module version.
EDIT: but to your general point: yes, Fedimint needs its own set of consensus, API, DB versions separate from any modules.
Either the module needs to be adapted to work with a newer FM version (likely) then it can change its own version or not, then there is no sense in changing the module version.
That means it is a function of fm consensus version, no?
We at least should aim at not letting users make accidental mistakes.
So
fn version(core: CoreConsensusVersions) -> [ModuleConsensusVersion]
If FM changes consensus from 2 to 3, and module doesn't care then:
module_init.versions(2)== module_init.module_version(3).first()
If it does, then it will have it's own version change and
module_init.versions(2).last() + 1 == module_init.module_version(3).first()
By having this function we can check for the user that all the modules are compatible with FM consensus changes. Initially just bunch of checks and crashing with an error message explaining it, but in the future we might even support automated scheduled consensus changes where guardians use UI to agree on the consensus change that gets scheduled and "switched into" automatically at the designed epoch.
Maybe I'm wrong here, but in my mind a new consensus version of Fedimint is a new major code version, so the old module wouldn't compile with the new Fedimint anyway. So the
module_init.versions(2).last() + 1 == module_init.module_version(3).first()
case doesn't really exist.
I also don't get what first/last of the returned slice would be. What's the semantic of the returned array? Supported versions? But how would a version be even selected when e.g. processing an output? We don't have version fields in all our consensus relevant structs rn, are we planning on changing that?
So far I assumed we just treat a new consensus version of a module as a completely new module with its own instance id and a related kind and let clients figure out migration. I would like to have a version field in the module kind type for that.
But how would a version be even selected when e.g. processing an output?
Versions are on ModuleGen
so what it means is: this module codebase can instantiate module instance for the following module consensus versions. This could be used e.g. in the Admin UI to tell user that their codbase is/isn't compatible with module consensus upgrade guardians are trying to schedule, and prompt them to upgrade their binary.
In the simplest case, a module returns just one version meaning - this code base supports only this module consensus version.
Maybe I'm wrong here, but in my mind a new consensus version of Fedimint is a new major code version, so the old module wouldn't compile with the new Fedimint anyway.
I don't think this is necessary. Initially we probably will tie them together for simplicity, but our module code API surface layer is somewhat narrow, and nicely typed.
We might at some point change some consensus rules which most (all) of the module won't care about, and they won't have to change a single line of code, other than adding the new consensus to their list of supported FM consensus version, purely as a safeguard.
On the other hand, we might want to change some module trait APIs, without actually changing consensus rules.
And sometimes we might do both at the same time?
So far I assumed we just treat a new consensus version of a module as a completely new module with its own instance id and a related kind and let clients figure out migration. I would like to have a version field in the module kind type for that.
I guess there's an overlap here. You might be right. I need to think about about it a bit longer and run thought more scenarios for theoretical modules, and how practical it is. For a module like mint such a static migration seems very nice, but I wonder if it will always be sufficient.
Currently we have a CODE_VERSION
which is stored in the ServerConsensusConfig
so that if they don't match, servers will fail to join consensus. How would these new versions interact with the existing one?
Currently we have a
CODE_VERSION
which is stored in theServerConsensusConfig
so that if they don't match, servers will fail to join consensus. How would these new versions interact with the existing one?
Replace. We should compare consensus versions (global FM one, and each module's instance). If the code says it's compatible, we believe it.
@elsirion
I guess there's an overlap here. You might be right. I need to think about about it a bit longer and run thought more scenarios for theoretical modules, and how practical it is. For a module like mint such a static migration seems very nice, but I wonder if it will always be sufficient.
OK, so I think it's not hard to come up with examples where rolling over to new instance is not really desireable / possible.
Let's say we have a smart contract module, Ethereum style. Let's ignore for a second if that's a bad idea to do it. Since the contracts in the existing instance interact with other contracts, this creates a strong network effect.
Migrating to another instance with new rules in slow motion is close to impossible. The module becomes it's own side-chain, not unlike any alt-coin, and a different instance of it would be just a minority contender. What the module authors want is a scheduled hard-fork to change the rules (e.g fee structure), and not a slow migration.
Similarly if the module is storing a lot of long term backup (like a threshold encrypted-backup), a module consensus hard-fork is preferred over bothering users with migration.
So, yeah - I don't think rolling over is going to work in every case. Thought I do agree that when possible it's very elegant. The module consensus upgrade is orthogonal to it. A mint-v2
KIND
becomes a completely different module kind than mint-v1
(if we do it through just different KIND
string or an extra "module kind version" is implementation detail). It will be up to module developers to decide if rolling over to a new kind&instance is preferable over "module consensus hard-fork", and we will guide the user through this difference in the UIs.
Currently we have a
CODE_VERSION
which is stored in theServerConsensusConfig
so that if they don't match, servers will fail to join consensus. How would these new versions interact with the existing one?Replace. We should compare consensus versions (global FM one, and each module's instance). If the code says it's compatible, we believe it.
So we could run with different binary versions potentially? Seems like that could be error-prone.
If this is replacing CODE_VERSION can we also remove it and put these versions into the consensus config then?
So we could run with different binary versions potentially? Seems like that could be error-prone.
Eventually we have to discard the training wheels and graduate to the big league of distributed software.
Right now the stakes are low, so it is a good opportunity to learn and gain some experience.
Whenever we touch anything that smells like breaking consensus, we just increment the corresponding number and it comes to the same thing.
If this is replacing CODE_VERSION can we also remove it and put these versions into the consensus config then?
I am not doing it in this PR, but yes, that would exactly be the next step. We start comparing consensus versions between peers, and module code compatibility against the federation consensus and we error out if anything is wrong.
Sounds reasonable, will look at it again with fresh eyes tomorrow, not too opposed to land in this form anymore. I still think one code version only supporting one consensus version would be more elegant, but this flexibility is likely enough to be useful in the not too distant future to make sense.
Dev call: add test to CI to assert that version compatibility didn't change if not declared explicitely
I amended with changes corresponding to some conclusions from the discussion here.
Let's try this! Next steps are inclusion in cfg/client side checks?
Let's try this! Next steps are inclusion in cfg/client side checks?
I think so, yes. And we have to remember to try to increment these numbers when we break api/consensus rules. Not a big deal if we forget, but we should try.
Currently rebased on top of #1575 to avoid merge conflict. See only the last commit.
Implement the basic types and module-related primitves to support versioning.
These are currently not used for anything, and here only for moving forward with the discussion from https://github.com/fedimint/fedimint/issues/1548.