Closed SamuelMarks closed 4 years ago
I agree with all the points above. As the project becomes more complex with overlapping parts/ideas there is definitely a need for separation of concerns.
This will allow for:
Great idea.
Potential problems are:
I think it is not time to split go-lachesis repository in 12 yet. It might be sense if we have well working parts (consensus, network, etc) with no volatile interfaces. At the moment we don't have any of this: poset
and posposet
are in development stage now, poset
and node
is not splitted well, and so on. Partitioning by go-packages is enough for independently working at this stage. Single github repository is necessary for fast change review and discussion.
All we need is a review before merging (@SamuelMarks ;)) and cover note for changes (@dev10 ;)). And some knowledge about other teams tasks through coordinator. Separated repositories don't make it easier.
The first question is what this repository is for?
If it is for lachesis consensus algorithm as stated in the name, then swirlds, paxos, raft consensus implementations in the scheme above should better be in separate repositories.
If it is for, let call it 'Fantom consensus algorithm', which implements all of mentioned above consensus algorithms it is better to rename the repo and reorganise internal structure of it reserving place for all these algorithms.
But the root question is that we are in fact having two implementations of lachesis algorithm which are conflicting for going into master of this repo. These implementations would share some basic functionality (e.g. transport protocol, serialisation), though would need some specific functionality (e.g. storage engines) for better performance.
I think it would be better to retain this repo for lachesis consensus algorithm only, so we place other consensus algorithms in separate repos, but the implementations of these algos should use basic functionalities of this repo as much as possible. Then we need to try to harbour two lachesis implementations in this repo to make them interchangeable until we see which is better.
@Maxime2, great! I am about the same. We have a few ways to implement Lachesis consensus:
src/poset
- is a Swirlds transformation;src/posposet
- is coming in PR #161 (approve it, please);And it is nice for project, we will see which implementation is better later.
Thanks for your input. Will keep this proposal open for a little while, to give time for the team to give their viewpoints.
BTW: Here is what I was thinking in terms of repository dependency hierarchy:
For the -lib
repositories, it may be useful to do version negotiation at the config level here, so we can have, e.g.: two versions of lachesis running side by side so we can:
consensus-lib
would implement generic consensus tests)The shared common repository [repositories?] will do regular things like: logging, error handling, configuration handling and related, using proper libraries for this, but setting constants like log format, ensuring consistency across all our code bases.
As mentioned above, at this stage of development, divided into repositories can be premature and costly. Looks like dividing into go-packages in one repository will be enough for now. But maybe I just haven't information about implementation of the Raft, Paxos and Swirlds, who will work on it? Anyway, for the beginning you need to try identify common interfaces by which will be possible to divide the repository in the future. Also, as I see, there is no block that will start the service from specified in the configuration consensus and transport.
@Maxime2 is the following the most recent implementation of Jiho and Sang-min's work; https://github.com/Fantom-foundation/go-lachesis/tree/maxime-21-feb-lachesis-clotho
There are three primary objectives currently which are causing conflicts.
@devintegral can you do a diff on @Maxime2 consensus implementation to see if it can be merged into this? The PR itself looks good, we just need to merge 1 and 2 and then relook at 3.
@Maxime2 & @devintegral is it possible to merge these two? That would be first prize.
@andrecronje, it is impossible to merge 1 and 2, because they have different approaches: predetermined vs non predetermined peers list, on-going vs on-demand events processing, etc. We should drop one approach to merge code into single package. I suggest to have two consensus packages: 'src/poset' and 'src/posposet' at master, and choose one later, when we've got full knowlidge about their strengths.
I see it is possible to merge 1 and 3 into 'src/poset', because 3 is based on 1.
@andrecronje Here is how I see the state of things at the moment:
current master branch is a long running attempt to morph Babble's implementation of Swirlds into Lachesis. It has problem with consensus beyond 5 nodes. Due to that "morphing" it doesn't look like a well organised and clean code base;
so when @devintegral was tasked with PoS, they just re-implemented poset in more clean and organised way to the lachesis paper (though making changes into leaf events which I see deleterious), but put their implementation into separate posposet, so now additional work is required to test its ability to reach consensus on 15+ nodes.
Jiho and Sang-min's implementation has a different (pipelined) architecture and approach to a lachesis implementation, so I need to make a lot of reorganisation and cuts to redundant code, but having to share it with posposet implementation makes it even complex task.
The latest changes as of today are in this branch: https://github.com/Maxime2/lachesis/tree/20-mar-lachesis-clotho-atropos . It is a work in progress.
Having played with it I would say it works much faster than previous poset implementation, is reaching consensus to atropos on 15, 25, 35 and 65 nodes (juts have not checked on more nodes, but think it will work).
However we found several issues with lachesis using this implementation which need to be resolved before we could make final consensus to sort all events uniformly on all nodes:
I perceive any modifications to lachesis either it is dynamic participation either PoS premature at this stage as lachesis itself is not finalised and its architecture may be substantially changed in future development. In addition, I believe, these modification would need further research into correctness of these modification (e.g. these modifications do not break consensus and all nodes have the very same order of events in consensus).
BTW: May I ask for PoS task specification so I have it in mind when developing Jiho and Sang-min's prototype?
@Maxime2, you have not to share your efforts on Jiho and Sang-min's implementation with posposet implementation. Posposet is pretty independent package, and I'm in charge of its tests. All I want is announce my reseach direction and share results into master branch as soon as possible to prevent a double efforts.
So, what is your objection against both poset and posposet in master at the moment?
@Maxime2 please review the PR 161 and check in it.
For Sam's suggestion, I think it's not suitable at this stage. This repos is only for lachesis consensus and its variants.
Thanks for all your input.
@quan8 you say this repo is just for lachesis, but if you look through the codebase and its PRs, there is clearly more going on. Abstracting out the consensus layer will enable the other layers to evolve independently, as long as they maintain the same interface.
Additionally it will allow benchmarking and debugging with different versions, expansion of use-cases—implementing Raft and/or Paxos will enable private, trusted blockchain scenarios—and altogether better modularity.
There will be a major rearchitecture at some point, the only questions on the table are: how & when.
In terms of the mutually exclusive work by @Maxime2 and @devintegral, I would like to you both to suggest ways of extracting out this capability to 1 or 2 other repositories, and hiding it behind a conditional (so that we can switch between these two implementations). In the process, we will naturally need to make many of the dependent components more abstract. So would be good to get your insights, rather than just me going through with the risk of making many components over/under abstract.
Thanks
@devintegral , please announce your research direction, what is PoS task requirements and specification. If posposet is pretty independent package why it replicates all functionality of poset, - this doesn't not look like a good architectural decision if you want to avoid efforts duplication (putting aside it is an improved, in therms of code quality, version of lachesis known to have problem with consensus at scale and some correctness flaws).
So the main objections to posposet: it is premature optimisation and modification to the lachesis algorithm which came at time of other modifications into the same code aiming to solve found problem with lachesis algorithm; it seems was not tested if it can reach consensus on 15+ nodes; it seems slower than new prototype.
@SamuelMarks , before suggesting any changes in attempt to join (or split up) it is better to know what PoS task is about, its specification and requirements, which I am not aware yet. As well the information about other possible uses of lachesis would be appreciated and beneficial for lachesis development.
@Maxime2, you will get PoS task specification in section 7.1 of https://swirlds.com/downloads/SWIRLDS-TR-2016-01.pdf. My research direction is described at #161 header: "make possible to build decentralized trustless network".
To make a new src/posposet
package is the only just architectural decision. Modification of src/poset
seems not good because:
src/node
has overkill dependence on src/poset
. Hence, it should be refactored with posed together;
So, as previously mentioned, src/posposet
is not premature optimisation, it is research.
Agreed, src/posposet
should be tested better. And it will be done. But it does not make sence until your team aproved #161 research direction.
Posposet will make opportunity to compare it with your unpublished solution and combine the both into the best result.
@devintegral section 7.1 of https://swirlds.com/downloads/SWIRLDS-TR-2016-01.pdf is a generalisation of Swirlds. May I see your research document reflecting it into Lachesis, in particular, why PoS approach was reflected for voting but not for time calculation; and have you proved you have the same value of MinMax here https://github.com/Fantom-foundation/go-lachesis/blob/7ce8564514e144c4aa6bf3d634e383c96d55e0d6/src/posposet/poset.go#L303 on every node independently on selection of nodes forming the stake ?
You perhaps not get PoS task correctly, because it has nothing against fixed peer list, it assumes that each peer is not equal, and nothing else. So it could be predefined fixed peer list with assigned weight to each peer. Moreover cited paper says the PoS system is the same as original if you set weight 1 to each peer.
It is clear the re-run in case of stake change gives the biggest impact on performance when PoS is implemented. Have you done any research on possibility to avoid it?
Obviously the PoS specification doesn't require poset reengineering. I would prefer to change peers for that task rather than poset.
@Maxime2, move this question to #161, please. This issue is not about src/posposet
, it is about repository organization.
@devintegral moved.
outdated
It might be about time to split this repository in 12. One for each consensus, CLI client, common utilities, interfaces and protobuf files (including a common error proto).
Here is what I'm envisioning:
Proposed architecture:
Already done major work in this direction with many
crate
s inside a singleworkspace
in the Rust code base.Advantages:
interface
)Disadvantages:
Would welcome your thoughts.