vmware-archive / haret

A strongly consistent distributed coordination system, built using proven protocols & implemented in Rust.

Breaking up haret into a few libraries? #115

Open erickt opened 7 years ago

erickt commented 7 years ago

In the why.md document, there is a discussion about how haret was designed to isolate off the protocol from the client-facing and data-storage parts of the system. Would there be any interest in formalizing this into multiple libraries? I'm personally interested in exploring a Zookeeper client wire-compatible frontend a la zetcd, so having a looser coupling between the subsystems would make this a bit easier to do.

andrewjstone commented 7 years ago

The intention of that section was to allow plugging in different protocol implementations, not to allow different APIs. We always intended the API to be a disjoint set of what Zookeeper provides, with some new primitives baked in and other things intentionally left out. E.g., instead of allowing ephemeral, auto-incrementing nodes, which are often used for leader election, we'd provide a leader election primitive instead. The goal was to provide a ready-made, opinionated system to allow users to safely coordinate their systems, not to build a toolkit, as the toolkit approach makes it harder to use and debug the different combinations of setups in production.

In practice, isolating even the protocol hasn't been perfect. While most of the VR-specific code lives in /src/vr, there are artifacts of the fact that Haret uses VR littered throughout the codebase. For instance, the namespace manager knows about VrCtxs and the 3 different start modes (startup, recovery, reconfiguration) of replicas. Implementing a consensus system like VR on top of a lightweight process architecture requires a management layer like the namespace manager (using gossip in this case) in order to start replicas on different nodes and learn of new consensus groups after a partition, so it was easy to fall into this trap. Ideally this management layer would be agnostic of the consensus protocol as well, but I haven't spent the time to go back and fix it. It isn't high on the priority list right now, although it may help provide cleaner, more structured code.
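To illustrate what a protocol-agnostic management layer could look like, here is a minimal sketch. It is not Haret's actual code: the `ConsensusProtocol` trait, the generic `NamespaceManager`, and the `Vr`/`VrReplica` stand-ins are all hypothetical names chosen for this example. Only the three start modes come from the description above.

```rust
use std::collections::HashMap;

/// The three replica start modes described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StartMode {
    Startup,
    Recovery,
    Reconfiguration,
}

/// Hypothetical trait: everything the management layer needs from a
/// consensus protocol, without naming VR anywhere.
trait ConsensusProtocol {
    /// Opaque per-replica state; the manager stores it but never inspects it.
    type Replica;

    fn name(&self) -> &'static str;
    fn start(&self, replica_id: &str, mode: StartMode) -> Self::Replica;
}

/// Hypothetical manager that is generic over the protocol it supervises.
struct NamespaceManager<P: ConsensusProtocol> {
    protocol: P,
    replicas: HashMap<String, P::Replica>,
}

impl<P: ConsensusProtocol> NamespaceManager<P> {
    fn new(protocol: P) -> Self {
        NamespaceManager { protocol, replicas: HashMap::new() }
    }

    /// Start a replica in the given mode and remember it by id.
    fn start_replica(&mut self, replica_id: &str, mode: StartMode) {
        let replica = self.protocol.start(replica_id, mode);
        self.replicas.insert(replica_id.to_string(), replica);
    }

    fn replica_count(&self) -> usize {
        self.replicas.len()
    }
}

/// Stand-in for a VR implementation (assumption: the real one carries far
/// more state than this).
struct Vr;

#[derive(Debug, PartialEq)]
struct VrReplica {
    id: String,
    mode: StartMode,
}

impl ConsensusProtocol for Vr {
    type Replica = VrReplica;

    fn name(&self) -> &'static str {
        "vr"
    }

    fn start(&self, replica_id: &str, mode: StartMode) -> VrReplica {
        VrReplica { id: replica_id.to_string(), mode }
    }
}
```

With this shape, the manager only sees `P::Replica` as an opaque value, so swapping in another protocol would mean adding another `impl ConsensusProtocol` rather than touching the manager.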

I'm actually in the middle of a major refactoring of the FSMs that I hope to open a PR for in the next week or two. It's possible that all this code could live in its own VRR library, but I'm not sure how useful it would be outside of Haret. I'm also hesitant to independently version it at this early stage, when I'm the primary developer, as it just adds another layer of management for me. The likelihood of adding another consensus protocol at this juncture is very remote, so taking the time right now to do this is not a priority.

As far as decomposition of the system goes, a bunch of things are already in their own libraries. Haret relies on rabble for the cluster system and lightweight processes, and on vertree for the trie-based backend.

It is possible to also abstract out the front end API, but it is harder, as the API is heavily tied to the capabilities of the backend. It is also useless in and of itself.

It appears that zetcd is an independent proxy process that sits in front of etcd. That doesn't require splitting up the code at all, but it does require features, such as subscriptions, that aren't yet built into Haret. It will also require either emulating other non-native features, such as ephemeral nodes, or not implementing them at all. That all seems doable, but again isn't really a priority for me right now. My chief goal is building a correct and stable system. After some stability it will be much more actionable to talk about extension and different front end APIs.

In summary, I'm not fully opposed to this idea, but feel it is a bit of a distraction at this early time. If however you see specific parts of the code that you feel are not properly abstracted and should be split out into their own libraries, I am definitely willing to consider that.

erickt commented 7 years ago

Hi @andrewjstone! You are welcome of course to want to move at your pace and turn all this down :) As I was starting to go through the code, it seemed like there was a natural decoupling between the interior communication between the nodes and the client/server communication. At least for me, it seemed like it'd be a little easier to contribute to those portions of haret without needing to have a lot of understanding of how VR works. As best as I can tell, it doesn't seem too hard to pull the client/server out of the core library, and it has the nice benefit of reducing dependencies and revealing what needs to be public and private.

Regarding the Zookeeper-compatible client interface, that's more of a toy experiment to compare/contrast some workloads. I thought it might be a nice way to get some people from that community to pay some attention to the project. I don't think you should feel compelled to add any features to support it.

andrewjstone commented 7 years ago

Hi @erickt,

I can't tell you how much I appreciate you taking an interest in Haret. After looking at your changes to the cli-client lately and thinking more about this, I am less concerned about pulling things apart. I was never really that concerned about separating the code, but more about having to support multiple APIs. However, there is no reason I have to support multiple APIs :) Community projects are completely fine and reasonable.

Additionally, you are correct that the client/server API part is well isolated from the internal communication, so separation shouldn't be that hard. As you state, it is also certainly an easier way to start contributing. Furthermore, it could be very useful to have an HTTP interface using JSON in addition to protobuf. Implementing that for the admin client in particular would be most useful.

With all that said, have at it! I will be happy to review any changes you are interested in making, and from what I've seen so far will likely merge them in quickly. If you want to discuss complex things before implementation we can do that also.

Cheers!

jrgarcia commented 7 years ago

@erickt @andrewjstone Should this be closed now that things have been separated accordingly?

andrewjstone commented 7 years ago

Let's leave it open for now. I still want to split out the VR code and possibly other things into their own crates. I'm working on the successor to rabble which uses boxed Any. This will make separation of the internal code possible as right now everything is tightly coupled to the parameterized Msg type. I tried separating out the VR code a month or so back and ran into some serious hiccups. Another month or so and I hope to have everything ported over to the new system, and nicely separated.
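To make the coupling point concrete, here is a minimal sketch of why a type-erased payload helps. It is not the API of rabble or its successor: `Envelope`, `Prepare`, `AdminPing`, and `handle_vr` are hypothetical names invented for this example. The idea is that a transport carrying `Box<dyn Any + Send>` does not need to be generic over one crate-wide `Msg` type, so each subsystem can define its own message types in its own crate and downcast on receipt.

```rust
use std::any::Any;

/// Hypothetical envelope: the payload is type-erased, so the transport
/// layer needs no type parameter for the message type.
struct Envelope {
    to: String,
    payload: Box<dyn Any + Send>,
}

/// A message type that could live in a separate VR crate
/// (fields are illustrative only).
#[derive(Debug, PartialEq)]
struct Prepare {
    view: u64,
    op: u64,
}

/// A message type owned by some other subsystem.
#[derive(Debug, PartialEq)]
struct AdminPing;

/// The VR side tries to downcast to the types it understands. If the
/// payload is something else, the envelope is handed back untouched so
/// another handler can try it.
fn handle_vr(env: Envelope) -> Result<Prepare, Envelope> {
    let Envelope { to, payload } = env;
    match payload.downcast::<Prepare>() {
        Ok(prepare) => Ok(*prepare),
        Err(payload) => Err(Envelope { to, payload }),
    }
}
```

The trade-off is that type mismatches surface at runtime through the failed downcast instead of at compile time, which is the price paid for removing the shared type parameter.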


From: J.R. Garcia <notifications@github.com>
Sent: Tuesday, August 29, 2017 6:39:12 PM
To: vmware/haret
Cc: Andrew Stone; Mention
Subject: Re: [vmware/haret] Breaking up haret into a few libraries? (#115)


jrgarcia commented 7 years ago

Sounds good. I was just looking through here to pick something up and came across this.