Netflix / dynomite

A generic dynamo implementation for different k-v storage engines
Apache License 2.0
4.19k stars 531 forks

Plugin Architecture for New Protocols #85

Closed: blakesmith closed this issue 7 years ago

blakesmith commented 9 years ago

Hey there!

I've been interested in Dynomite and would like to work on supporting a new protocol with it. I've been reading through the Dynomite source code, and the code base seems to be coupled strongly to redis and memcache (code paths such as: https://github.com/Netflix/dynomite/blob/master/src/dyn_dnode_msg.c#L446 and others).

Given dynomite's lineage with memcache, with redis added later, I would like to send patches that explicitly decouple the codebase from redis / memcache, roughly following this development path:

  1. Begin by decoupling the config from the redis boolean, and support proto as an arbitrary string in the config file. We can preserve backwards compatibility with the redis boolean if desired, but it would be nice to eventually deprecate it if that's acceptable to everyone.
  2. Decouple internal msg branches that explicitly check for the redis boolean and other places that explicitly branch on memcache / redis.
  3. Put in place an interface for a more flexible "Protocol Interface" and move redis and memcache to this interface.
  4. Move to allow dynamic linking to "Protocol Interfaces" that can be built outside of dynomite and decoupled from its build.
  5. Investigate seed providers for bootstrapping on these protocol interfaces? This is an area I'm fuzzy on, but it could be useful to have built-in seed providers for Protocol Interfaces that ship with Dynomite.
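To make step 3 concrete, the "Protocol Interface" could be a struct of function pointers that each protocol implements, with the core dispatching through it instead of branching on a redis boolean. A minimal sketch in C; all names here (`protocol_ops`, `protocol_lookup`, the toy parser) are hypothetical illustrations, not actual Dynomite APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parse result codes. */
typedef enum { PARSE_OK, PARSE_AGAIN, PARSE_ERROR } parse_status;

/* Hypothetical protocol vtable: each protocol (redis, memcache, ...)
 * supplies its own implementation of these hooks. */
typedef struct protocol_ops {
    const char *name;  /* matches the "proto" string in the config */
    parse_status (*parse_req)(const char *buf, size_t len);
    parse_status (*parse_rsp)(const char *buf, size_t len);
} protocol_ops;

/* Toy redis request parser: accepts only buffers starting with '*'
 * (the RESP multi-bulk marker). Illustration only, not a real parser. */
static parse_status redis_parse_req(const char *buf, size_t len) {
    if (len == 0) return PARSE_AGAIN;
    return buf[0] == '*' ? PARSE_OK : PARSE_ERROR;
}

static parse_status redis_parse_rsp(const char *buf, size_t len) {
    (void)buf; (void)len;
    return PARSE_OK;
}

static const protocol_ops redis_ops = {
    .name = "redis",
    .parse_req = redis_parse_req,
    .parse_rsp = redis_parse_rsp,
};

/* Core code looks the protocol up by the config string instead of
 * branching on a redis boolean scattered through the codebase. */
static const protocol_ops *registry[] = { &redis_ops, NULL };

const protocol_ops *protocol_lookup(const char *proto) {
    for (int i = 0; registry[i] != NULL; i++) {
        if (strcmp(registry[i]->name, proto) == 0) return registry[i];
    }
    return NULL;
}
```

With this shape, step 4 (dynamic linking) reduces to loading a shared object that exports a `protocol_ops` instance and registering it at startup.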

Is there anything else I'm missing? Does this sound like a worthwhile effort? Having Dynomite be easily pluggable to other protocols would be the end goal. Let me know what you think!

Blake

timiblossom commented 9 years ago

Hi Blake,

It is good to hear that you want to send out a patch. Yanis has already started working on MongoDB; you can watch his work at https://github.com/ipapapa/dynomite. This should answer your points #1, #2, and #3.

For #4, I think it will probably be needed if some protocol requires heavy dependencies on other libraries. Currently it is not an issue, but it probably will be in the future.

For #5, a seed provider is just a place to return a list of seed nodes (or all of the nodes, if we don't need gossip here). Our current seed provider, dyn_seeds_provider, is basically a mechanism to call a local REST interface. You can base on or extend it to pull seeds from other sources such as MySQL, an S3 file, or data on a shared storage.
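In that spirit, a custom seed provider mostly has to turn some external payload (a local REST response, a MySQL row, an S3 file) into a list of seed node strings. A toy sketch, assuming a '|'-separated payload; the real dyn_seeds_provider format and function signatures may differ:

```c
#include <assert.h>
#include <string.h>

#define MAX_SEEDS 16

/* Hypothetical helper: split a '|'-separated payload (standing in for
 * whatever the REST call returned) into individual seed strings such
 * as "host:port". Returns the number of seeds parsed. */
int parse_seeds(const char *payload, char seeds[][64], int max) {
    int n = 0;
    const char *p = payload;
    while (*p && n < max) {
        const char *sep = strchr(p, '|');
        size_t len = sep ? (size_t)(sep - p) : strlen(p);
        if (len > 0 && len < 64) {
            memcpy(seeds[n], p, len);
            seeds[n][len] = '\0';
            n++;
        }
        p = sep ? sep + 1 : p + len;
    }
    return n;
}
```

A provider backed by MySQL or S3 would only change where the payload comes from; the parsing and the list handed back to the core stay the same.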

blakesmith commented 9 years ago

Thanks @timiblossom. The main thing I would like to push for is for the core of dynomite to be decoupled from the protocols as much as possible. It looks like the work being done at https://github.com/ipapapa/dynomite is more to add Mongo to the existing structure, instead of pushing for the dynomite core being protocol agnostic via a plugin architecture. Would you accept patches that push more towards this direction?

For #5: Sorry, I misunderstood the purpose of the seed providers. I thought a seed provider was to assist with data bootstrapping when the cluster grows / shrinks. How do you guys handle cluster growth in production?

timiblossom commented 9 years ago

"How do you guys handle cluster growth in production?"

- We do rolling node upgrades to replace nodes with better hardware.
- We can add one new node at a time, but this process still needs more work to make sure new nodes have warm data for the ranges they own, and that existing nodes let go of the data they don't own.

The other option is to build a bigger cluster and let applications do dual writes to both Dynomite clusters. After a few days, we can shut down the smaller Dynomite cluster and have the applications fully use the new one.
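The dual-write migration described above can be sketched as a thin client-side wrapper: during the transition window the application writes to both clusters but reads from only one side, then flips the read path once the new cluster is warm. A toy illustration with in-memory stubs standing in for the two Dynomite clusters (real code would issue redis commands over the network):

```c
#include <assert.h>
#include <string.h>

/* Toy single-slot "cluster" standing in for a Dynomite cluster. */
typedef struct { char val[64]; } cluster;

static void cluster_set(cluster *c, const char *v) {
    strncpy(c->val, v, sizeof c->val - 1);
    c->val[sizeof c->val - 1] = '\0';
}

static const char *cluster_get(const cluster *c) { return c->val; }

/* Migration wrapper: writes go to both clusters, reads come from
 * whichever side is currently authoritative. */
typedef struct {
    cluster *old_cluster;
    cluster *new_cluster;
    int read_from_new;  /* flip once the new cluster is warm */
} migrating_store;

void store_set(migrating_store *s, const char *v) {
    cluster_set(s->old_cluster, v);  /* dual write */
    cluster_set(s->new_cluster, v);
}

const char *store_get(migrating_store *s) {
    return s->read_from_new ? cluster_get(s->new_cluster)
                            : cluster_get(s->old_cluster);
}
```

The same shape works at the application layer for any store: only writes issued during the window are duplicated, which is why the smaller cluster can be retired after a few days of warming.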

blakesmith commented 9 years ago

Gotcha, thanks @timiblossom. It sounds like your clusters are mostly used for caching. The use case I'm interested in would require a more robust data bootstrap process, since the data would be the primary source of truth. Given the current system, it sounds like this might not be a good fit. With Redis, there are some out-of-band methods you could use to do reliable data bootstrap, but I'm not sure that complexity would be a good fit for dynomite. What do you think?

timiblossom commented 9 years ago

Sorry to have missed this. I think you can use it as the primary source of truth if you have enough redundant racks. You will also need to wait for our read/write consistency features and probably the data reconciliation tools.

hamiltop commented 9 years ago

I'm also interested in using it as a primary store for data. It seems the current implementation doesn't really heal after a network partition: writes that occur during a partition are never replicated once the partition is healed.

Is there a roadmap somewhere?

timiblossom commented 9 years ago

@hamiltop we are still experimenting with automatic reconnection optimizations for different network edge cases. To manually force reconnecting the connections from one node to another, there is an admin command:

curl 'http://127.0.0.1:22222/peer/reset/peer_ip' (replace peer_ip with an actual IP).

Or to forcefully reconnect all outbound connections in a node to all other nodes:

curl 'http://127.0.0.1:22222/peer/reset/all'

This helps heal or re-establish outbound connections immediately, without any waiting time.

hamiltop commented 9 years ago

Will it backfill missed writes?

timiblossom commented 9 years ago

That is on our roadmap: a repair/backfill process.

ipapapa commented 9 years ago

@blakesmith We have made some changes to add different types of data stores, along the lines of the Mongo protocol addition. We would love to see your changes that decouple data stores from dyn_core. Data bootstrap was added several months ago.

ipapapa commented 7 years ago

There has not been much activity on this issue, so I am going to close it. If you have any more questions or want updates on these features from Dynomite, we will soon post a mailing list; or feel free to reopen the issue.