You're using a word that has very particular meaning for me :) "capability." I will assume that you do not mean "capability" within the context of Object Capability Model.
So assuming that, I think what you describe here could be created on top of the existing implementation.
Node ids don't have to be unique. That is, Discover is still a key/value DHT. So, for example, one could create a User Management service as a contact:
var contact = {
    id: "VXNlciBNYW5hZ2VtZW50IFNlcnZpY2U=", // "User Management Service" in base64
    data: {
        "Email": "c29tZSBlbWFpbCBub2RlIGlk"
    },
    transport: { ... }
};
You'd look up the "VXNlciBNYW5hZ2VtZW50IFNlcnZpY2U=" node, which would simply be a directory mapping the capabilities you want to the nodes they correspond to.
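A minimal sketch of that lookup, assuming a discover instance and a Node-style callback for discover.find (the callback shape is an assumption on my part):

// Look up the "User Management Service" directory contact, then the node
// id it lists under the "Email" capability.
discover.find("VXNlciBNYW5hZ2VtZW50IFNlcnZpY2U=", function (error, directory) {
    if (error) return console.error(error);
    var emailNodeId = directory.data["Email"]; // "c29tZSBlbWFpbCBub2RlIGlk"
    discover.find(emailNodeId, function (error, emailContact) {
        if (error) return console.error(error);
        // emailContact.transport tells you how to reach the Email service
    });
});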
The warning here is that a DHT is eventually consistent. And without the vector clock implementation (that I mention in the README), old data may overwrite new data. Come to think of it, vector clock implementation could solve this problem.
Do I understand your use case correctly?
Ahhh, I see. That's almost what I was looking for, except that I could have multiple Email services running simultaneously, and would want to get all of them from the User Management service, not just the one that has propagated to me.
The User Management Service contact would be a globally available node. So if Email would be an array, perhaps that would work?
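For example, a sketch of the directory contact with data.Email as an array of Email service node ids (the second id here is made up):

var contact = {
    id: "VXNlciBNYW5hZ2VtZW50IFNlcnZpY2U=", // "User Management Service" in base64
    data: {
        // several Email service node ids instead of a single one
        "Email": ["c29tZSBlbWFpbCBub2RlIGlk", "YW5vdGhlciBlbWFpbCBub2RlIGlk"]
    }
};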
As to getting the latest value... the vector clock idea would have to be fully implemented, but you would still have a propagation delay. This is because every node would assume it had the latest answer (until it got the update). But there is no guarantee that everyone would get the update. It gets difficult and starts to go outside the use case, I think :/
Closing due to inactivity.
A while ago I built exactly what this module is not: a central discovery system. It works for my purposes, for now, but I can see it becoming a big problem pretty quickly. I'd like to use this module, but I don't understand how.
Discovery "enables point-to-point communications," which kinda makes sense but what I need is to, given a dozen node processes (for example), find the ip addresses and configuration data for a subset of those nodes based on some search parameters.
The email and user service examples are a little obscure, so I'll try to expand upon them.
I have a node process which exposes a number of operations, including createUser, updateUser, deleteUser, and similar operations. These, internally, interact with a database of sorts, and externally provide an API for interacting with user objects.
Say I also have another node process which sends emails. The user service wants to be able to send an email to the user when the user is created. The point of the services here is that they can scale horizontally--I might have four node processes which can send emails. Given that the user service wants to send an email, it needs to figure out how to communicate with one or more of those email services. Naturally, the user service does not have a node id for the email service, which is what it needs to find.
So, given ten processes, five of which are user "services" and five of which are email "services," how can the user services find out about the email services? I'm not worried about the actual communication--that can be taken care of with HTTP or some inter-process transport.
Is this functionality discover can or should provide? If not, can discover act as a building block towards that goal?
Ah, perfect, thank you for elaborating on what you mean. I think I now better understand the use case you're after :)
Discover should be able to be used for this purpose. If I were faced with this problem, I would design it as follows.
Discover handles contact objects. There is contact.id, which is what you can look for (i.e. it's the key), and contact.data, which is the value that is stored and can be anything you want (that can be turned into JSON). To support your usage, the contact.id could be "sendEmail". The contact.data could be an array of addresses that can perform the "sendEmail" action. For example:
var contact = {
    id: "sendEmail",
    data: [ { host: "10.0.0.1", port: 1234 }, { host: "10.0.0.2", port: 1234 } ],
    vectorClock: 0
};
discover.register(contact);
When you want to update:
var contact = {
    id: "sendEmail",
    data: [ { host: "10.0.0.1", port: 4321 }, { host: "10.0.0.2", port: 1234 } ], // change something
    vectorClock: 1 // increment the vector clock
};
discover.register(contact);
The vectorClock aspect of it is still experimental, but it should (maybe) work? It's next on the implementation roadmap to verify and make sure it does what I think it does (I'm including it in the Implementation Correctness part of the roadmap).
The elephant in the room for vectorClock is that now we are talking about doing distributed consistency across many servers. Discover works best if only one node or entity is responsible for a particular contact.id at a time. That is, there is a one-to-one mapping between contact.id and anything that will increment vectorClock. This way, if it gets updated, the update is guaranteed to come from only one place and increment the vectorClock. If updates were to come from multiple nodes at once, they would likely step on each other and put the system in an inconsistent state.
So, in this particular use case, the full design would probably also include an "orchestrator".
The "sendEmail" nodes would never register themselves directly with Discover. They could use Discover to find "sendEmail-orchestrator", and this should be only one machine. That machine is responsible for updating the "sendEmail" information. The flow would be something like the sketch below.
From a distributed systems perspective, what we are doing with the "orchestrator" setup is linearizing our "writes" (updates to the "sendEmail" contact) while distributing our "reads" (queries for the "sendEmail" contact).
@tristanls: wow. Many thanks for helping me understand this. I took some time to understand vector clocks, and now the concept of having an array of contacts starts to make sense. I now feel that I have a far stronger understanding of what discover is and does.
I have a few concerns about this implementation I'd like to highlight and ask for feedback on, though.
I get linearizing writes, it makes a lot of sense, keeps the vector clock from growing and desynchronizing, and has a lot of other benefits. It does, however, add a point of contention. If the orchestrator dies, there's no way to add nodes. If everything were static once initialized, it would only affect the creation of new nodes, but since nodes can change their contact, this becomes more of a blocker. I'd like to eliminate a single point of failure, so I'm wondering how this could be improved, but this has implications I'll cover in a little bit.
The speed at which I can look up all sendEmail nodes is critical in this particular application. In the current system every user node knows about all other sendEmail nodes and can load-balance between them autonomously of everything else. When a sendEmail node is discovered or lost, all user nodes update their load-balancing arrays with the addition or removal of that sendEmail node. If I understand correctly, each time I want to interact with a sendEmail node I'd have to find all the nodes and choose one. How could this work in another way? I'd like to keep this distributed and reliable/eventually consistent/highly available, so it might be a hybrid solution, but being able to quickly find other nodes significantly speeds up the rate at which operations can occur.
Currently all nodes generate a unique 32-byte identifier at startup, which they use to identify themselves with the central discovery server. That identifier can then be linked to the services they provide--in my particular implementation a node can be both a sendEmail node and a user node, though that's a little less important.
This is a really really minor concern but it would be kinda nice if the writes could be distributed instead of linearized. That would require a more complicated vector clock, and might ultimately not be worthwhile.
How could the above work in the context of discover? While on the one hand it seems a little odd for a DHT system to store all values of a certain group in all nodes, on the other hand it means that all values outside of the groups relevant to that node don't need to be stored. In other words, while the user nodes need to keep track of all sendEmail nodes, they don't need to keep track of the contacts for any other kind of node.
This is more of a general question, but if I start a couple nodes, how do they find each other? When I start a subsequent node, how does it find one of the others? Ideally something like UDP broadcast would come into play, but AWS does not support UDP broadcast or multicast for whatever reason.
Again, thanks for your support on this, if it comes to it I'd love to help implement whatever's necessary to get here.
A thought occurred to me about the vector clocks and distributed writes, I'll see if I can write (haha puns woops) it down before I lose it.
I really need a distributed indexed collection of items with optimistic retainment.
If the contact's data were structured like a hash, it might make this a little easier to explain. The keys of this hash are the 32-byte identifiers of a given node:
{
    "id": "sendEmail",
    "data": {
        "361439b..2022f70": {
            "vectorClock": 0,
            "host": "192.168.1.1",
            "port": 9555
        },
        "4c34b8d..b3ab765": {
            "vectorClock": 0,
            "host": "192.168.1.1",
            "port": 9556
        },
        "df22936..07c2202": {
            "vectorClock": 0,
            "host": "192.168.1.2",
            "port": 9555
        }
    }
}
For my purposes, it's enough to restrict changes to this hash such that a given node may alter only the item in the hash that corresponds to its identifier--361439b..2022f70 may only alter the 361439b..2022f70 key in the hash. Then, every item in the hash can have its own vector clock that "belongs" to its creator/maintainer node (e.g. 361439b..2022f70). That maintainer node may alter or delete its item in the hash, but if someone thinks the node has crashed it can pessimistically change the hash by deleting it. If, of course, someone else gets an update after that point, the node's item would be reinstated (hence the optimistic retainment requirement in the collection definition above).
It's also important to note that this change moves the vector clock from the contact object to each item in its data object.
The whole purpose of representing it this way is to point out that really each contact needs to be doubly-indexed: once by its contact name, like "sendEmail," and once by its unique identifier, like "361439b..2022f70."
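To make the merge rule concrete, here is a minimal sketch over two replicas of a data hash shaped like the one above (the function name is just illustrative):

// Merge two replicas of the "data" hash, keyed by node identifier.
// Per-key vector clocks resolve conflicts; keys missing on one side are
// kept (optimistic retainment), so a pessimistic delete is undone as soon
// as a replica that still has the entry, or a newer update, is merged in.
function mergeData(local, remote) {
    var merged = {};
    Object.keys(local).forEach(function (nodeId) {
        merged[nodeId] = local[nodeId];
    });
    Object.keys(remote).forEach(function (nodeId) {
        var theirs = remote[nodeId];
        var mine = merged[nodeId];
        if (!mine || theirs.vectorClock > mine.vectorClock) {
            merged[nodeId] = theirs;
        }
    });
    return merged;
}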
Does that all sound reasonable? It's starting to feel like this wouldn't fit in the scope of discover, so I'm not sure where that leaves me.
Side question: have you come up with a reliable way of determining the local ip address a local process may advertise such that other nodes can contact it? I've resorted to using the socket.remoteAddress property on the receiving end, but that means that the node itself isn't allowed to know what its own address is...
I'll try to answer one item at a time..
You are absolutely right that an "orchestrator" would be a single point of failure. It bothers me too. I was thinking of that solution as a way to implement your use case without modifying Discover.
Some history of why Discover is the way it is. I didn't have your use case in mind. I was thinking of globally unique entities needing to find each other. With global uniqueness, I did not have the problem of write contention, so that simplification allowed me to get this far with Discover. Also, with global uniqueness, the vector clock mechanism was all that was needed to implement updates to the system, again, because of lack of contention for writes. Having said that, your use case is a common one and I think with some modifications to Discover it could be made more general to support what you have in mind, and that would be a big win.
Everywhere vectorClock is used in Discover, it is in the scope of a sort of equality, or version, check. The vectorClock is also used in tristanls/k-bucket as a sanity/equality check. However, now looking back at it, this is a hard-coded concept of conflict resolution. I think that refactoring this and making it pluggable could be a way to move forward with better support for what you have in mind.
For example, with this concept refactored out, we could plug CRDT functionality into Discover. What you describe in your optimistic retainment scheme is a type of CRDT. Merging conflicting CRDTs ends up being.. drumroll.. a type of conflict resolution. So refactoring vector clocks wherever they are in code (in Discover and k-bucket) and making room for a pluggable mechanism could be a clean sweep to support many use cases beyond what you and I have in mind.
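As a rough sketch of what that pluggable hook might look like (none of this is Discover's current API; the resolver signature and option name are purely illustrative):

// Hypothetical: Discover (and k-bucket) would delegate "which version of a
// contact wins" to a user-supplied resolver instead of comparing
// vectorClock directly. The current behavior, expressed as a resolver:
function vectorClockResolver(incumbent, candidate) {
    return candidate.vectorClock > incumbent.vectorClock ? candidate : incumbent;
}

// A CRDT-style resolver would instead merge the two contacts, e.g. keeping
// the per-node entry with the higher per-entry vector clock (the optimistic
// retainment scheme described above).
function crdtResolver(incumbent, candidate) {
    var merged = { id: incumbent.id, data: {} };
    [incumbent.data, candidate.data].forEach(function (data) {
        Object.keys(data).forEach(function (nodeId) {
            var existing = merged.data[nodeId];
            if (!existing || data[nodeId].vectorClock > existing.vectorClock) {
                merged.data[nodeId] = data[nodeId];
            }
        });
    });
    return merged;
}

// Purely illustrative option name; not something Discover accepts today:
// var discover = new Discover({ resolveConflict: crdtResolver });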
Discover emits telemetry events showing how long it takes to do a lookup. I've implemented an LRU cache in order to speed up certain use cases. If you look for a contact.id, it will be stored locally.
You can listen for stats events. For example, npm run localtest | grep ~stats gives a sample of what you can see.
For a more fully baked example, you can take a look at the tristanls/tart-ansible project that also uses Discover. Without getting into a bunch of details there, once you npm install it, you can run npm run readme and you'll see Discover stats being dumped to the console. In that project, you'll see a lot of initial latency as the system bootstraps, but then most lookups result in 0ms (local) latency (again, look for lines starting with ~stats). Sample output from npm run readme | grep ~stats below:
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.ms 17
~stats: discover.stats.timers.find.round.ms 14
~stats: discover.stats.timers.find.request.ms 13
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 13
~stats: discover.stats.timers.find.request.ms 13
~stats: discover.stats.timers.find.ms 13
~stats: discover.stats.timers.find.round.ms 13
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.ms 14
~stats: discover.stats.timers.find.round.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 15
~stats: discover.stats.timers.find.round.ms 15
~stats: discover.stats.timers.find.ms 15
~stats: discover.stats.timers.find.request.ms 15
~stats: discover.stats.timers.find.round.ms 15
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.ms 21
~stats: discover.stats.timers.find.round.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.ms 2
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 15
~stats: discover.stats.timers.find.ms 15
~stats: discover.stats.timers.find.round.ms 15
~stats: discover.stats.timers.find.request.ms 15
~stats: discover.stats.timers.find.request.ms 14
~stats: discover.stats.timers.find.request.ms 16
~stats: discover.stats.timers.find.request.ms 16
~stats: discover.stats.timers.find.request.ms 15
~stats: discover.stats.timers.find.request.ms 15
~stats: discover.stats.timers.find.ms 15
~stats: discover.stats.timers.find.round.ms 15
~stats: discover.stats.timers.find.request.ms 17
~stats: discover.stats.timers.find.round.ms 17
~stats: discover.stats.timers.find.ms 18
~stats: discover.stats.timers.find.request.ms 16
~stats: discover.stats.timers.find.request.ms 16
~stats: discover.stats.timers.find.request.ms 18
~stats: discover.stats.timers.find.request.ms 18
~stats: discover.stats.timers.find.request.ms 18
~stats: discover.stats.timers.find.round.ms 18
~stats: discover.stats.timers.find.request.ms 19
~stats: discover.stats.timers.find.round.ms 19
~stats: discover.stats.timers.find.ms 20
~stats: discover.stats.timers.find.request.ms 6
~stats: discover.stats.timers.find.ms 23
~stats: discover.stats.timers.find.round.ms 6
~stats: discover.stats.timers.find.request.ms 6
~stats: discover.stats.timers.find.request.ms 6
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.request.ms 3
~stats: discover.stats.timers.find.ms 4
~stats: discover.stats.timers.find.round.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.request.ms 5
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.ms 5
~stats: discover.stats.timers.find.round.ms 5
~stats: discover.stats.timers.find.request.ms 5
~stats: discover.stats.timers.find.request.ms 5
~stats: discover.stats.timers.find.request.ms 3
~stats: discover.stats.timers.find.ms 4
~stats: discover.stats.timers.find.round.ms 3
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.request.ms 3
~stats: discover.stats.timers.find.ms 4
~stats: discover.stats.timers.find.round.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.request.ms 3
~stats: discover.stats.timers.find.ms 4
~stats: discover.stats.timers.find.round.ms 4
~stats: discover.stats.timers.find.request.ms 4
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
~stats: discover.stats.timers.find.ms 0
...
I don't think the names of Discover nodes themselves are relevant. I don't actually use them for anything; they simply need to be unique. For the stuff that I actually care about I use discover.register(contact) and then somewhere else discover.find(contact.id).
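A minimal example of that pattern (the callback signature for discover.find is my assumption of the usual Node (error, result) style):

// On the node that owns the information:
discover.register({
    id: "sendEmail",
    data: [ { host: "10.0.0.1", port: 1234 } ],
    vectorClock: 0
});

// Anywhere else on the DHT:
discover.find("sendEmail", function (error, contact) {
    if (error) return console.error(error);
    console.log(contact.data); // [ { host: "10.0.0.1", port: 1234 } ]
});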
As far as having nodes responsible for multiple things: remember that only the contact.id needs to be unique if you don't want to accidentally override information. So there is nothing preventing you from having the same data content for "sendEmail", "createUser", and "deleteUser". All of them can point to the same nodes.
In regards to where the stuff is stored. The whole point of Discover is to distribute the data. I wouldn't think that user nodes need to keep track of only email nodes. I view it as a desirable property that the information be spread out to as many nodes as possible. This improves the resiliency of the system.
Alternatively, if you feel strongly about it, you could simply set up a separate Discover DHT for your user nodes and separate ones for other nodes.
This is the chicken and the egg problem of distributed systems. I have no improvements to offer over a list of seeds. So, there is assumed prior knowledge that some 3 or more nodes always exist. You could delegate this list of seeds to be looked up on startup in some reliable oracle service (like S3).
A more dynamic approach does not get rid of the initial bootstrap, but later on, when the system is running, you could query Discover to enumerate all the information it has (although I don't think I have an API for this) and then use that as a list of seeds for whatever node is standing up. This way, as you get more and more nodes, their seed lists are being updated. This might be something interesting to do in the future, although I think we can get by with seed lists in the meantime.
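A sketch of the seed-list bootstrap; treat the seeds option name and the contact shape as assumptions (check the README for the exact form), and the seed list itself could just as well be fetched from S3 on startup:

var Discover = require('discover');

// Seed contacts known ahead of time (or fetched on startup from a reliable
// oracle such as an S3 object). The ids and transport values are made up.
var seeds = [
    { id: "c2VlZC1ub2RlLTE=", transport: { host: "10.0.0.5", port: 6742 } },
    { id: "c2VlZC1ub2RlLTI=", transport: { host: "10.0.0.6", port: 6742 } }
];

var discover = new Discover({
    seeds: seeds // assumed option; plus transport and other options per the README
});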
I typically rely on the operating system to tell the node process on start what its address is (if that's necessary). If you are on AWS, you can query the instance metadata.
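For example, on EC2 the private IPv4 address is available from the instance metadata endpoint; a sketch using Node's built-in http module:

var http = require('http');

// Query the EC2 instance metadata service for this instance's private IP,
// which is a reasonable address to advertise in a contact.
http.get('http://169.254.169.254/latest/meta-data/local-ipv4', function (res) {
    var body = '';
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
        console.log('advertise this address:', body); // e.g. "10.0.0.7"
    });
}).on('error', function (error) {
    console.error('not on EC2 or metadata service unreachable', error);
});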
@skeggse, version 0.4.0 should now enable you to implement what you're looking for. Please provide feedback when you get a chance.
@tristanls thanks, that looks awesome! I'll see if I can start experimenting with it tomorrow.
Closing. Follow up took place in #11.
I'm not sure if this fits in the project's vision, but it would be nice if I could search for nodes by capability. For example, I could have a User Management service, and want to talk to the Email service. I wouldn't know the node's unique identifier, but the node would know its capabilities.