GeorgeJahad opened 5 years ago
> On the other hand, maybe rebalancing is a premature optimization at this point and a simple workq is sufficient.
Thinking about current services that don't rebalance and the problems that causes us: I'd rather we consider rebalancing in the design early than try to force it in later.
@GeorgeJahad, @jjbuchan, some quick notes for now, and I'll review the code questions later today or Monday.
> WorkAllocator specific comments:
>
> I'm confused by this if test: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L404 We own the active key, so don't we already know it exists?
Yes we do, so it's an "always true" condition just to force the transactional put+delete in the "then" condition. Doing this in a transaction is probably overkill, but seemed like a logical grouping of operations in any case.
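For reference, a minimal jetcd sketch of that shape. The endpoint, key names, and values are made up, and this uses the current io.etcd.jetcd package coordinates rather than the older com.coreos.jetcd ones the prototype uses; the txn API is the same:

```java
import java.nio.charset.StandardCharsets;

import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.KV;
import io.etcd.jetcd.kv.TxnResponse;
import io.etcd.jetcd.op.Cmp;
import io.etcd.jetcd.op.CmpTarget;
import io.etcd.jetcd.op.Op;
import io.etcd.jetcd.options.DeleteOption;
import io.etcd.jetcd.options.PutOption;

public class ReleaseTxnSketch {
  public static void main(String[] args) throws Exception {
    Client client = Client.builder().endpoints("http://localhost:2379").build();
    KV kv = client.getKVClient();
    ByteSequence activeKey = bs("active/work-1");
    ByteSequence registryKey = bs("registry/work-1");

    // We already own activeKey, so "version > 0" is effectively always
    // true; it exists only to group the put and the delete into a
    // single atomic transaction.
    TxnResponse resp = kv.txn()
        .If(new Cmp(activeKey, Cmp.Op.GREATER, CmpTarget.version(0)))
        .Then(Op.put(registryKey, bs("work definition"), PutOption.DEFAULT),
              Op.delete(activeKey, DeleteOption.DEFAULT))
        .commit()
        .get();
    System.out.println("released atomically: " + resp.isSucceeded());
    client.close();
  }

  static ByteSequence bs(String s) {
    return ByteSequence.from(s, StandardCharsets.UTF_8);
  }
}
```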
> Just to be clear, version "0" here means it doesn't exist?: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L488
Yes, and I just added a new unit test to verify the assumption: https://github.com/itzg/try-etcd-work-allocator/blob/e21fa9074034073f16f0fc76b088687def6103b7/src/test/java/me/itzg/tryetcdworkpart/services/WorkAllocatorTest.java#L377
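The flip side of that fact is the standard etcd create-if-absent idiom, where a version(0) comparison acts as a "key does not exist" guard. A sketch, reusing kv, bs(...), and the illustrative keys from the example above:

```java
// version == 0 ("key does not exist") guards a create-if-absent:
// the put only runs if no other worker has created the active key yet.
TxnResponse grab = kv.txn()
    .If(new Cmp(activeKey, Cmp.Op.EQUAL, CmpTarget.version(0)))
    .Then(Op.put(activeKey, bs("worker-1"), PutOption.DEFAULT))
    .commit()
    .get();
boolean weGotIt = grab.isSucceeded(); // false: someone else owns the key
```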
> If the transaction failed, but we stop processing here, then we still have the active key, so no one else is going to process it either, right? That seems like a problem: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L423
Yeah, the code was already feeling complicated and the failure case is so rare (other than etcd failure) that I punted on any kind of recovery logic here. We could add retry logic, but I suspect the etcd client is already doing that. I tried to use warning logs sparingly and intend that each be a cause for human investigation.
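Just to illustrate what bounded retry could look like if we wanted it (this is hypothetical, not in the prototype; log is assumed to be an SLF4J-style logger):

```java
// Hypothetical recovery sketch: retry the active-key delete a few times
// before falling back to a warning that calls for human investigation.
void releaseActiveKey(KV kv, ByteSequence activeKey) {
  for (int attempt = 1; attempt <= 3; attempt++) {
    try {
      kv.delete(activeKey).get();
      return; // released successfully
    } catch (Exception e) {
      log.warn("Failed to delete active key (attempt {})", attempt, e);
    }
  }
  log.warn("Giving up on {}; needs human investigation", activeKey);
}
```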
> Similarly, this gets popped even if the txn fails: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L369
Good point. Perhaps the work ID should be pushed back if the `releaseWork` call completes with a `false`.
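A sketch of that fix, assuming workIds is a Deque&lt;String&gt; and releaseWork returns a CompletableFuture&lt;Boolean&gt; reflecting the transaction outcome (both names are taken from this discussion, not checked against the code):

```java
// Pop optimistically, but push the work ID back if the release
// transaction reports failure, so a later pass can retry it.
void releaseNext(Deque<String> workIds) {
  final String workId = workIds.pop();
  releaseWork(workId)
      .thenAccept(released -> {
        if (!released) {
          workIds.push(workId); // re-queue for another attempt
        }
      });
}
```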
> Don't we have to guard these in a try/finally: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L391
My assumption is that exceptions during etcd calls will get passed to this `handle` call, where the semaphore is released: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L419
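The pattern being relied on, sketched with a put standing in for whatever etcd call is in flight (semaphore and log are assumed fields):

```java
// handle(...) runs whether the put completes normally or exceptionally,
// so the permit is always returned without needing a try/finally
// around the asynchronous call itself.
void putWithPermit(KV kv, ByteSequence activeKey, ByteSequence value)
    throws InterruptedException {
  semaphore.acquire();
  kv.put(activeKey, value)
      .handle((putResponse, throwable) -> {
        semaphore.release(); // reached on success and on failure
        if (throwable != null) {
          log.warn("etcd put failed", throwable);
        }
        return null;
      });
}
```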
> We are starting from rev 0 here. Will that lead to a lot of unneeded work? https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L252 https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L330
Negative and zero are special values that instruct the watch to start from the newest revision: https://github.com/etcd-io/jetcd/blob/859ba2771afcc73b27a5916c4e0969606f688d9d/jetcd-core/src/main/java/com/coreos/jetcd/options/WatchOption.java#L55
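In jetcd terms (lastSeenRevision here is a hypothetical previously recorded revision):

```java
// Revision <= 0 tells the watch to begin at the newest revision, so a
// watch built with revision 0 does not replay historical events.
WatchOption fromNewest = WatchOption.newBuilder()
    .withRevision(0)
    .build();

// To replay from a known point instead, pass an explicit revision:
WatchOption fromKnown = WatchOption.newBuilder()
    .withRevision(lastSeenRevision + 1)
    .build();
```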
> Yes we do, so it's an "always true" condition just to force the transactional put+delete in the "then" condition.
My question is: can't you just remove that call to the if() method, if it is always true? In other words, don't etcd txn()s allow then()s without if()s?
> Yeah, the code was already feeling complicated and the failure case is so rare (other than etcd failure) that I punted on any kind of recovery logic here.
In that case, since the work stops no matter what, why does the workload get incremented again on failure here?: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L431
> My assumption is that exceptions during etcd calls will get passed to this `handle` call, where the semaphore is released
ok, thanks
> Negative and zero are special values that instruct the watch to start from the newest revision
Ah, good to know, thanks.
Thanks @GeorgeJahad
> My question is: can't you just remove that call to the if() method, if it is always true? In other words, don't etcd txn()s allow then()s without if()s?
Oh interesting, you can indeed have a transaction execute the `Then` block without an `If`:
https://github.com/itzg/try-etcd-work-allocator/commit/a257767a727bd76bff459edbbf317e139552403e#diff-9aabd8dc24134c3e25e37c2b3bad5907R399
I have updated that code to remove the "always true" if-condition.
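That is, something along these lines (illustrative keys, not the commit's exact code):

```java
// With no If(...) comparisons, the Then ops run unconditionally but
// still as a single atomic transaction.
kv.txn()
    .Then(Op.put(registryKey, bs("work definition"), PutOption.DEFAULT),
          Op.delete(activeKey, DeleteOption.DEFAULT))
    .commit()
    .get();
```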
> In that case, since the work stops no matter what, why does the workload get incremented again on failure here?: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L431
Since the transaction failed, we know the work load stored in our active key is the non-decremented value. The increment was there to keep the local value consistent with the value stored in etcd in the active key; however, I can now see that the code could leave the local value alone and re-store the correct value on the next operation.
I changed the referenced code and `grabWork` here: https://github.com/itzg/try-etcd-work-allocator/commit/ae841a1b55c876e563bafe326db4e2534655a5d5
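To illustrate the difference in the failure branch, assuming a hypothetical local workLoad counter (an AtomicInteger) mirroring the value stored in the active key:

```java
// Hypothetical: workLoad locally mirrors the load stored in the active key.
int desired = workLoad.decrementAndGet();
releaseWork(workId)
    .thenAccept(released -> {
      if (!released) {
        // before: workLoad.incrementAndGet() to stay consistent with etcd;
        // after: leave it alone and let the next successful operation
        // re-store the correct value in the active key.
      }
    });
```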
**Atomix**
First, I looked at the workq described here: https://atomix.io/docs/latest/api/io/atomix/core/workqueue/WorkQueue.html
While it does some of what the WorkAllocator does, it doesn't seem to address the cross-worker rebalancing that WorkAllocator implements. If we want that, we still have to implement much of what is in WorkAllocator. On the other hand, maybe rebalancing is a premature optimization at this point and a simple workq is sufficient.
However, the Atomix workq also seems to require some cluster configuration, because the workq is implemented over a cluster formed by the user's nodes. Since we are already doing that for etcd, it seems a bit redundant.
Bottom line: Atomix is not a clear winner to me (although it probably would be if it were built on top of etcd), but I wouldn't object to using it if others felt differently.
I've googled around some but haven't found a better etcd-specific alternative to Atomix workq's.
**WorkAllocator specific comments:**
I'm confused by this if test: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L404 We own the active key, so don't we already know it exists?
Just to be clear, version "0" here means it doesn't exist?: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L488
If the transaction failed, but we stop processing here, then we still have the active key, so no one else is going to process it either, right? That seems like a problem: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L423 Similarly, this gets popped even if the txn fails: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L369
Don't we have to guard these in a try/finally: https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L391 https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L474
We are starting from rev 0 here. Will that lead to a lot of unneeded work? https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L252 https://github.com/itzg/try-etcd-work-allocator/blob/master/src/main/java/me/itzg/tryetcdworkpart/services/WorkAllocator.java#L330
**Introductory WorkAllocator documentation that would have helped me**
The code in this module allows processes to add/update/delete tasks in a distributed workq, and allows the workers processing the q to be aware of each other and to increase/reduce their share of the total work as workers come and go.
**Data structures**
The workq is implemented with a set of etcd kv's prefixed with the name "registry". Each new task in the q is a new kv pair with that prefix.
As workers pick up tasks from the q, they create an analogous entry in etcd with the prefix "active". That way the other workers know the task is in progress.
Finally, there is a set of kv pairs with the prefix "workers". This allows the workers to know of the existence of the others so they can shed work if the load becomes unbalanced.
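Concretely, the key layout looks something like this (the exact key shapes and value contents are illustrative, not copied from the code):

```
registry/{workId}   -> the task definition, written by whoever submits work
active/{workId}     -> written by the worker that grabbed the task
workers/{workerId}  -> one entry per live worker, used for rebalancing
```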
**Threads**
This module manages the work/worker q's by creating three threads, each of which watches one of the prefixes listed above:
The registry thread watches for incoming work in the "registry" kv pairs and "grabs" it in an etcd transaction to ensure that no two workers grab the same task.
The worker thread checks for incoming workers; as new workers arrive, it determines what its new share of the total load should be and sheds tasks when it determines it has too many.
Finally, the active thread checks for tasks shed by other processes and attempts to grab them if the current process is underloaded.
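A stripped-down sketch of that wiring, using jetcd watch callbacks in place of the allocator's actual thread handling (the endpoint and handler bodies are placeholders, and this again uses the io.etcd.jetcd package names rather than the older com.coreos.jetcd ones):

```java
import java.nio.charset.StandardCharsets;

import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.Watch;
import io.etcd.jetcd.options.WatchOption;

public class WatchSketch {
  public static void main(String[] args) throws InterruptedException {
    Client client = Client.builder().endpoints("http://localhost:2379").build();
    Watch watch = client.getWatchClient();

    watchPrefix(watch, "registry/"); // incoming work to grab
    watchPrefix(watch, "active/");   // tasks grabbed or shed by workers
    watchPrefix(watch, "workers/");  // workers joining and leaving

    Thread.sleep(Long.MAX_VALUE); // keep the demo process alive
  }

  private static void watchPrefix(Watch watch, String prefix) {
    ByteSequence key = ByteSequence.from(prefix, StandardCharsets.UTF_8);
    WatchOption option = WatchOption.newBuilder().withPrefix(key).build();
    watch.watch(key, option, response ->
        response.getEvents().forEach(event ->
            System.out.println(prefix + " event: " + event.getEventType())));
  }
}
```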