Closed SteVwonder closed 3 years ago
The other license discussion reminds me of this issue:
Yes, the main thing will be https://github.com/flux-framework/flux-sched/blob/master/src/common/librbtree/rbtree.h
Don't know if there will be an implementation that will bring the both performance and robustness as this has been vetted in Linux OS kernel. Maybe simple std::map can do things... At least the direct uses of rbtree
is contained within the planner layer so pretty isolated.
Talked with a couple of people. If we were to go with another licensing, we will probably need a replacement of this red-black implementation.
If we want to get flux-sched as a whole to LGPL, we will need to replace readline as well, with maybe something like editline: http://thrysoee.dk/editline/
EDIT: credit goes to @trws for pointing this out
EDIT2: editline is a drop-in replacement for us as it implements the readline
function: https://github.com/troglobit/editline#example
AFAIK, this would require finding an LGPL-compatible version of the red-black tree currently in use.
In flux-framework/flux-core#3639, I am proposing to bring in a couple components from CCAN. Although it has a red-black tree implementation that is different from the LK one pulled into sched, it is licensed under GPLv3.
CCAN does have an AVL tree that is BSD-MIT licensed. Since AVL is another self-balancing tree, maybe it could be used in place of the red-black in sched? Disclaimer: I have not compared the interfaces to see whether this is a good fit.
In case that interface doesn’t happen to fit, there’s also the bsd red-black tree here: https://github.com/lattera/freebsd/blob/master/contrib/unbound/util/rbtree.h
Or the classic BSD sys/tree.h which is distributed on some linux distros too now: https://www.freebsd.org/cgi/man.cgi?query=tree&sektion=3&manpath=freebsd-release-ports
If we’re lucky maybe one of them matches up well.
On 9 May 2021, at 13:14, Jim Garlick wrote:
AFAIK, this would require finding an LGPL-compatible version of the red-black tree currently in use.
In flux-framework/flux-core#3639, I am proposing to bring in a couple components from CCAN. Although it has a red-black tree implementation that is different from the LK one pulled into sched, it is licensed under GPLv3.
CCAN does have an AVL tree that is BSD-MIT licensed. Since AVL is another self-balancing tree, maybe it could be used in place of the red-black in sched?
Disclaimer: I have not compared the interfaces to see whether this is a good fit.-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://urldefense.us/v3/__https://github.com/flux-framework/flux-sched/issues/384*issuecomment-835878332__;Iw!!G2kpM7uM-TzIFchu!kNIZXfyumYwR9_qiAKQnuxpc69DFp5W3ZsNmJnII-N2tEF6Tpnzt3WvBXIizkrYg9g$
I haven't looked those interfaces but the main challenge would be to find one that will support augmented RB Tree: see "interval tree section" -- https://www.kernel.org/doc/Documentation/rbtree.txt.
BTW, though not ideal, this should still be compatible with LGPL v3: https://www.gnu.org/licenses/license-list.en.html
GNU Lesser General Public License (LGPL) version 3 (#LGPL) (#LGPLv3)
<CUT>
Please note that LGPLv3 is not compatible with GPLv2 by itself. However, most software released under GPLv2 allows you to use the terms of later versions of the GPL as well. When this is the case, you can use the code under GPLv3 to make the desired combination. To learn more about compatibility between GNU licenses, please see our FAQ.
And vendored in RB tree from LK is:
Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
GNU readline is GPL v3: https://tiswww.case.edu/php/chet/readline/rltop.html
So this should be compatible with LGPL v3.
Not ideal but at least compatible.
You can use LGPL things in GPL code without issue, if you use GPL code in LGPL code then the whole thing becomes GPL. No GPL code can be allowed to link in, or the project can't be LGPL.
That's the "though not ideal" part.
Briefly skimmed through https://github.com/lattera/freebsd/blob/master/contrib/unbound/util/rbtree.h
Doesn't seem to support augmented RB tree.
And use of pointer to call a comparator each time a node is visited will be likely not a match w/ LK rbtree https://github.com/lattera/freebsd/blob/master/contrib/unbound/util/rbtree.h#L87....
Just if C++ std:map opens up its RB tree to be augmented....
I mean, I suppose that's a way to look at it? The project fundamentally cannot be LGPL with a chunk of GPL code linked into it, so it's completely incompatible with the project's license. Our license is compatible with the tree's license, but theirs is not with ours.
If what you want is an interval tree, there's always these:
Ideally, an efficient rbtree library with an interface to augment that to store subtree properties tree will do. planner
uses both a regular search tree to do scheduled point's time base searches and a resource state based search using an "augmented" rbtree.
I will take a look at all the suggestions.
I did some search and haven't been able to find a right implementation with a right interface of an augmented red black tree, I am temping to develop a simple one myself. (Will do some more search though -- e.g., carving out RB tree from C++ standard library and augment it.)
So... Ok, what they're calling an "augmented" rbtree is a manually traversed rbtree with the option to add extra data to the node, which all of the macro-based ones have. Getting that behavior in a C++ tree is a little different, because you normally don't write your own traversals for map or set-like containers in C++. Going with something like the classic Judy Array might work, where you can very efficiently find the nearest higher, or lower, element to a given value. It actually as part of its nature stores the min/max of stored values in the higher level nodes, so you get this kind of thing for free and actually get amortized O(1) access.
Alternately one of the normal map or interval-map containers would work. If you use the lower_bound and upper_bound methods to find elements you'll get something functional, but not as performant, with an stl map.
There's also the less well known but still quite useful algorithms for creating, maintaining and manipulating heaps, though that's usually more for managing priority queues than ranges you could probably set up a comparison function and update function such that you could get the behavior you want out of it.
Thanks @trws.
Getting that behavior in a C++ tree is a little different, because you normally don't write your own traversals for map or set-like containers in C++
Exactly. They don't expose that. To overcome the performance disadvantage, they often inline the comparator function though.
Augmented rbtree algorithm is pretty well known and proven to be effective: e.g., at https://fileadmin.cs.lth.se/cs/Personal/Rolf_Karlsson/lect3.pdf
It allows you to add an additional data to each binary search node to describe a user-defined function on the subtree rooted at each node. And this is maintained as part of the tree itself through rebalancing actions (e.g., rotations). This means, though, you need to hook in your callback functions on these rebalancing events to help maintain your user-defined subtree invariant though tree rebalancing. I used this to make "available resource amounts" search fast: what is the earliest schedulable point given x amounts of resources required?
The planner code leverages such mechanisms and interfaces and the rest of the scheduler code heavily relies on the planner's efficiency. So, my hope is not to perturb that code too much if possible.
On 11 May 2021, at 9:45, Dong H. Ahn wrote:
Thanks @trws.
Getting that behavior in a C++ tree is a little different, because you normally don't write your own traversals for map or set-like containers in C++
Exactly. They don't expose that. To overcome the performance disadvantage, they often inline the comparator function though.
Augmented rbtree algorithm is pretty well known and proven to be effective: e.g., at https://urldefense.us/v3/__https://fileadmin.cs.lth.se/cs/Personal/Rolf_Karlsson/lect3.pdf__;!!G2kpM7uM-TzIFchu!h_f9JkfcS9DjKTQ5JE3NHmoNtLSEYMleRAaOe2kykjSi6Ut9p_2ytskoJuOPxt4Bjg$
It allows you to add an additional data to each binary search node to describe a user-defined function on the subtree rooted at each node. And this is maintained as part of the tree itself through rebalancing actions (e.g., rotations). This means, though, you need to hook in your callback functions on these rebalancing events to help maintain your user-defined subtree invariant though tree rebalancing. I used this to make "available resource amounts" search fast: what is the earliest schedulable point given x amounts of resources required?
Quite so. I’m not arguing whether it’s something useful or not, but there are types that have been designed on top of that paradigm to provide higher level interfaces that use this practice internally. The Judy Array does this for example, tracking the number, min and max of elements under any given node and applying it to search, traversal and neighbor lookups. This is also a common mechanism used for interval trees and segment trees as used in a variety of contexts to provide efficient range queries and updates.
The planner code leverages such mechanisms and interfaces and the rest of the scheduler code heavily relies on the planner's efficiency. So, my hope is not to perturb that code too much if possible.
If you want it to look like the current version, and be able to do exactly the same thing but licensed properly and in c++, there’s always boost::intrusive::rbtree and the associated algorithms: https://www.boost.org/doc/libs/1_65_0/doc/html/boost/intrusive/rbtree.html
If necessary it provides everything for manual traversals, as well as many standard algorithms, and can inline anything and everything. It also has a large number of compatible alterantive balanced trees and so-forth, so maybe that’s a better fit. Come to think of it, it’s even the same kind of intrusive container as the LK RB tree is.
Yeah I was looking at that as well last night. The concept of intrusive containers was a bit interesting and was pretty ambivalent whether you can live with this type of containers that don't explicitly manage the node data...
Now you are mentioning this, I will spend some more time to take a look.
Come to think of it, it’s even the same kind of intrusive container as the LK RB tree is.
Interesting observation.
https://www.boost.org/doc/libs/1_65_0/doc/html/boost/intrusive/rbtree.html
Ok. I took a look at this a bit more. It shares the design concept with LK rbtree in terms of adding one or more hooks into the date type instead of managing the data memory by itself. But unfortunately, the current abstraction doesn't expose/provide methods to add the augmentation. Maybe there is something in the "algorithm" that might provide something or maybe extending Boost intrusive rbtree..
I don't see anything from the algorithm either: https://www.boost.org/doc/libs/1_55_0/doc/html/boost/intrusive/rbtree_algorithms.html
BTW, https://tinloaf.github.io/ygg
This one seems to have that needed feature.
You can be notified when stuff happens. Want to do something to the tree nodes every time a tree rotation happens? Need to know whenever two nodes swap places in the tree? Look no further. Ygg will call methods specified by you on several occasions. In fact, that's how the Interval Tree (the augmented tree version from Cormen et al.) is implemented.
Anyone has experiences with this package?
I don’t, but saw it earlier and it does look interesting.
The boost one would let you implement what you need basically the same way the kernel one does though, since the node is your own type you can add whatever you like to it, and the node provides all the usual options to implement traversals. It would be more manual than is normal for c++, but it would work.
Also noticed in a completely roundabout way from a dragonfly bsd list of all things that OpenBSD’s tree.h has augment support in their RB tree: https://github.com/openbsd/src/blob/master/sys/sys/tree.h
Note that even though this uses function pointers, vaguely recent compilers will inline them because they’re assigned once, don’t escape and never change.
Also noticed in a completely roundabout way from a dragonfly bsd list of all things that OpenBSD’s tree.h has augment support in their RB tree: https://github.com/openbsd/src/blob/master/sys/sys/tree.h
Nice! Yes this type of callback support for "internal" tree rebalancing is what I need.
Looking at Yggdrasill, it does provide exactly the support I need here: https://github.com/tinloaf/ygg/blob/master/src/rbtree.hpp#L125
I will try both. Ygg is C++17 but it is header only so backporting that to C++11 won't be too difficult.
The boost one would let you implement what you need basically the same way the kernel one does though, since the node is your own type you can add whatever you like to it, and the node provides all the usual options to implement traversals. It would be more manual than is normal for c++, but it would work.
Yes. It is just that this doesn't open up the rebalancing protocol for me to program to... As least from my cursory look.
https://github.com/mateidavid/interval-tree says
This package does not work with the stock Boost Intrusive library. A separate package provides a hook-enabled version of Boost Intrusive.
So maybe there is a version of Boost that allows this.
Interface-wide, Yggrasil appears to be pretty promising. Its augmented red black tree support is pretty comprehensive, and this should allow us to port planner search trees. One tricky part of this is it doesn't directly allow a use-define node to be added to multiple search trees (e.g., a node belongs to both schedule point search tree and mintime resource search tree) but I think I can work around this limitation. Time for prototyping further directly in the planner.
FYI -- I now have a prototype of planner substrate with Ygg's augmented Rbtree!
It seems the basics of both scheduled point search tree and minimum time resource tree work well with this library, but there are a handful of test cases that are currently failing.
Given that the failure only occurs when you add many spans (1000 spans) into a multi-planner, I will need to do some more debugging to determine whether this library really is workable or not correctness-wide -- that is, before even starting to look at its performance. https://github.com/flux-framework/flux-sched/blob/master/resource/planner/test/planner_test02.cpp#L465
Good news.
I was able to reduce the problem to a small 16-span case and was able to get to the bottom of this using some of the debug support within Ygg library. The problem was just due to slightly augment interface differences between LK's rbtree and this.
Now my port passes all of the test cases!
So this concludes my feasibility.
With this, I will start to break this problem down to a few manageable tasks and post PRs like planner refactoring so that we can also try other implementation easily if needed etc.
Wow. My initial performance tests show Yggrasil actually outperforms LK's augmented RBtree by ~15%!
It seems the only remaining hurdle is to backport Yggrasil's RBtree and its dependencies to C++14.
Performance-wise, I'm concluding that Yggrasil's augmented RBTree is equivalent to LK's RBtree. The differences seem pretty much within noise in my docker on Mac enviornment.
ahn1@docker-desktop:/usr/src/flux-sched/resource/planner/test$ ./planner_perf
1..4
ok 1 - new with (B(0):D(9223372036854775807):R_hardware-thread(4000000))
ok 2 - add 4 million spans
Elapse Time (stress_4spans_overlap) 12.3277 seconds
ok 3 - planner_avail_during 40 million times
Elapse Time (query_time_4spans_overlap) 3.17982 seconds
ok 4 - planner_avail_time 40 million times
Elapse Time (query_resource_4spans_overlap) 10.8681 seconds
LK RBTree baseline
ahn1@docker-desktop:/usr/src/resource/planner/test$ ./planner_perf
1..4
ok 1 - new with (B(0):D(9223372036854775807):R_hardware-thread(4000000))
ok 2 - add 4 million spans
Elapse Time (stress_4spans_overlap) 13.1731 seconds
ok 3 - planner_avail_during 40 million times
Elapse Time (query_time_4spans_overlap) 3.10092 seconds
ok 4 - planner_avail_time 40 million times
Elapse Time (query_resource_4spans_overlap) 10.7401 seconds
Test code: planner_perf.cpp.txt
FYI -- making pretty good progress towards a PR: https://github.com/dongahn/flux-sched/tree/planner-c++, in case you are wondering where I am with this.
PR #846 has just been posted for review.
And finally PR #849 has been posted, which replaces GNU readline with editline library.
Great stuff @dongahn. The Yggdrasill PR looks really nice too FWIW.
Thanks @trws for looking at various alternatives with me and reviewed the PR!
I'm pretty happy with Yggdrasil in general.
One thing is that it uses some C++17 features on non-critical code (e.g. Tree Options) which requires various surgical operations to adapt it to Fluxion. The other thing is it is a bit not clean to separate only a single search tree but I believe that's just its design.
If other projects are using C++17, I recommend to look at Yggrasil's data structure. It is a high quality implementation IMHO.
@dongahn expressed interest in licensing the library-component of the resource matching service as LGPL. AFAIK, this would require finding an LGPL-compatible version of the red-black tree currently in use.
Low priority but something to keep in mind for the future.