replikativ / hitchhiker-tree

Functional, persistent, off-heap, high performance data structure
Eclipse Public License 1.0
44 stars 19 forks source link

perfs low hanging fruits #3

Closed mpenet closed 5 years ago

mpenet commented 5 years ago

It's a work in progress but I found a few areas where things can be made faster and more gc friendly.

I didn't measure the overall impact (yet), but I did bench the changes separately to make sure they are indeed improvements.

I am still wrapping my head around the implementation to see if we can do more extremes changes (other than changing the surface/impl of some operation).

mpenet commented 5 years ago

Running some benchmarks it seems to not bring that much (or anything) or maybe I failed to understand how to run/read the benchmarks.

In isolation the changes are effective but in the grand scheme of things: not really. I guess it's time to dig into the logic of it all now and use a profiler instead of shooting blind.

whilo commented 5 years ago

I think both David and me have already spent many hours with profilers on the codebase. Any speed-up is welcome, of course!

mpenet commented 5 years ago

ok, getting somewhere. I get improvements. I am relying on the current benchmark suite but I am tempted to make a criterium or JMH based thing to get more precise results as it seems I get sometimes wildly different results for the same code.

If you could give a look to the changes I'd appreciate. I will try a bit more of these kind of optimizations then I ll re-organize this mess of commits in something you could merge (hopefully).

mpenet commented 5 years ago

related: https://github.com/replikativ/hasch/pull/6

whilo commented 5 years ago

Looks like nice work so far. I would love to get a sense for the speed-ups you are seeing.

mpenet commented 5 years ago

It varies quite a bit depending on the runs, but there seem to be always gains. In some cases this can be ~30% in other in the single digits. I am only relying on the current bench task at the moment. I suspect this branch is also more gc friendly overall. Also keep in mind I am running these benchmarks on my laptop, so I am not trusting the accuracy of these results as much as I would if I was running them in an isolated environment. I tried to do my best to limit the external "noise" but you never know.

I put 2 spreadsheets on google docs to compare:

The scales should be identical between the 2 spreadsheets graphs, opening both in separate tabs and going back and forth gives a somewhat decent view of the gains.

master is modified to use same bench task and same jvm args (actual branch is here https://github.com/mpenet/hitchhiker-tree/tree/master-p)

This is done via: lein bench perf_diff_experiment -b 10 -r 1 -- -b 20 -r 1 -- -b 40 -r 1 -- -b 80 -r 1 -- -b 160 -r 1 -- -b 320 -r 1 -- -b 640 -r 1

mpenet commented 5 years ago

Just noticed I have failures in hitchhiker.tree.messaging-test :(

mpenet commented 5 years ago

fixed

mpenet commented 5 years ago

more gains with last-key caching on latest commit. I added the link to the spreadsheet in the original comment with the other two.

I guess I could do this in a more functional way now that I think of it. I can give it a try tomorrow.

whilo commented 5 years ago

Very nice work. I will have not time to look at it until Monday, but will give feedback afterwards.

whilo commented 5 years ago

@mpenet How do you get along with dual backend (core.async vs. sync) btw.? We are worried whether this is pulling off potential contributors. On the other hand it allows to support JavaScript, which is kind of a big benefit.

mpenet commented 5 years ago

I like it. I usually worry about misuse or overuse of core.async but in that case it fits nicely. It has to be something optional tho, so having both sync/async interfaces makes sense imho, even if that means just building the sync api on top of the async one. There's one case where I have yet to make my mind is with superv.async in datahike, but it's probably because I am just discovering it. I have decent experience with erlang/otp so I guess this will click but it might be off-putting for potential contributors. I usually see/did error propagation with "control" channels (could be even simple promise-chans) between contexts, that can signal errors that way but I suspect that superv.async prolly does something similar + some plumbing for dealing with exceptions. But as I said I am not really familiar with its internals (yet).

mpenet commented 5 years ago

I updated the spreadsheet links with the last modifications.

mpenet commented 5 years ago

While fixing the tests for konserve I was thinking maybe it would be nicer not to have to do this kind of things (nilify some fields like storage-addr/last-key-cache) at "storage" level and have an intermediary serializable representation of various nodes we can generate from .core , maybe something like a tagged hash-map.

... or we can just have a function (in core) that sets these fields to "nil" and be done with it. I find it slightly ugly/inefficient to have to store nil fields tho.

whilo commented 5 years ago

There's one case where I have yet to make my mind is with superv.async in datahike, but it's probably because I am just discovering it. I have decent experience with erlang/otp so I guess this will click but it might be off-putting for potential contributors.

I have decided not to use superv.async in the end, because the overhead for the book-keeping seemed to add up in my experiments (which probably is not valid anymore with all the optimizations :) ) and we do not really have concurrency here, so I just use go-try to propagate IO errors with stack traces.

I usually see/did error propagation with "control" channels (could be even simple promise-chans) between contexts, that can signal errors that way but I suspect that superv.async prolly does something similar + some plumbing for dealing with exceptions. But as I said I am not really familiar with its internals (yet).

Yes. It actually also terminates concurrent go-routines and detects errors that are not taken care off (not taken off a channel). I would say it is a fairly good take to make core.async robust in an Erlang fashion, but maybe also a bit of a stretch. I would really love to hear your feedback, because I have only done tutorial code about Erlang and studied its error handling.

whilo commented 5 years ago

While fixing the tests for konserve I was thinking maybe it would be nicer not to have to do this kind of things (nilify some fields like storage-addr/last-key-cache) at "storage" level and have an intermediary serializable representation of various nodes we can generate from .core , maybe something like a tagged hash-map.

* a new protocol + functions for converting nodes to something more transparent than records -> a simple tagged hashmap `{::hh-tree.node-type :index-node + <rest of the fields> }`

* and from handlers in storage we just call these without having to worry about extra fields like stuff related to caching or other internal bits.

... or we can just have a function (in core) that sets these fields to "nil" and be done with it. I find it slightly ugly/inefficient to have to store nil fields tho.

Yes, totally agree. Lets use a protocol for that, I think. The nils are there to keep the record representation intact, which is kind of stupid. I think we can do this after this PR.

mpenet commented 5 years ago

Imho let it crash is quite often mentioned with incorrect interpretations, I found that in practice you end up being extra defensive against failures and "let it crash" only as last resort, retries/restarts are far from free and supervisors not magical. Sure, you rarely use exceptions in erlang but you shield yourself from errors with extensive pattern matching, dialyzer and other means. I have never found the need to design that way systems with clojure/java. Also in erlang processes are quite cheap, contained, with a memory model that allows to design last resort failure mode that way but it's not very well suited for the jvm in general imho.

I have used core.async in a few projects and I tend to convey errors as values (tagged or raw exceptions instances) simply, and let the user deal with it as they wish.

You can for instance create channels with a xform that will turn messages into something like [:ok <value>] [:error value] and reuse that xform everywhere: (async/chan ... (map as-result)), then it's a simple pattern match with case away in your code, which is quite decent for a very small cost. Another way to do it is to return (and potentially take as input) a "control" chan that will contain just errors and values that triggered their occurrence they you can handle these downstream with alt!(!) instead of case as before, but I prefer the other approach overall.

I try also to avoid using return channel from go blocks directly and prefer to create function that can take and return the input channel(s) instead, this way the user can specify type of chan, potential xform(s) at that level, buffering etc. This also allows some optimizations like using promise-chan instead of a full blown chan as return value, which opens the door to multiple takers without lots of copying etc etc. In short I use go blocks to "coordinate/manage" operations on channels but I prefer to control what's returned myself instead of deferring to go blocks channel return.

But I digress, this is another broad subject we can talk about on slack maybe.

mpenet commented 5 years ago

Weird, I will try to have a look as well.

mpenet commented 5 years ago

It seems it could be related to https://dev.clojure.org/jira/browse/CRRBV-15, simply bumping the dependencies could fix it. I will give it a try and report back here.

My bad I didn't test the cljs build, I should have really

mpenet commented 5 years ago

seems ok now:

$ lein cljsbuild once
Compiling ClojureScript...
WARNING: pos-int? already refers to: #'clojure.core/pos-int? in namespace: taoensso.encore, being replaced by: #'taoensso.encore/pos-int?
WARNING: bytes? already refers to: #'clojure.core/bytes? in namespace: taoensso.encore, being replaced by: #'taoensso.encore/bytes?
Compiling ["resources/public/js/core.js"] from ["src"]...
Successfully compiled ["resources/public/js/core.js"] in 3.239 seconds.
Compiling ["target/test.js"] from ["src" "test"]...
Successfully compiled ["target/test.js"] in 4.923 seconds.
whilo commented 5 years ago

This is annoying, somehow node test_node.js does not run the tests anymore... Investigating. Ideally they should be invoked with the clj test, but we have not had the time yet to fix that.

whilo commented 5 years ago

Ok, it is the <-cache macro that somehow expands wrong in cljs. I have made some progress, but it is not working yet.

mpenet commented 5 years ago

Weird. We can probably just make it a function, we actually have little reason to have it as a macro.

On Sun, Feb 17, 2019, 10:37 Christian Weilbach <notifications@github.com wrote:

Ok, it is the <-cache macro that somehow expands wrong in cljs. I have made some progress, but it is not working yet.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/replikativ/hitchhiker-tree/pull/3#issuecomment-464434511, or mute the thread https://github.com/notifications/unsubscribe-auth/AAGflgwiagpv-LnTv3CETobp4_yMTv-5ks5vOSLlgaJpZM4aSqJA .

whilo commented 5 years ago

Ok, that makes sense.

whilo commented 5 years ago

Ok, I got it figured out and checked the tests in the REPL, but the test runner for node is not working anymore.

whilo commented 5 years ago

A problem besides the macro expansion was that identical? does not work in cljs for keywords, because they are not interned, it seems. I am pushing this fix now and you can take a look at it. I would appreciate if we can fix the node cljs runner. If you can do it, I would appreciate it, otherwise I will try to do so towards the end of the week. Thanks a lot for putting the effort in, I am loving it! I hope we can press on from here and make the codebase both more accessible and clean as well as more useful for datahike et al.

mpenet commented 5 years ago

:champagne:

Interesting: https://cljs.github.io/api/cljs.core/keyword-identicalQMARK I never knew about this difference.

About node cljs runner I will try, but I have a busy (short) week unfortunately.

whilo commented 5 years ago

Ok. I am also fairly busy this week.