arximboldi / immer

Postmodern immutable and persistent data structures for C++ — value semantics at scale
https://sinusoid.es/immer
Boost Software License 1.0
2.51k stars 184 forks source link

Implementation of map_transient, set_transient #82

Closed philippecp closed 1 year ago

philippecp commented 5 years ago

Any timelines on these?

arximboldi commented 5 years ago

Sadly nope, I am quite busy with client work lately and this is not trivial work (might take one or two full-time weeks in total, since it needs improving how the optimized allocators work--which would also bring improvements to vectors and flex vectors-- and I would like to do all the optimization, testing and fuzzing that I've done for the other structures to ensure it is production quality).

Are you using the library in a commercial project? This feature would be a good candidate for sponsoring the project. We could make an agreement with concrete deadlines and tune it for your performance requirements.

jzrake commented 5 years ago

I'd like to second this request. If there's any way to delegate some of the work, I'm happy to volunteer.

jeaye commented 3 years ago

I'm only using this for an open source Clojure-like language, but map transients would be a huge win. Right now, profiling shows my benchmark program spends over 20% of its time just building maps. (EDIT: Turns out it's faster to copy and mutate std::map)

What would it take for an individual to sponsor you to work on this?

Further edit: making a custom map which is just a vector of pairs cut my benchmark program's time by ~50%. Insertion is O(N ^ 2), but I only have a few keys in each map, so it's faster than using std::map, which is faster than using immer::map (without transients).

arximboldi commented 3 years ago

Yes short maps tend to be faster as vectors, even with std::vector vs std::unordered_map.

I must admit that is really top priority for my next focused time on Immer, I have sadly just been very busy lately... I am aiming to work on this at some point this summer. Of course contributions both in time or sponsorship are more than welcome! If you don't have a company using Immer directly is hard for me to suggest a specific contribution though...

jeaye commented 2 years ago

Hi there! Just checking in here to say I'm still very interested in this particular issue. Would be willing to put a bounty on it, if immer supported such a thing.

arximboldi commented 2 years ago

Drop me an email at juanpe@sinusoid.es, there is some progress :)

jeaye commented 1 year ago

I saw this was done in the latest release. Thank you! We can likely close this and I'll be giving the transients a try soon.

Out of curiosity, do you have any plans for short map support?

arximboldi commented 1 year ago

What do you mean by short map?

arximboldi commented 1 year ago

And thanks for the kind words, glad this is helpful!!

arximboldi commented 1 year ago

Oh, and I see you are using this for Jank, which looks like an amazing project! Are you using Immer for anything else? And Jank, are you using it in production or is it at "hobby" stage? The code looks pretty nice. And I could use actually make use of something like Jank for sure!

jeaye commented 1 year ago

What do you mean by short map?

For smaller maps, a vector of pairs is significantly faster than a hash map (lookup is O(n), and insertions require a fully copy, but they beat the overhead of the hash map). Clojure supports both, seamlessly, as PersistentArrayMap (short map) and PersistentHashMap. When adding to a short map, once there are enough values (16 for Clojure), it switches to a hash map.

This is why, in Clojure, the order of keys in {:a 1 :b 2 :c 3} is the order we write them in, since it's a PersistentArrayMap, but not if there are more than 16 keys.

The relevant Clojure source is here: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentArrayMap.java

In my own testing, with jank, the difference between a naive short map (vector of pairs) and immer's maps is very noticeable.

Are you using Immer for anything else? And Jank, are you using it in production or is it at "hobby" stage? Just using Immer for jank right now, though it's a crucial part of jank's runtime. I'm trying to hit parity with Clojure right now, which is what leads me to asking about these short maps.

jank is not production ready, but it's moving quickly. Some of the large challenges, such as JIT compilation and C++ interop have been tackled. Mainly, what remains is implementing the remaining parts of clojure.core so jank can actually be useful.

arximboldi commented 1 year ago

I see, that could be an interesting addition indeed. I guess in the meantime you can do that in Jank by wrapping the Immer map into something else?

jank is not production ready, but it's moving quickly. Some of the large challenges, such as JIT compilation and C++ interop have been tackled. Mainly, what remains is implementing the remaining parts of clojure.core so jank can actually be useful.

Nice! Looking forward to see it evolve. I think this is a great project... having a native and embeddable Clojure with nice C++ interop can be a game changer!

arximboldi commented 1 year ago

Also, while probably not crucial, I'm curious about Jank vs Java Clojure vs ClojureScript performance :smile: Have you got some benchmarks or other data?

jeaye commented 1 year ago

Also, while probably not crucial, I'm curious about Jank vs Java Clojure vs ClojureScript performance smile Have you got some benchmarks or other data?

Nothing to show yet. Expect start up time for AOT compiled builds to be much lower. Runtime speed will likely remain in the same class as Clojure JVM, but with less memory usage.

The last iteration of jank (focused only on AOT and not JIT) did compete well against Clojure JVM, given a ray tracer I wrote, but there's still so much more perf work to be done!

You're most welcome to come join #jank!

arximboldi commented 1 year ago

That sounds already like really good results, considering Clojure runs on the JVM which has been optimized over decades. Very promising!

You're most welcome to come join #jank!

I somehow can't login to that Slack group...

jeaye commented 1 year ago

Following up on this, since I figured you'd be interested. Ended up going down an optimization rabbit hole, implementing Clojure's sequences and seeing how far jank can be pushed: https://jank-lang.org/blog/2023-01-13-optimizing-sequences/

Here's an invite link to the Slack group. Might work better. https://join.slack.com/t/clojurians/shared_invite/zt-1n66n95ra-d_vFlIi3fSfEmvSKOuTjpw

arximboldi commented 1 year ago

Very interesting, thanks for sharing!