vigna / fastutil

fastutil extends the Java™ Collections Framework by providing type-specific maps, sets, lists and queues.
Apache License 2.0
1.76k stars 196 forks source link

Array getter/setter for ArrayMaps #267

Closed TheMode closed 2 years ago

TheMode commented 2 years ago

I am in a situation where I would like to ensure no identical arrays exist across all of my maps (Object2ObjectArrayMap) in order to reduce memory footprint, I however couldn't find a way to retrieve the raw array like we can for list elements.

incaseoftrouble commented 2 years ago

Not sure what you want to achieve.

You want maps with identical backing arrays to share the memory? This of course only makes sense if they are immutable. But, if the backing arrays are identical, the maps of course are identical, too. Why not simply merge the reference to the map? (Also, the maps might be identical if the backing arrays are not! So this might even be a more specific check).

TheMode commented 2 years ago

Multiple maps sharing the same key arrays could still have different values, in which case sharing the maps wouldn't be enough. Sure I could make my own array map implementation and do all my fancy optimizations to then forward my references to fastutil's, but I assume that exposing the getters/setters directly is a better & simpler solution

incaseoftrouble commented 2 years ago

Ah, fair enough!

Its not for me to decide, but I guess that this is way too specific for fastutil. Exposing the arrays leads to many potential for misuse (again, they have to be immutable for this to make sense). There are a lot of subtleties involved due to how hashing works, so even if you have the same key set, the backing arrays could be vastly different (even order of insertion matters).

I'd suggest simply employing reflection for this.

In any case, if memory is really critical, a specific implementation might be much better (have one big key/value hash map which contains all different entries and track a (bit)set for each different map instance, storing which elements are contained in which map, equality / merging of maps can then be checked by comparing the corresponding bitsets).

TheMode commented 2 years ago

I would just prefer to avoid making custom types for non reason other than some micro memory management. I do not believe that this is too specific, as fastutil already expose similar methods (e.g. new LongArrayList().elements() exposing the raw array without any defensive copy).

Object2ObjectArrayMap mentioned in my case does not use any hashing and is pretty simple to understand, I wouldn't mind those getters/setters available to all the other map implementations for those really wanting to. But this suggestion is about array maps.

incaseoftrouble commented 2 years ago

Object2ObjectArrayMap mentioned in my case does not use any hashing and is pretty simple to understand

Oh, right, sorry, I was thinking about open hash maps. I didn't expect array maps to be used (due to their performance).

For ArrayMaps in particular, I can see the value of this change. However, your use case would also require being able to set the reference, right? (so something like .setElements(a) for ArrayList). This might get a bit hairy with the handling of null values / deriving the new size of the list / map.

TheMode commented 2 years ago

Oh, right, sorry, I was thinking about open hash maps. I didn't expect array maps to be used (due to their performance).

Understandable, it does make sense for my very specific use case where values are not often queried, very small (1-10 entries), and with little overhead.

For ArrayMaps in particular, I can see the value of this change. However, your use case would also require being able to set the reference, right? (so something like .setElements(a) for ArrayList). This might get a bit hairy with the handling of null values / deriving the new size of the list / map.

Indeed, those methods would be required:

Behavior wise I believe that those could just be considered "unsafe", or just ensuring that the new array has the same length as the current one. The alternative would be to do as you said and use reflection, I do not like it but perhaps that I am missing something important.

incaseoftrouble commented 2 years ago

I would maybe even prefer setElements(Object[], Object[], int) (so also specifying the size). However, this basically is the same as just calling the wrapping constructor (which would be a bit cleaner in my opinion, and similar to ArrayList). Then you would of course need to update the references to the map instead. Would that work for you (i.e., adding keyElements() and valueElements() to ArrayMap and then "merging" by using the shallow wrapping constructors)?

TheMode commented 2 years ago

Perfectly fine for my use-case as I only need those during initialization, though it would force the allocation of a new map which may not be ideal for everyone. setElements(Object[], Object[], int) + the getters should be able to cover everything (remains to know how problematic the map allocation can be?).

incaseoftrouble commented 2 years ago

A wrapping constructor already exists! The overhead of object allocation is minimal (JVM can easily handle millions-billions of object creation per second nowadays). So only the getters would be required, right?

TheMode commented 2 years ago

Sure it would work all the same and I do not mind either way. Just saying that it wouldn't hurt to have an allocation-free method since the map is already mutable, could also make it easier to use as you wouldn't be forced to store the map in a non-final field or AtomicReference/Wrapper

incaseoftrouble commented 2 years ago

I'm just trying to establish the possibilities, its up to vigna to decide what should be done :-) 1) add getters to array map 2) add getters to array map + add setters to array map and array list (with array + size)