Closed techsy730 closed 3 years ago
Oh I didn't know. No, I would prefer to follow the platform behavior... since you're at it, would you fix it? Thanks!
Sure. I can fix it.
But I was thinking it would be nice to have a version of of(...)
that returns an unmodifiable List/Set. That way if the size is 0 or 1, the empty or singleton version can be returned, and we can potentially swap implementations based on the number of elements.
Should that live under "List" or "Lists"? And should that be a different PR?
I have added indeed to type-specific List and Set an .of() method, presently delegating to the array-base and hash-based implementations. Is the Java version specified as immutable?
Yes, the versions of "of" that lives in the root collection interfaces are specified as immutable. https://docs.oracle.com/javase/9/docs/api/java/util/Set.html#of--
I think having mutable versions of these in the mutable subclasses is a fine API though. But we may want an immutable version too that behaves like the platform's version.
Well, then I'd say that's very reasonable to have an implementation .of() method that just return the implementation, and an interface .of method that returns an optimized potentially immutable result. I'm not sure I want to guarantee immutability because that will make the implementation slower. Unless we subclass OpenHashSet, say, and override the mutation methods.
What do you think?
OMG they have specialized instances up to 10 elements. Just to avoid allocation of an array, I guess...
I think guaranteeing immutability of the result is good. This guarantees that you don't have to do defensive copies / wrapping (trading a bit of speed for good design paradigms).
OMG they have specialized instances up to 10 elements. Just to avoid allocation of an array, I guess...
And to allow for specialized implementations I guess. Maybe up to x elements it makes sense to implement contains
etc. via if
slides.
Actually, with ints, you could implement such immutable sets / maps with switch
statements. These should be stupid fast.
switch statements? Can you elaborate on that?
switch statements? Can you elaborate on that?
oops my bad, switch branches of course have to be compile time constants 🤦 shouldn't write things without thinking.
Ok, I have cleaned up the root interface of
methods. Also provided some overloads to avoid temporary array creation (but only up to 3).
Right now they do return immutable lists/sets, but besides the singleton and empty case they are mere normal implementations wrapped in the type specific unmodifiable
. A little bit of overhead yes, but any JVM worth anything should be able to optimize that away.
This seems to be ready to merge for me. Comments?
This seems to be ready to merge for me. Comments?
I added some, only minor stuff / naming convention.
I'm not so sure about the value of the mutable of
methods with arguments. @techsy730 Do you have a particular use case in mind for them? IMO a simple .create()
or .of()
is sufficient (maybe rather give those methods parameters like expectedSize
etc., similar to Guava; See for example HashBasedTable's three create
variants)
create
for mutable return types and of
for immutable sounds like a reasonable change to me.
As for why we have argument overloads in addition to the varargs one. It is to avoid creating an array for cases that don't need an array. For example, if we didn't have no-arg overload, then List.of() and OpenHashSet.create() called with no arguments would create a new empty array each call that is going to be thrown away.
Mmmh. No, I like .of() everywhere—my idea was that .of() in an implementation gives you an object of that implementation, .of() in the interface gives you an immutable implementation.
I'm thinking of splitting the methods of OpenHashSet, adding an intermediate ImmutableOpenHashSet class that might be useful for these implementations. It's just a matter of moving the mutation methods and probably copy-and-paste some code for the iterator.
BTW, interesting comment in Guava:
@SafeVarargs // For Eclipse. For internal javac we have disabled this pointless type of warning.
Mmmh. No, I like .of() everywhere—my idea was that .of() in an implementation gives you an object of that implementation, .of() in the interface gives you an immutable implementation.
I see. Fair enough. I'm just afraid that this might cause confusion because of name clashing.
I'm thinking of splitting the methods of OpenHashSet, adding an intermediate ImmutableOpenHashSet class that might be useful for these implementations. It's just a matter of moving the mutation methods and probably copy-and-paste some code for the iterator.
If we are doing this: Maybe aggregate the methods in a hidden "AbstractOpenHashSet" and add a public "ImmutableOpenHashSet" subclass as sibling of "OpenHashSet". Intermediate might be dangerous since then if (set instanceof ImmutableOpenHashSet)
doesn't really do what it should. Additionally, the public immutables could have a shared ImmutableCollection
marker interface, so if you have something generic like IntCollection
you could check instanceof ImmutableCollection
.
As for why we have argument overloads in addition to the varargs one. It is to avoid creating an array for cases that don't need an array. For example, if we didn't have no-arg overload, then List.of() and OpenHashSet.create() called with no arguments would create a new empty array each call that is going to be thrown away.
I'm aware of that, I think you misunderstood me. My comment was only about the naming, not that there are "empty" factory methods.
If we are doing this: Maybe aggregate the methods in a hidden "AbstractOpenHashSet" and add a public "ImmutableOpenHashSet" subclass as sibling of "OpenHashSet". Intermediate might be dangerous since then
if (set instanceof ImmutableOpenHashSet)
doesn't really do what it should. Additionally, the public immutables could have a sharedImmutableCollection
marker interface, so if you have something generic likeIntCollection
you could checkinstanceof ImmutableCollection
.
Yes, I was unclear but that was the idea. Also because I'd like to cache the hash code, as in Guava, so that OpenHahSets of immutable sets would have much faster access. That could be useful in a number of situations.
But I think all this should follow a merge with #120 , or we will get in a mess of conflicts.
Sounds good. And I need to also do a similar thing to BigList, BigArrayBigList, and OpenHashBigSet anyways (We don't have a BigSet interface, so I am wondering where we are going to put the immutable big set versions).
Well, a big set is just a big implementation of a set. Except for the unfortunate size() method returning an int, there's no method in Set that doesn't apply to big sets. So an OpenHashBitSet.of() method might be useful in weird circumstances, but definitely we do not need methods returning an immutable big set.
When are positioned now with this? Should I merge or there is something else you want to add?
Ideally I would like to add such methods to BigList and BigArrayBigList as well.
I do need to clean up the @SafeVarargs stuff and documentation, bring it into line with Guava's policy. Or perhaps not use the array given as the backing array (clone it first, and ensure it is Object[] exactly for the object based versions), and use the array given as is in some wrap
method (which some of them do, this would apply this convention consistently)
Yes, I would avoid copying as much as possible. What is Guava's policy? I see comments about "useless annotation".
OK, there are now type-specific ImmutableList types (just a backing array and access method). I had to do a little bit of code duplication, but, still, very little. I do not think it makes much sense to have an ImmutableBigList type.
I'm merging this and replace in List.of() the new immutable stuff. Apart for clarifying the @SafeVarargs stuff everything looks fine to me.
Clean up the of methods()
of
methods in the rootList
andSet
interfaces return immutable collections (like the platform does)Because of this validation, it is now worth it to have overloads of 0 and 1 element, which skip the temporary array and validation (0 and 1 element sets can never have duplication).
In the spirit of ArraySet's "just trust me" API of working raw arrays unvalidated, an it has an offshoot
ofUnchecked
that takes the elements as is and does no validation, like the array accepting constructors do.