Closed andriewski closed 2 years ago
Desired implementation of @Singular @Builder with to builder would be
@Value
public class User {
int id;
List<String> roles;
public static UserBuilder builder() {
return new UserBuilder();
}
public UserBuilder toBuilder() {
return new UserBuilder().id(this.id)
.roles(this.roles == null
? java.util.Collections.emptyList()
: new ArrayList<>(this.roles));
}
public static class UserBuilder {
private int id;
private List<String> roles;
UserBuilder() {
}
public UserBuilder id(int id) {
this.id = id;
return this;
}
public UserBuilder role(String role) {
if (this.roles == null)
this.roles = new ArrayList<>();
this.roles.add(role);
return this;
}
public UserBuilder roles(List<String> roles) {
if (this.roles == null) {
this.roles = roles;
} else {
this.roles.addAll(roles);
}
return this;
}
public UserBuilder clearRoles() {
if (this.roles != null) {
this.roles = null;
}
return this;
}
public User build() {
List<String> roles;
switch (this.roles == null
? 0
: this.roles.size()) {
case 0:
roles = java.util.Collections.emptyList();
break;
case 1:
roles = java.util.Collections.singletonList(this.roles.get(0));
break;
default:
roles = java.util.Collections.unmodifiableList(new ArrayList<>(this.roles));
}
return new User(id, roles);
}
public String toString() {
return "User.UserBuilder(id=" + this.id + ", roles=" + this.roles + ")";
}
}
}
So, that would pass
public static void main(String[] args) {
// We're creating user and saving it
User user = User.builder()
.id(1)
.role("role1")
.role("role2")
.build();
User updatedUser = user.toBuilder()
.clearRoles()
.roles(List.of("role3"))
.build();
User userWithEmptyCollection = updatedUser.toBuilder()
.clearRoles()
.build();
User userWithNewSingleRole = user.toBuilder()
.clearRoles()
.role("role1")
.build();
User userWithAddedRole = user.toBuilder()
.role("role3")
.build();
User userWithAddedRoles = user.toBuilder()
.roles(List.of("role3", "role4"))
.build();
isTrue(updatedUser.getRoles().equals(List.of("role3")));
isTrue(userWithEmptyCollection.getRoles().isEmpty());
isTrue(userWithNewSingleRole.getRoles().equals(List.of("role1")));
isTrue(userWithAddedRole.getRoles().equals(List.of("role1", "role2", "role3")));
isTrue(userWithAddedRoles.getRoles().equals(List.of("role1", "role2", "role3", "role4")));
}
I don't think either of your suggestions is desirable or smart. Hence, denied - but read on - I may have misjudged your request or the performance impact. JMH results and profiler reports may change my mind (in which case, comment those and the issue will be reopened if warranted).
If I understand you correctly, you want 2 changes. One of them is straight up incorrect, one of them is a purported performance improvement, but I think it'll actually be a performance decrement. Because computers are more complex than claiming 'but it is O(n)
!'.
.addAll
Main idea is cool and it work perfectly fine, when init collection is not empty, otherwise -> System.arrayCopy // I don't understand why it's just not repalce it if this.collection is empty
If I understand you correctly, you think that if the current 'status' of the singular list in the builder is that it is empty (i.e. the private ArrayList<String> roles;
field in the builder instance is null
or an empty list), that we should not create a list and .addAll(argument)
, instead, we should just do roles = argument
.
Anyway it will be ummutable after .build() method is invoked
This is incorrect. It will be unmodifiable. That is not the same as immutable. The list ref we end up putting in the built object cannot be used to modify the list, but the list is not (neccessarily) immutable: It might be changed by other code. Here is an example of that:
List<String> list = new ArrayList<String>();
list.add("Role1");
someBuilder.roles(list);
User user = someBuilder.build();
System.out.println(user.getRoles().size()); // prints '1'
list.add("Role2");
System.out.println(user.getRoles().size()); // prints '2'
A different example of why what you suggest wouldn't work:
(NOTE FOR CONTEXT: List.of(a, b)
makes immutable lists!)
List<String> list = List.of("Role1", "Role2");
someBuilder.roles(list);
someBuilder.build();
Currently we 'inefficiently' make a copy of this already immutable list, and then make the copy unmodifiable to boot, during the build()
call. So now we have 2 identical lists on the heap, both immutable (an unmodifiable ref to a list for which no other refs exists is immutable), and that is indeed, as you noticed, inefficient. However, imagine you wrote this code instead:
List<String> list = List.of("Role1", "Role2");
someBuilder.roles(list);
someBuilder.role("Role3");
someBuilder.build();
Had we gone with your suggestion, we'd now get an exception on the .role("Role3")
line - after all, the builder's roles
field is pointing at that List.of()
result, which is immutable. We could write code that tries to add, catches the exception and creates a complete arraylist instead in the name of efficiency, but now we get into the tradeoff game, see below.
.clear()
It's better to set just null instead of clearing it. Clearing is a bit expencive opeartion. It takes O(N)
This is effectively incorrect. clear()
is not expensive, and is effectively O(1)
. clearing is often in fact less expensive - we already have a list object, we might as well keep it around. The reason it's effectively O(1) has to do with CPU caches. We already have an array and it is likely already in-cache. A tight loop that stays in-page is effectively O(1); it is exceedingly unlikely, as in, 1 in many millions, that a user of a builder would first add thousands of entries via the singular builder, then clear() away all that work, and then start over. What's more likely is that something pre-initializes the builder with a handful of initial items, and some sidetrack code wants to undo that. In which case we're clearing less than a page's worth of references which is instant, or at least, likely faster if you measure the load on the entire system (i.e. also code running in other places), as this is less likely to cause page swaps.
You're making performance suggestions whilst you're just spouting received knowledge about what you think is performant and what isn't. Performance is incredibly complicated. If you think you know about performance (and the fact that you had the audacity to write this feature request certainly suggest you feel pretty confident), you.. don't, and you're a danger to your codebases if you do. Don't feel bad - just about every programmer I know, including myself, has gone through this phase. In the distant past, it wasn't even a mistake - just eyeballing code and making judgements about performance was pretty reliable with sufficient experience and knowledge.
But this is simply no longer the case. Between hotspot, configurable garbage collection schemes, and CPUs that do a ton of optimization work on the fly (pipelines and the like), nobody can eyeball code and make performance guesstimates anymore. The JVM engineers, who are probably the most likely of all programmers on the entire planet to make performance calls just with eyeballs, have said the same thing at conferences and the like: You CANNOT eyeball code and make such casual performance observations. You will have to write extensive tests, using e.g. Java Microbenchmark Harness, and a lot of profiler reports.
Performance 'suggestions' are categorically denied without such things (JMH results and profiler reports).
You can come up with a ton of really cheap little bookkeepy actions and checks that would markedly improve performance in exotic situations. We could use Collections.singletonList
first, and then expand later, for example. Or we could check if the incoming list is itself immutable (unfortunately, java has made this complicated, but even so, there are ways), use it, and then when adding, check again and make a new arraylist after all in such cases.
Forget the burden of maintaining a considerably more complex code generation path for a moment: That's death by a thousand cuts. Those nearly free cheap little bookkeepy actions are applied to every builder usage in the entire ecosystem, whereas the performance benefit of us not making a pointless second version of an already immutable list is only a gain in a vanishingly small amount of cases. In those cases, it would be better, no doubt, but we're now putting on the scales 2 different options:
Which one is better? Well, actually, having the baker and the butcher pay the euro. Or is it? Perhaps the intent was that nobody'd feel it, so smearing the cost out across the village, even though that costs more, is more desirable. Or not? Who knows - the builder codegen doesn't know precisely what your use case ends up being, and doesn't want to provide you with a hopelessly huge wall of finicky little config options to let you tell it - because 99% (literally!) of the time none of it matters, this isn't the performance-crucial codepath, or it is but you're only adding a handful of things so it's all effectively instant due to being all-in-one-cache-page. Hence, if you have specific performance needs, you're in the 0.1% case, and therefore it is no longer boilerplate, and therefore lombok's answer is an easy: Just write it out longhand, if you have such exacting needs.
@rzwitserloot thank you for your reply
First of all, I would like to appoligize for not professional approach, of course I should've sent you benchmark results about any piece of code (crap).
Warmup: 10 iterations, 10 s each Measurement: 10 iterations, 10 s each Timeout: 10 min per iteration It took ~3h of testing
userWithXXXPhones
.clearPhones()
.phone("phone1")
.phone("phone2")
.build();
Custom means - just simple set null instead of .clear() invocation Original means - original implementaion
customClearWith0Objects thrpt 50 17087111.015 ± 222870.756 ops/s
customClearWith100Objects thrpt 50 17066783.642 ± 253036.898 ops/s
customClearWith10Objects thrpt 50 17009708.513 ± 280707.969 ops/s
customClearWith30Objects thrpt 50 16957658.087 ± 210687.424 ops/s
customClearWith3Objects thrpt 50 16553479.512 ± 399414.193 ops/s
originalClearWith0Objects thrpt 50 19372425.856 ± 63110.387 ops/s
originalClearWith100Objects thrpt 50 12275440.653 ± 313264.782 ops/s
originalClearWith10Objects thrpt 50 19336396.755 ± 215891.888 ops/s
originalClearWith30Objects thrpt 50 17188988.843 ± 180287.831 ops/s
originalClearWith3Objects thrpt 50 19860686.659 ± 474809.643 ops/s
I tested with GC profile, but I'm not sure you are interested in it for now.
This result show, that for really common needs - .clear() is surelly enough and array allocation is quite costy operation, so .clear() for perfectly fine up to 30 objects and considering it less affect for GC cycle for allocation extra array of objects - that's really great
But, for more than 30 objects, I got only test for 100 objects - It shows linear complexity - kinda O(N), so we can make a conclusion, that it really depends on list size to make it clearance.
Of course as you said, If we have some specific situation - we should implement it ourselves.
Honestly, the reason why I started spilling with my own naive solutions (crap) was that behaviour that @Singular annotation makes is quite tricky
If we have just this peace of code
User user = findById(1);
List<String> updatedPhones = findPhones();
user.toBuilder()
.phones(phones)
.build();
It behaves differently if we have @Singular annotation or we don't have it
But we can not be sure what will happen, unless we check our User class implementation, so in my humble opition, that's a bit tricky, but as I said @Singular annotation provides really cool features for any collection
That would be really great if you make maybe in the future, something that would work less triky
Something like
@Bulder.Default
List<String> list = Collections.emptyList();
But It would work like if list is null - that would init it with emptyList(), since for now
// list would be null
User.builder()
.list(null)
.build()
// list would be empty list
User.builder()
.build()
It could be something like
@Unmodifiable
List<String> users
And it would do that cool .build() part where we add emptyList or make it with Collections.unmodifiableList()
Of course you are right with that statement, that it's not immutable, since if we have link to it - we can modify it, so I understand why you made it with .addAll() method, but from my perspective as a Java developer - I guess that would be really cool if you added something that it works like simple collection setter for builder and make it unmodifiable afterwards, since nobody would modify the collection's link, since it just doesn't make any sence
Anyway, I would like to say thank you for your contribution in Java community, since each and everyone uses this library for their projects and to put it mildly Java would be less user friendly without it
First of all, I would like to appoligize for not professional approach, of course I should've sent you benchmark results about any piece of code (crap).
Hey, we haven't posted that rule anywhere, so there is no need whatsoever to apologise.
Thanks for that performance report! It's good to know that the breakeven point is somewhere in the 20-100 items mark. I think we can stick with .clear()
given that I bet 'sub 20 items' is far more common, but, it is a bit painful that clearing thousands of items is pricey.
However, JMH reports do not report on future impact. Leaving a huge arraylist around for the garbage collector to chew through is not free, but a JMH report is never going to tell you about that cost. It'll be amortized over a very long runtime, across all code that JVM is running. But it's still a cost, which gives .clear()
some extra bonus points, carrying it across the line here I think.
It behaves differently if we have https://github.com/Singular annotation or we don't have it
Yes, I see that. This is unfortunate. I'm not quite sure how we can address this; I guess in retrospect possibly .phones
should let you set the list verbatim, and perhaps .phone
(the singular) isn't just varargs, it also gets overloaded with a Collection<? extends T>
taking list variant that does what .phones
does now.
But, changing all that now would practically require releasing lombok v2
– that breaks every usage of @Singular
out there.
But It would work like if list is null - that would init it with emptyList(), since for now
I'm not sure I follow here. Are you saying: You'd like some feature where if some code explicitly calls .list(null)
, that lombok will replace that during build()
with an empty list? Denied, categorically. That 'view' of what null
means (a placeholder for default value / a sentinel for an empty value) is somewhat common in the java ecosystem but lombok espouses the more logical take on what null
is supposed to mean in java code: A real 'value', and it means 'unset, not applicable, missing, or otherwise unknown'. In other words, the fact that any attempt to interact with what it is referencing should be throwing an NPE.
Also, @Unmodifiable
as a term means more or less the same thing as final
, so it's a bit odd that annotating with @Unmodifiable
means: "The thing that this variable points at is intended to be unmodifiable; not this variable itself". Also, "that means null
is replaced with an immutable empty list ref" is a total non-sequitur. What does that have to do with unmodifiability? a null pointer is quite unmodifiable, after all.
It would, however, be kinda neat if you can 'unset' in a builder - i.e. explicitly tell the system to go back to whatever is default. However, in general, the job of re/un-setting anything in a builder is a sub-1% use case as is. Few usages of builder ever use any overwrite/unsetting behaviour. It's most common, granted, in .toBuilder()
scenarios.
In general toBuilder() is where the real mess occurs: There you may want to say: "Go back to default value" (as per @Builder.Default
), or indeed where the fact that .phones()
on a @Singular
list adds more phones, whereas without @Singular
it would replace. On a fresh builder that is mostly irrelevant, on a .toBuilder()
approach that's a different story altogether.
I'm not quite sure what an update would look like, though - any annotations are likely just confusing (e.g. @Unmodifiable
, I certainly have no idea what that's supposed to do here), and in any case, just about everything that even gets anywhere near simple and understandable would at this point no longer be backwards compatible. What's left is a forest of settings and exotic annotations, and that's not worth the maintenance burden, I'm afraid.
I'll leave this issue closed, but I'm taking under advisement that builder has some design issues especially when .toBuilder()
is involved. I'll keep an eye out - if ever I see a clean solution, I'll be on it :)
I'd suggest to modify current @Singular annotation behaviour or just add special flag to maintan that behaviour.
The problem in current implementation is if you find Object with field marked as @Singular and you just want to replace items you will need to user .clearItems() and then .items(), but generated code is not really efficient. Basing clear() method for list iterates over whole collection and set null to each element. Then .items() method iterates over one collection to add and add them to inner collection, so it's O(N) + O(N) complexity for both cases.
In fact for for .items() it uses System.arrayCopy() so it works almost perfectly fine, but still just replacing it should be better, since you just need to change pointer for collection.
So we just don't use that annotation for our projects, even though that one has really cool features.
There is just base code snippet, what I ment: