Closed ztstewart closed 3 years ago
Merging #79 into master will increase coverage by
0.07%
. The diff coverage is100%
.
@@ Coverage Diff @@
## master #79 +/- ##
==========================================
+ Coverage 99.05% 99.12% +0.07%
==========================================
Files 4 5 +1
Lines 318 344 +26
==========================================
+ Hits 315 341 +26
Misses 2 2
Partials 1 1
Impacted Files | Coverage Δ | |
---|---|---|
slice_conversion.go | 100% <100%> (ø) |
|
uuid.go | 100% <100%> (ø) |
:arrow_up: |
codec.go | 100% <100%> (ø) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update abfe188...2b7aac2. Read the comment docs.
The CI build passes now; apologies for the spam.
As noted, this change improves the performance of FromString
and similar methods by a decent margin. More importantly, it means that UUIDs can live on the stack when parsed from a string or bytes, saving garbage collection time.
A few things I encountered that tripped me up:
fmt.Errorf
to errors.New
on a line that is uncovered in defaultHWAddrFunc
. My attempt to modify the code to cover it made UUID generation slower, so I've reverted my original change.golang.org/x/tools/cmd/cover
does not work if there are subpackages. I wanted to put the slice logic in internal/unsafe
which would be cleaner, but creating that package fails the build. If we're OK with dropping 1.9 and lower from CI or have a workaround I can move it around again.Thanks for taking a look, Zach
@acln0 @theckman, I believe you two reviewed the last couple of PRs. I'd greatly appreciate it if you could take a look.
Thanks a bunch, Zach
My 2 cents ... resorting to unsafe
(and introducing specifically called out UB and panics) for the sake of shaving a handful of nanoseconds off of generating formatted error messages is a net loss for a general purpose library. The new code is also significantly more complex than before and, in my opinion, requires some intimate knowledge of the internals of Go's memory management to make sense of why anyone would even care.
I would argue that a system where those <20 nanoseconds make a difference should probably be implementing their own UUID type.
HI @dylan-bourque, thanks for reading the PR, and I appreciate the feedback.
My 2 cents ... resorting to unsafe (and introducing specifically called out UB and panics) for the sake of shaving a handful of nanoseconds off of generating formatted error messages is a net loss for a general purpose library.
I would argue the opposite regarding generality: without these optimizations, the library becomes less generally useful. Reducing allocations in the NewFromString
path to 0 means that a UUID can be stack allocated, which means that at scale you can keep the garbage collector happy. Wrapping it into the library itself means everyone who uses it benefits without any changes in the calling code.
As for panics: Panics can crop up inside the library if Go changes how it lays out strings (unlikely, given how proposals related to that are tagged with Go2
) or if the library starts modifying user input in UnmarshalText
. Modifying the input would be a bug, regardless of whether or not this PR is landed.
I'm happy to add tests for functions that return []byte
to guarantee that modifying the output does not cause panics.
The new code is also significantly more complex than before and, in my opinion, requires some intimate knowledge of the internals of Go's memory management to make sense of why anyone would even care.
I think you raise a valid point: We want to keep the bar for contribution as low as possible. That means code that is easy to read and understand, and I agree that's what we should optimize for.
However, I think the changes here don't require much specialized knowledge outside of the unsafe
code, though I've encountered variations on this trick in a few places (including the standard library).
My hope is that the clarifying comments which I have added (I would love your feedback on them!) can help ensure that future contributors understand the motivation for the changes I've made in error logging code. I personally feel that make
and copy
are not very complex, being built in functions, but if you have suggestions on how to make the changes less complex, I'd be very appreciative of your suggestions.
I would argue that a system where those <20 nanoseconds make a difference should probably be implementing their own UUID type.
That's certainly an option, and one I'm willing to pursue if we cannot come to an agreement on how to move forward. However, the work required to significantly optimize this library is fairly small and helps the entire Go community. I feel it would be a loss to the wider community to introduce yet another UUID library, and I'd love to work with you to avoid that possibility.
Thanks again for you feedback, Zach
@ztstewart
By "general purpose", I meant that I think the vast majority of this library's consumers (the general case) won't actually care about those few nanoseconds and/or whether or not those 16 bytes escape to the heap when a UUID parsing error occurs.
I would argue the opposite regarding generality: without these optimizations, the library becomes less generally useful
I also have concerns that these micro-optimizations could well turn out to be architecture/OS/platform/Go version/etc. dependent.
For sure, though, this is my opinion - and you know what they say about those :-) - and I'm definitely not any sort of "owner" over the project.
@dylan-bourque while I agree that code should be as simple as possible, this code is not overly complex. And in addition it is used only in marshaling of the UUID to output. I would be OK with this being merged if it can be proven safe.
@lunemec I still think this change is, for lack of a better word, unnecessary. It adds more lines of warning comments than actual code and is optimizing for stack vs. heap allocations done inside of the standard library (which are subject to change in later Go releases). It also introduces the use of unsafe
to what is a very simple library for the sake of those micro-optimizations.
I won't block this PR but I'm also not going to approve it.
@ztstewart I guess since you put effort into optimizing that this is something important for your use-case?
@dylan-bourque maybe adding this as compile tag (only on-demand) would satisfy you, and it would allow users who want extra fast UUID parsing to use the unsafe method?
As I said, I won't block the change if others approve and are willing to accept it.
Thanks for the contribution here. For the most part, this review has sat idle for a while due to some strong objections about increase of code complexity and use of unsafe methods for unclear performance benefit. Going to close this. If anybody has any objections, we can continue to talk about the changes here, and the PR can be re-opened.
UUIDs are implemented as a fixed length array, giving great performance.
However, while profiling, I've found a few cases in the parsing code that can lead to heap allocations:
We can avoid these by doing the following:
This PR implements both of these strategies. As a result, various parsing methods are faster.
Old benchmark results:
New benchmark results:
Diff: