Closed adg closed 1 year ago
Alternative API:
type Server struct {
once sync.TryOnce[*sql.DB]
}
func (s *Server) db() (*sql.DB, error) {
return s.once.Do(func() (*sql.DB, error) {
return sql.Open("sqlite", dbPath)
})
}
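For readers who want to see what such a type might look like, here is a minimal sketch built on the existing sync.Once. The name TryOnce follows the comment above, but the implementation, the openDB stand-in, and the calls counter are assumptions for illustration, not the proposal's or any library's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// TryOnce is a hypothetical sketch: it runs the function passed to the
// first Do call and caches both the value and the error.
type TryOnce[T any] struct {
	once sync.Once
	val  T
	err  error
}

// Do calls f only on the first invocation; later calls return the cached result.
func (o *TryOnce[T]) Do(f func() (T, error)) (T, error) {
	o.once.Do(func() { o.val, o.err = f() })
	return o.val, o.err
}

// calls counts how often the stand-in initializer runs.
var calls int

// openDB is a hypothetical stand-in for sql.Open that always fails.
func openDB() (string, error) {
	calls++
	return "", errors.New("open failed")
}

func main() {
	var once TryOnce[string]
	_, err1 := once.Do(openDB)
	_, err2 := once.Do(openDB)
	fmt.Println(err1, err2, calls)
}
```

Note that in this sketch a failed initialization is cached just like a successful one, so the second call returns the same error without running openDB again.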
I have one of these in https://github.com/carlmjohnson/syncx. I think it’s suitable for the standard library, but it should probably come as part of a std wide addition of generics, not a one off.
I played around with this in a large codebase. It's a nice idea.
I do prefer @icholy's API suggestion. It seems that the ergonomics are about the same and also that it would be a little harder to accidentally misuse. I could see people writing
func GetFoo() (*Foo, error) {
return OnceFunc(getFoo)
}
whereas the equivalent mistake with TryOnce seems less likely (especially if people are used to sync.Once already). And in general, it just seems a little nicer/more typical to have the shared state be a struct value rather than a function; for example, consider
var fooOnce TryOnce[*Foo]
func GetFoo() (*Foo, error) { return fooOnce.Do(getFoo) }
vs.
var getFooOnce = OnceFunc(getFoo)
func GetFoo() (*Foo, error) { return getFooOnce() }
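To make the misuse concern concrete, here is a rough sketch. It assumes a closure-based OnceFunc written by hand (not a real sync API at the time of this comment), and it assumes the mistake takes the compiling form of building the wrapper and calling it immediately inside the function. The names Foo, loads, GetFooWrong, and GetFooRight are made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

type Foo struct{ id int }

// loads counts how many times the underlying initializer runs.
var loads int

func getFoo() (*Foo, error) {
	loads++
	return &Foo{id: loads}, nil
}

// OnceFunc is a hypothetical sketch of the closure-based API.
func OnceFunc[T any](f func() (T, error)) func() (T, error) {
	var (
		once sync.Once
		val  T
		err  error
	)
	return func() (T, error) {
		once.Do(func() { val, err = f() })
		return val, err
	}
}

// GetFooWrong builds a fresh wrapper on every call, so nothing is shared
// and getFoo runs each time.
func GetFooWrong() (*Foo, error) { return OnceFunc(getFoo)() }

var getFooOnce = OnceFunc(getFoo)

// GetFooRight shares one wrapper, so getFoo runs at most once.
func GetFooRight() (*Foo, error) { return getFooOnce() }

func main() {
	a, _ := GetFooWrong()
	b, _ := GetFooWrong()
	c, _ := GetFooRight()
	d, _ := GetFooRight()
	fmt.Println(a == b, c == d)
}
```

The wrong version compiles and returns plausible values, which is what makes this kind of mistake easy to miss.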
When I was looking at how sync.Once gets used in my codebase, I found that I could categorize them roughly four ways:
os.Exit, log.Fatal, etc).
(T, error).
For (3) and (4), sync.Once seems about optimal right now. This proposal helps with (2). But I noticed that (1) is even more common than (2). So maybe having both would be best:
type OnceVal[T any] struct { /* ... */ }
func (o *OnceVal[T]) Do(f func() T) T
type OnceError[T any] struct { /* ... */ }
func (o *OnceError[T]) Do(f func() (T, error)) (T, error)
or even
type Once1[T any] struct { /* ... */ }
func (o *Once1[T]) Do(f func() T) T
type Once2[T1, T2 any] struct { /* ... */ }
func (o *Once2[T1, T2]) Do(f func() (T1, T2)) (T1, T2)
FWIW, I wrote the closure version and find it much more ergonomic. It wouldn't occur to me to write
func GetFoo() (*Foo, error) {
return OnceFunc(getFoo)
}
since it obviously should be var GetFoo = sync.OnceFunc(getFoo), but it's hard to predict what kind of errors people will make en masse until it's in the wild.
Also I don't think it makes sense for this to return an error for the reasons given here: https://github.com/golang/go/issues/53696#issuecomment-1176238913
I also had a situation where I wanted to make some initialization in my app lazy (because it runs on Lambda and cold start is a pain), but the initialization could fail. Once that's the situation, there's no good API (at least none that I've seen). The errors have to be dealt with somewhere. (If they could be ignored, regular sync.Once would work.) If the system assumes initialization has already happened, the path to deal with the error isn't there and all you can do is crash. If it doesn't make that assumption, you need to handle the error every time you interact with the object, so it's not really "initialized" just "gettable".
@carlmjohnson wrote:
The errors have to be dealt with somewhere. (If they could be ignored, regular sync.Once would work.) If the system assumes initialization has already happened, the path to deal with the error isn't there and all you can do is crash. If it doesn't make that assumption, you need to handle the error every time you interact with the object, so it's not really "initialized" just "gettable".
I proposed a OnceFunc that returns (T, error) because my code, and other code I have observed in the wild, often stores an initialization error alongside the initialized value.
A few examples I quickly pulled from the Go core (there are many more, I didn't want to look exhaustively):
Your circumstances may call for a different error handling mechanism (crashing), but others prefer to handle the error every time the resource is requested. Then upstream callers can decide whether it's crash-worthy or not.
I think that without the error return value the proposed OnceFunc is not very useful. Otherwise you should just do the initialization earlier, since the program should crash if the resource isn't available anyway, or you don't care about handling the error, in which case (as you say) the existing sync.Once gives you almost everything you need already.
@icholy suggested:
Alternative API:
Let me just expand your suggested API to do exactly what the other examples are doing, so that it's a fair comparison:
type Server struct {
dbPath string
dbOnce sync.TryOnce[*sql.DB]
}
func NewServer(dbPath string) *Server {
return &Server{dbPath: dbPath}
}
func (s *Server) db() (*sql.DB, error) {
return s.dbOnce.Do(func() (*sql.DB, error) {
return sql.Open("sqlite", s.dbPath)
})
}
func (s *Server) DoSomething() error {
db, err := s.db()
...
}
I like that the type has a usable zero value, which means you don't need a constructor for Server just to set up this value (but we do need something to set the dbPath field, or whatever other state goes into the closure). However in exchange for that we need a wrapper function (the db method here), so we immediately return to equal in terms of boilerplate.
I like that baking the once-ness into the type declaration, instead of just using a plain closure, gives some indication on sight that it's a once-initialized value.
Here's a variation on what you suggest, which is arguably less boilerplatey, as we don't need to store the closure state anywhere. The NewOnceFunc function returns a *OnceFunc[T] with a Do() (T, error) method:
type Server struct {
db *sync.OnceFunc[*sql.DB]
}
func NewServer(dbPath string) *Server {
return &Server{
db: sync.NewOnceFunc(func() (*sql.DB, error) {
return sql.Open("sqlite", dbPath)
}),
}
}
func (s *Server) DoSomething() error {
db, err := s.db.Do()
...
}
But to immediately argue against this: an advantage of baking the state (dbPath, in this example) into the once function itself is that we don't expect that changing it later will have any effect. For instance, if we changed dbPath after the first call to the db function we might expect to access a different database. Putting that state in the closure makes it harder to make that mistake.
With all that said, my main objection to these proposals (compared to my original proposal) is that they make it harder to substitute a different initialization function that doesn't use OnceFunc. A central advantage of my original proposed API is that a sync.OnceFunc can wrap any func() (T, error) transparently, so that downstream callers don't know (and shouldn't care) that they're invoking it only once. In my experience this is a valuable property.
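The transparency argument can be sketched as follows. This is only an illustration under assumptions: onceFunc is a hand-written stand-in for the proposed sync.OnceFunc, and the Server type with its conn field, the Describe method, and the string "connections" are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// onceFunc is a hypothetical stand-in for the proposed sync.OnceFunc.
func onceFunc[T any](f func() (T, error)) func() (T, error) {
	var (
		once sync.Once
		val  T
		err  error
	)
	return func() (T, error) {
		once.Do(func() { val, err = f() })
		return val, err
	}
}

// Server only knows that it holds a func() (string, error); it cannot
// tell whether the function is cached.
type Server struct {
	conn func() (string, error)
}

// Describe uses the connection without caring how it was obtained.
func (s *Server) Describe() string {
	c, err := s.conn()
	if err != nil {
		return "error: " + err.Error()
	}
	return "using " + c
}

func main() {
	dials := 0
	lazy := &Server{conn: onceFunc(func() (string, error) {
		dials++
		return "real-conn", nil
	})}
	// A plain function with the same signature drops in unchanged.
	stub := &Server{conn: func() (string, error) { return "fake-conn", nil }}

	fmt.Println(lazy.Describe(), lazy.Describe(), dials)
	fmt.Println(stub.Describe())
}
```

Because both servers are built from the same field type, swapping the cached function for an uncached one requires no change to Describe or its callers.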
@cespare my instinct is that having fewer things is better than more things. If someone wants to call a OnceFunc and ignore the error, they could just ignore the error.
However in exchange for that we need a wrapper function (the db method here)
I do think that a db method is nicer than a function as a struct field. That strikes me as unusual-looking about your original example.
Also, the original example doesn't look like the common use cases I see for sync.Once. I most often see sync.Once used for package-level initialization. That's what most of the links you located in https://github.com/golang/go/issues/56102#issuecomment-1272643083 are as well.
So we have something like this today:
var state struct {
once sync.Once
val *Thing
err error
}
func State() (*Thing, error) {
state.once.Do(func() {
state.val, state.err = loadState()
})
return state.val, state.err
}
func loadState() (*Thing, error) { /* ... */ }
With OnceFunc, it could be
var loadStateOnce = sync.OnceFunc(loadState)
func State() (*Thing, error) {
return loadStateOnce()
}
func loadState() (*Thing, error) { /* ... */ }
and with TryOnce, it would be
var loadStateOnce sync.TryOnce[*Thing]
func State() (*Thing, error) {
return loadStateOnce.Do(loadState)
}
func loadState() (*Thing, error) { /* ... */ }
Neither has a real advantage in terms of conciseness. Adjusting either of these to get rid of the once-ness would be trivial.
One thing I notice about OnceFunc is that it might be tempting to save a few lines and write:
var State = sync.OnceFunc(loadState)
func loadState() (*Thing, error) { /* ... */ }
I think this would be a mistake, though. So both in your original example and here, OnceFunc seems to promote the use of function values in places where we would idiomatically use methods or normal functions in Go. OnceFunc seems like a function that would be very much at home in the functional languages I use, but not Go. (For instance it is available as memoize on a 0-ary function in Clojure.)
Actually the case in which I use it is globals; I chose the struct form as it's a slightly more complex context.
This is how you would use OnceFunc for your state example:
var state = sync.OnceFunc(func() (*Thing, error) {
/* ... */
})
This IMO is way more concise than any of the alternatives.
(Sorry sent this before I had finished writing it.)
I think that function variables are fine in the context of private globals. If you wanted to export it and protect it from mutation outside the package, you could write:
func State() (*Thing, error) { return loadState() }
var loadState = sync.OnceFunc(func() (*Thing, error) {
/* ... */
})
OnceFunc seems to promote the use of function values in places where we would idiomatically use methods or normal functions in Go.
I think this kind of argument is not very helpful in this context. We have new possibilities with generics; we should argue on the pros/cons, not by the established practices that pre-date the tools we have available today.
I think this kind of argument is not very helpful in this context. We have new possibilities with generics; we should argue on the pros/cons, not by the established practices that pre-date the tools we have available today.
Ah -- I'd been taking it as self-evident that promoting function variables over functions should be avoided. My mistake.
I think that replacing functions with variables is a poor practice on the merits:
function or var.)
Therefore, I think that APIs should not encourage using vars where functions or methods would have traditionally been used, and thus I think that a struct type is a better API here than a higher-order function.
I'd been taking it as self-evident that promoting function variables over functions should be avoided.
I think this is true, for most uses of global function variables, and most certainly exported ones. I'd go further to suggest that global variables as a whole are best avoided where possible.
But I don't think function variables should be avoided as a whole. For instance, look at this pitfall you describe:
Using vars instead of functions further encourages mutating the var for testing purposes, an ill-advised practice that interferes with test parallelization and makes debugging harder.
One good solution to this is to actually use a function variable, rather than mutating global state. For example, if you want to test some code that works with time, passing it a mock now func() time.Time (either as a function argument or by setting a struct field) allows you to control precisely what that function does in your tests.
So from here I'm assuming that function variables do have some value, and should be used where appropriate.
In the context of this proposal, where OnceFunc might be used to set a global function variable, it would be replacing the use of three global variables. From your example:
var state struct {
once sync.Once
val *Thing
err error
}
I think, on the whole, if you're writing programs with globals like this then you're already vulnerable to most of the pitfalls you describe. If anything, OnceFunc makes it harder to make a mess of it. (This is why I chose to use the Server struct example, btw.)
Other concerns you raised:
Invoking a function value through a var is slower than calling a function.
That may be true in isolation but we'd need to benchmark the different approaches described here to make any efficiency arguments.
The stack trace you get when calling a function through a var is less helpful (it doesn't include the var location).
These stack traces seem equally helpful to me: before and after
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
This is clearly a very common operation that we should make easier. The only discussion seems to be whether to include the error in the function signature. So maybe there should be two forms: one with the error and one without. Generalizing "with error" to two values may make sense. But if so, what are the names? OnceVal / OnceError and Once1 / Once2 both seem a bit strange. Maybe we should find a name that's not "Once"?
sync.Lazy[T], sync.Lazy2[T] - lazy is maybe overused, or maybe it should have the function ahead of time
sync.Memo[T], sync.Memo2[T] - memoizing is usually parameterized, and this isn't
sync.Cache[T], sync.Cache2[T] - but caches can be cleared, and this can't
So lots of ideas, none of them great.
What about a single type with 2 methods:
type Memo[T any] struct {
val T
err error
once sync.Once
}
func (m *Memo[T]) Do(f func() T) T {
m.once.Do(func() {
m.val = f()
})
return m.val
}
func (m *Memo[T]) DoErr(f func() (T, error)) (T, error) {
m.once.Do(func() {
m.val, m.err = f()
})
return m.val, m.err
}
This reminds me of #37739, but implemented with generics instead of as a language change and tailored specifically for use with concurrency. Maybe it makes sense to split the concurrency support out? In other words, have a lazy value interface somewhere non-concurrency specific and a sync.Lazy struct that wraps it to make it thread-safe?
package something
type Lazy[T any] interface {
Eval() T
}
package sync
type Lazy[T any] struct {
lazy something.Lazy[T]
}
To try to move things forward, what do people think of
type OnceValue[T any] struct { ... }
func (*OnceValue[T]) Do(func() T) T
type OnceValueErr[T any] struct { ... }
func (*OnceValueErr[T]) Do(func() (T, error)) (T, error)
?
It seems like the name should begin with Once so that people find it when they are looking for sync.Once.
I think that two types is overkill. People can easily ignore the error return if they don't need it.
edit: I like the two type approach the best.
I prefer icholy's suggestion of one type with two methods to two types.
Having one type with two methods leaves open the possibility to use both with one OnceValue. That seems like a source of bugs. I can imagine legitimate ways to use both methods together, but the (T, error) form subsumes them.
Just to sum it up, here are the variations:
One type, one method:
type OnceValue[T any] struct { ... }
func (*OnceValue[T]) Do(func() (T, error)) (T, error)
Two types, one method:
type OnceValue[T any] struct { ... }
func (*OnceValue[T]) Do(func() T) T
type OnceValueErr[T any] struct { ... }
func (*OnceValueErr[T]) Do(func() (T, error)) (T, error)
One type, two methods:
type OnceValue[T any] struct { ... }
func (*OnceValue[T]) Do(func() T) T
func (*OnceValue[T]) DoErr(func() (T, error)) (T, error)
The one type, two methods version seems fine to me. To prevent misuse, Do could panic if DoErr was previously called.
One type, two methods seems out-of-place, because you have to call one of the methods consistently to use it correctly. If there are goroutines racing (this is package sync) and one calls Do and the other calls DoErr, then in at least one case we have a problem: when DoErr wins, caches an error, and then Do runs and can't return the error. That suggests that any valid use should either always call Do or always call DoErr. To avoid latent bugs that only show up in production, the implementation should probably panic any time it observes both methods being used.
So really there are two kinds of OnceValue: the kind that can only use Do, and the kind that can only use DoErr. Giving them the same Go type means the compiler can't help you make sure you are using the type correctly. In contrast, what we usually do in Go is use different types for different kinds of values, and then the type system and the compiler do help you, and this possible runtime panic is eliminated at compile time.
I think that excludes "one type, two methods".
"One type, one method" seems not quite right, because sometimes we will be caching things that can't possibly fail, and it's annoying to have to discard the error that can't happen anyway. Yes, code that can fail is common, but so is code that can't fail.
That leaves "two types, one method", which is why I suggested OnceValue and OnceValueErr.
Adding to what @rsc says in the comment just above--which I agree with--we do sometimes want to standardize on a method signature with an error return that may always be nil in some situations. That is often the right choice when we expect or know it will be common to define an interface for that method and we expect some implementations to need to return errors even if not all implementations will need to. io.Writer is a good example. The Write method returns an error, but there are implementations in the standard library and elsewhere that we know always return nil errors (e.g. *bytes.Buffer).
Is there enough value in both types of OnceValue implementing the same interface to make it worth the cost of sometimes ignoring errors (and making sure that's the correct choice) or sometimes checking for errors that will never happen?
My intuition in this case is that, no, there isn't much value in implementing a common interface here.
I implemented Russ' OnceValueErr type and updated the original example to use it:
type Server struct {
dbPath string
dbOnce sync.OnceValueErr[*sql.DB]
}
func NewServer(dbPath string) *Server {
return &Server{
dbPath: dbPath,
}
}
func (s *Server) db() (*sql.DB, error) {
return s.dbOnce.Do(func() (*sql.DB, error) {
return sql.Open("sqlite", s.dbPath)
})
}
func (s *Server) DoSomething() error {
db, err := s.db()
if err != nil {
return err
}
_ = db // do something with db
return nil
}
Compared to the original proposal of a OnceFunc function that returns a function value:
type Server struct {
db func() (*sql.DB, error)
}
func NewServer(dbPath string) *Server {
return &Server{
db: sync.OnceFunc(func() (*sql.DB, error) {
return sql.Open("sqlite", dbPath)
}),
}
}
func (s *Server) DoSomething() error {
db, err := s.db()
if err != nil {
return err
}
_ = db // do something with db
return nil
}
I think that the proposed OnceValueErr type has several disadvantages over my proposed OnceFunc:
OnceValueErr requires the use of another method through which you call its Do method. With OnceFunc there's just a db function value, and that's it.
OnceValueErr requires you to specify the type T in both the OnceValueErr type spec and also in the function passed to Do. OnceFunc infers the type from the given closure.
OnceValueErr is harder than OnceFunc to mock out in tests. With OnceFunc you can just substitute a different function value. With OnceValueErr you'd need to further abstract away the invocation of the Do method.
We use the io.Reader interface so that we can easily compose code that implements or uses io.Reader, without those different pieces of code needing to know about one another. The use of this well-defined interface is a great simplifying force. With OnceFunc I am proposing that we use the most well-defined interface in Go: the function.
In short: the purpose of my proposed OnceFunc is ergonomics. I see little benefit to the proposed OnceValue/OnceValueErr over the existing sync.Once.
OnceValueErr requires the use of another method through which you call its Do method. With OnceFunc there's just a db function value, and that's it.
If you were writing this code without the "once" behavior, a db method would be the idiomatic approach.
type Server struct {
dbPath string
}
func NewServer(dbPath string) *Server {
return &Server{
dbPath: dbPath,
}
}
func (s *Server) db() (*sql.DB, error) {
// TODO: reuse connection
return sql.Open("sqlite", s.dbPath)
}
func (s *Server) DoSomething() error {
db, err := s.db()
if err != nil {
return err
}
_ = db // do something with db
return nil
}
I think we have established that we should probably support both T and (T, error) results, and that those should be different APIs, not a single one that appears to support both but only supports one at a time.
As @adg points out, his issue description was about a closure-based API, but the conversation shifted almost immediately (in the very first comment) to an object-based API. I didn't notice the shift when I started commenting, so we haven't discussed whether the API should be closure-based or object-based. My apologies for completely missing that change and not making sure we discussed that part of the proposal. Let's take a look at that dimension of the decision next.
I have a preliminary catalog of all the sync.Once uses in the main repo and will do more analysis and post the results in the morning.
There are 157 sync.Once declarations in the main repo, and there are a few different patterns that uses can be grouped into.
The simplest use of sync.Once is to cause a side effect at most once, no matter how many times the code runs.
For example, each side of an io.Pipe can be closed for reading or writing, but either way I/O is over, at that point. The implementation signals this to goroutines blocked in select by closing the p.done channel. Of course, a channel must only be closed once, so the code uses a sync.Once:
type pipe struct {
...
once sync.Once
done chan struct{}
...
}
func Pipe() (*PipeReader, *PipeWriter) {
p := &pipe{
wrCh: make(chan []byte),
rdCh: make(chan int),
done: make(chan struct{}),
}
return &PipeReader{p}, &PipeWriter{p}
}
func (p *pipe) closeRead(err error) error {
...
p.once.Do(func() { close(p.done) })
...
}
func (p *pipe) closeWrite(err error) error {
...
p.once.Do(func() { close(p.done) })
...
}
If there were a OnceFunc0 (which we haven't discussed, but let's just try it), this code would change to:
type pipe struct {
...
cancel func()
done chan struct{}
...
}
func Pipe() (*PipeReader, *PipeWriter) {
p := &pipe{
wrCh: make(chan []byte),
rdCh: make(chan int),
done: make(chan struct{}),
}
p.cancel = sync.OnceFunc0(func() { close(p.done) })
return &PipeReader{p}, &PipeWriter{p}
}
func (p *pipe) closeRead(err error) error {
...
p.cancel()
...
}
func (p *pipe) closeWrite(err error) error {
...
p.cancel()
...
}
Some of the cleanup here would have been possible in the original by defining:
func (p *pipe) cancel() {
p.once.Do(func() { close(p.done) })
}
instead of repeating that phrase in closeRead and closeWrite. So the fundamental difference between the cleaned-up original and the OnceFunc0 version is that the sync.Once itself and its cached data are not exposed: they are hidden inside the func.
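A minimal sketch of what OnceFunc0 might look like, assuming it is nothing more than a closure over a sync.Once. The lowercase name onceFunc0 and the demo in main are inventions for this example, not the io.Pipe code.

```go
package main

import (
	"fmt"
	"sync"
)

// onceFunc0 is a hypothetical sketch of OnceFunc0: the returned function
// runs f on its first call and does nothing afterwards.
func onceFunc0(f func()) func() {
	var once sync.Once
	return func() { once.Do(f) }
}

func main() {
	done := make(chan struct{})
	cancel := onceFunc0(func() { close(done) })

	cancel()
	cancel() // no-op; closing the channel a second time would panic

	// A receive on a closed channel returns immediately with ok == false.
	_, open := <-done
	fmt.Println(open)
}
```

The sync.Once lives only inside the closure, so no other code can call Do on it with a different function.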
We might wonder about what a hypothetical OnceValue0 would look like, but that's just sync.Once.
Another common pattern is declaring a sync.Once next to the data it protects and then requiring users of that data to call the init function before using the data.
For example, compress/flate needs a huffman decoding table that we want to compute at runtime, to keep binary sizes down, but we also don't want to compute it at init time, to keep startup latency down. The code looks like:
var fixedOnce sync.Once
var fixedHuffmanDecoder huffmanDecoder
func fixedHuffmanDecoderInit() {
fixedOnce.Do(func() {
// These come from the RFC section 3.2.6.
var bits [288]int
for i := 0; i < 144; i++ {
bits[i] = 8
}
for i := 144; i < 256; i++ {
bits[i] = 9
}
for i := 256; i < 280; i++ {
bits[i] = 7
}
for i := 280; i < 288; i++ {
bits[i] = 8
}
fixedHuffmanDecoder.init(bits[:])
})
}
func NewReader(r io.Reader) io.ReadCloser {
fixedHuffmanDecoderInit()
...
}
func (f *decompressor) nextBlock() {
...
if ... {
// compressed, fixed Huffman tables
f.hl = &fixedHuffmanDecoder
}
...
}
If there were a OnceFunc1, this code could have been written instead like:
var fixedHuffman = sync.OnceFunc1(newFixedHuffmanDecoder)
func newFixedHuffmanDecoder() *huffmanDecoder {
return ...
}
func NewReader(r io.Reader) io.ReadCloser {
// DELETED: fixedHuffmanDecoderInit()
...
}
func (f *decompressor) nextBlock() {
...
if ... {
// compressed, fixed Huffman tables
f.hl = fixedHuffman()
}
...
}
Again the fundamental difference in the OnceFunc1 version is that the sync.Once and its cached data are not exposed. The original separated the one-time initialization from the use, which might lead to bugs where the data is accessed without the initialization step. In contrast, the OnceFunc1 makes those kinds of bugs impossible.
Some of the cleanup forced by OnceFunc1 is possible in the original by declaring:
func fixedHuffman() *huffmanDecoder {
fixedHuffmanDecoderInit()
return &fixedHuffmanDecoder
}
and then making sure code does not use fixedHuffmanDecoderInit or fixedHuffmanDecoder otherwise.
If we used a OnceValue instead, we'd start with the original and replace
var fixedOnce sync.Once
var fixedHuffmanDecoder huffmanDecoder
with
var fixedHuffmanDecoder sync.OnceValue[*huffmanDecoder]
and
f.hl = &fixedHuffmanDecoder
with
f.hl = fixedHuffmanDecoder.Do(newFixedHuffmanDecoder)
or maybe we would introduce
func fixedHuffman() *huffmanDecoder {
return fixedHuffmanDecoder.Do(newFixedHuffmanDecoder)
}
and use
f.hl = fixedHuffman()
again.
Note that callers still need to know what to pass to Do, or we have to introduce a wrapper function that encapsulates that detail.
The most important observation seems to be that sync.Once permits separate initialization and use while OnceFunc would force callers not to do that, and in general this seems to clean up the code.
The same pattern also happens where the sync.Once and the cached data are fields in a struct.
The final common pattern is a variant of the previous one, where the code already has the wrappers that hide the sync.Once from calling code.
For example, internal/testenv has:
var (
gorootOnce sync.Once
gorootPath string
gorootErr error
)
func findGOROOT() (string, error) {
gorootOnce.Do(func() {
... set gorootPath, gorootErr
})
return gorootPath, gorootErr
}
All code calls findGOROOT. No code is expected to use the global variables: they are essentially private to findGOROOT.
Again the same pattern also happens where the sync.Once and the cached data are fields in a struct.
If we used OnceValueError, this would become:
var gorootOnce sync.OnceValueError[string]
func findGOROOT() (string, error) {
return gorootOnce.Do(func() (string, error) {
...
})
}
If we used OnceFunc2, this would become:
var findGOROOT = sync.OnceFunc2(func() (string, error) {
...
})
The code we started with was fairly clean. The only possible complaint is that gorootOnce, gorootPath, and gorootErr are exposed and could be misused.
The OnceValueError version hides gorootPath and gorootErr but leaves gorootOnce. The OnceFunc2 version hides gorootOnce too.
The OnceFunc2 version does have the downside that there is no name for the function in the stack trace if it crashes. It might be better for debuggability to adopt an idiom like:
var findGOROOT = sync.OnceFunc2(findGOROOTUncached)
func findGOROOTUncached() (string, error) {
...
}
But instead of forcing such changes on users, we can also adjust the func closure name heuristics to put findGOROOT into the name in the anonymous example.
In a struct with methods, the declaration would be a little different. For example, internal/lazyregexp has:
type Regexp struct {
str string
once sync.Once
rx *regexp.Regexp
}
func (r *Regexp) re() *regexp.Regexp {
r.once.Do(r.build)
return r.rx
}
func (r *Regexp) build() {
r.rx = regexp.MustCompile(r.str)
r.str = ""
}
func New(str string) *Regexp {
return &Regexp{str: str}
}
func (r *Regexp) MatchString(s string) bool {
return r.re().MatchString(s)
}
The OnceValue version would be:
type Regexp struct {
str string
rx sync.OnceValue[*regexp.Regexp]
}
func (r *Regexp) re() *regexp.Regexp {
return r.rx.Do(r.build)
}
func (r *Regexp) build() *regexp.Regexp {
s := r.str
r.str = ""
return regexp.MustCompile(s)
}
func New(str string) *Regexp {
return &Regexp{str: str}
}
func (r *Regexp) MatchString(s string) bool {
return r.re().MatchString(s)
}
And the OnceFunc version would be:
type Regexp struct {
re func() *regexp.Regexp
}
func New(str string) *Regexp {
return &Regexp{
re: sync.OnceFunc(func() *regexp.Regexp {
return regexp.MustCompile(str)
}),
}
}
func (r *Regexp) MatchString(s string) bool {
return r.re().MatchString(s)
}
There are a few unusual uses of sync.Once that are at least worth noting.
x/net/http2 has code like this:
type http2clientStream struct {
...
abortOnce sync.Once
abort chan struct{} // closed to signal stream should end immediately
abortErr error // set if abort is closed
...
}
func (cs *http2clientStream) abortStreamLocked(err error) {
cs.abortOnce.Do(func() {
cs.abortErr = err
close(cs.abort)
})
...
}
func (cc *http2ClientConn) RoundTrip(req *Request) (*Response, error) {
...
cs := &http2clientStream{
...
abort: make(chan struct{}),
...
}
...
}
This code is using the abortOnce for the side effect of closing cs.abort, like in the pipe example, but it is also saving the error that caused the close. I don't see an obvious way to change this code to use OnceFunc, because there's no way to pass the err into the func the first time it is called. The pipe code had the same problem but used a separate write-once abstraction to deal with the store of the error.
I suppose if we had a sync.WriteOnce[T] then the code could be written as:
type http2clientStream struct {
...
abortClose func()
abort <-chan struct{} // closed to signal stream should end immediately
abortErr sync.WriteOnce[error] // set if abort is closed
...
}
func (cs *http2clientStream) abortStreamLocked(err error) {
cs.abortErr.Store(err)
cs.abortClose()
...
}
func (cc *http2ClientConn) RoundTrip(req *Request) (*Response, error) {
...
abort := make(chan struct{})
cs := &http2clientStream{
...
abort: abort,
abortClose: sync.OnceFunc0(func() {close(abort)}),
...
}
...
}
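The comment does not define sync.WriteOnce[T], so the following is only one guess at what it could mean: a cell where the first Store wins and later Stores are ignored. The Load method, the mutex-based implementation, and the error values are all assumptions made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// WriteOnce is a hypothetical sketch: the first Store wins and later
// Stores are ignored. These semantics are an assumption, not a real sync API.
type WriteOnce[T any] struct {
	mu  sync.Mutex
	set bool
	val T
}

// Store records v only if nothing has been stored yet.
func (w *WriteOnce[T]) Store(v T) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if !w.set {
		w.val, w.set = v, true
	}
}

// Load returns the stored value and whether a value was ever stored.
func (w *WriteOnce[T]) Load() (T, bool) {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.val, w.set
}

func main() {
	var abortErr WriteOnce[error]
	abortErr.Store(errors.New("stream reset"))
	abortErr.Store(errors.New("timeout")) // ignored: a value is already stored

	first, ok := abortErr.Load()
	fmt.Println(first, ok)
}
```

Under these assumed semantics, the error recorded is always the one from the first abort, matching what the original abortOnce.Do closure achieves.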
As another example, cmd/go/internal/script has:
func Program(name string, cancel func(*exec.Cmd) error, waitDelay time.Duration) Cmd {
var (
...
lookPathOnce sync.Once
path string
pathErr error
)
if filepath.IsAbs(name) {
lookPathOnce.Do(func() { path = filepath.Clean(name) })
...
}
return Command(
...,
func(s *State, args ...string) (WaitFunc, error) {
lookPathOnce.Do(func() {
path, pathErr = exec.LookPath(name)
})
if pathErr != nil {
return nil, pathErr
}
return startCommand(..., path, ...)
})
}
This code is calling lookPathOnce.Do in two different places, with two different functions, depending on the form of the name passed to Program. The conditional call that happens when filepath.IsAbs(name) is true disables the "normal" call below.
This is a bit hard to reason about, and perhaps it would be better to write the code in a more conventional way, like this:
func Program(name string, cancel func(*exec.Cmd) error, waitDelay time.Duration) Cmd {
var (
...
lookPathOnce sync.Once
path string
pathErr error
)
if filepath.IsAbs(name) {
path = filepath.Clean(name)
...
}
return Command(
...,
func(s *State, args ...string) (WaitFunc, error) {
lookPathOnce.Do(func() {
if path == "" {
path, pathErr = exec.LookPath(name)
}
})
if pathErr != nil {
return nil, pathErr
}
return startCommand(..., path, ...)
})
}
With OnceFunc2, the path and pathErr variables would be hidden, but the code could use different functions in the different cases:
func Program(name string, cancel func(*exec.Cmd) error, waitDelay time.Duration) Cmd {
var lookPath func() (string, error)
if filepath.IsAbs(name) {
path := filepath.Clean(name)
lookPath = func() (string, error) { return path, nil }
} else {
lookPath = sync.OnceFunc2(func() (string, error) { return exec.LookPath(name) })
}
return Command(
...,
func(s *State, args ...string) (WaitFunc, error) {
path, err := lookPath()
if err != nil {
return nil, err
}
return startCommand(..., path, ...)
})
}
I'll post my thoughts about all these in the next comment. This comment is scoped to just presenting the data I gathered.
When I reread @adg's top comment and started thinking about the closure-based API, I was fairly skeptical. I'm a bit uncomfortable with the type of this functionality being a plain func value instead of a thing with a name. And storing what amount to methods as plain struct fields feels very JavaScripty. So I really wasn't expecting much.
Going through all the uses of sync.Once in the main repo, I was struck by how complex many of them are to reason about. The clearest code is the well-encapsulated uses (pattern 3), but not everyone knows to write the code that way. I think it even took us many years to develop that pattern. The encapsulated uses are mostly in newer code.
We should definitely do something here. 84% of the sync.Once uses are computing lazy values and would be better expressed with something more tailored. In the typical patterns, it seems to me that OnceFunc helps more than OnceValue does, and I think it makes sense to call it Lazy instead of OnceFunc, which I'll discuss more below. (I mention it now because the examples coming up are going to use Lazy.)
As noted in the previous comment, OnceValue hides the values but not the sync.Once. The best practice is still to wrap any use in a separate function or method that code calls to obtain the values. You have to discover that best practice, rather than writing v := x.v.Do(computeV) at each use.
Consider again this example from pattern 3:
var (
gorootOnce sync.Once
gorootPath string
gorootErr error
)
func findGOROOT() (string, error) {
gorootOnce.Do(func() {
... set gorootPath, gorootErr
})
return gorootPath, gorootErr
}
If OnceValue is always used with the function wrapper pattern, then the uses end up essentially the same as Lazy's uses, except you have to write out the function wrapper each time, like this example from pattern 3:
var gorootOnce sync.OnceValueError[string]
func findGOROOT() (string, error) {
return gorootOnce.Do(func() (string, error) {
...
})
}
Lazy ends up codifying the pattern in a way that you can't avoid, ensuring clean uses without requiring everyone to learn and write the boilerplate:
var findGOROOT = sync.Lazy2(func() (string, error) {
...
})
If we're going to try to make uses of sync.Once shorter and less error-prone, it seems like OnceValue is only half a fix, while Lazy is the whole fix. So I'm inclined toward the function version.
OnceValue is a partial fix in a second way too: it only covers the 84% of sync.Once uses that compute a value. It doesn't cover the remaining 16% that don't compute a value, because OnceValue0 is just sync.Once. But those are still improved by using Lazy instead. For example compare:
type pipe struct {
...
once sync.Once
done chan struct{}
}
func (p *pipe) cancel() {
p.once.Do(func() { close(p.done) })
}
func Pipe() (*PipeReader, *PipeWriter) {
p := &pipe{
wrCh: make(chan []byte),
rdCh: make(chan int),
done: make(chan struct{}),
}
return &PipeReader{p}, &PipeWriter{p}
}
with:
type pipe struct {
...
cancel func()
done <-chan struct{}
}
func Pipe() (*PipeReader, *PipeWriter) {
done := make(chan struct{})
p := &pipe{
wrCh: make(chan []byte),
rdCh: make(chan int),
cancel: sync.Lazy(func() { close(done) }),
done: done,
}
return &PipeReader{p}, &PipeWriter{p}
}
The Lazy version is shorter and lets the struct field done change to be a <-chan, to prevent misuse. It seems strictly better than the version with sync.Once. So Lazy would let us clean up a larger fraction of sync.Once instances than OnceValue would.
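A runnable sketch of the pipe comparison, assuming a local lazy helper in place of the proposed sync.Lazy. The pipe struct here is trimmed to the two relevant fields, and newPipe and isDone are hypothetical names added only for the demonstration.

```go
package main

import (
	"fmt"
	"sync"
)

// lazy is a hypothetical stand-in for the proposed sync.Lazy.
func lazy(f func()) func() {
	var once sync.Once
	return func() { once.Do(f) }
}

// pipe is a trimmed-down version of the struct in the example above.
type pipe struct {
	cancel func()
	done   <-chan struct{} // receive-only: holders of a pipe cannot close it
}

func newPipe() *pipe {
	done := make(chan struct{})
	return &pipe{
		cancel: lazy(func() { close(done) }),
		done:   done,
	}
}

// isDone reports whether p.done has been closed, without blocking.
func isDone(p *pipe) bool {
	select {
	case <-p.done:
		return true
	default:
		return false
	}
}

func main() {
	p := newPipe()
	fmt.Println(isDone(p)) // false
	p.cancel()
	p.cancel() // a second close would panic without the once guard
	fmt.Println(isDone(p)) // true
}
```

Only the closure holds the bidirectional channel, so the only way to close it is through cancel.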
Back in https://github.com/golang/go/issues/56102#issuecomment-1285943596 we said that Lazy wasn't a good name because it should be used for something that has already captured the code that runs. Indeed, Lazy would be a bad name for OnceValue, but it seems like a good name for the closure-based version:
func Lazy(f func()) func()
func Lazy1[T any](f func() T) func() T
func Lazy2[T1, T2 any](f func() (T1, T2)) func() (T1, T2)
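As a rough sketch, the three signatures could be implemented as thin wrappers around sync.Once. These are local functions written for illustration, not code from package sync, and they ignore panics in f. The callConcurrently helper is hypothetical and only there to show the once-only behavior under concurrent callers.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Lazy, Lazy1 and Lazy2 are sketches of the signatures above, written as
// local functions; they are not part of package sync.
func Lazy(f func()) func() {
	var once sync.Once
	return func() { once.Do(f) }
}

func Lazy1[T any](f func() T) func() T {
	var (
		once sync.Once
		v    T
	)
	return func() T {
		once.Do(func() { v = f() })
		return v
	}
}

func Lazy2[T1, T2 any](f func() (T1, T2)) func() (T1, T2) {
	var (
		once sync.Once
		v1   T1
		v2   T2
	)
	return func() (T1, T2) {
		once.Do(func() { v1, v2 = f() })
		return v1, v2
	}
}

// callConcurrently invokes a Lazy1-wrapped function from n goroutines and
// reports how many times the underlying function actually ran.
func callConcurrently(n int) int32 {
	var runs int32
	get := Lazy1(func() int {
		atomic.AddInt32(&runs, 1)
		return 7
	})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = get()
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&runs)
}

func main() {
	fmt.Println(callConcurrently(100)) // 1
}
```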
Then we'd have code like:
p.cancel = sync.Lazy(func() { close(p.done) })
var fixedHuffman = sync.Lazy1(newFixedHuffmanDecoder)
var findGOROOT = sync.Lazy2(func() (string, error) {
...
})
return &Regexp{
re: sync.Lazy1(func() *regexp.Regexp {
return regexp.MustCompile(str)
}),
}
if filepath.IsAbs(name) {
path := filepath.Clean(name)
lookPath = func() (string, error) { return path, nil }
} else {
lookPath = sync.Lazy2(func() (string, error) { return exec.LookPath(name) })
}
These look clear to me, and also far less bug-prone than what we're doing today.
So I'm in favor of taking the func path and using the name sync.Lazy. I think we can stop at 2 results (no Lazy3).
One final note: a few people have mentioned that objects with methods are more "idiomatic" in Go than functions, but we do from time to time learn better ways to do things. For example
if i := strings.Index(s, ":"); i >= 0 {
k, v := s[:i], s[i+1:]
...
}
used to be idiomatic in Go, but now
if k, v, ok := strings.Cut(s, ":"); ok {
...
}
is instead. One could have argued against strings.Cut by saying that explicit indexing is idiomatic. Idioms evolve.
I find this analysis compelling, it seems like the func version is better. The name Lazy throws me a bit, as I usually expect Lazy
I notice that the issue of needing two variants of generic things, one for T and one for (T, error), keeps coming up in numerous places. Maybe it's time to reevaluate tuples again? If not, maybe it would make sense to put a type Result[T any] struct { Val T; Err error } type into the errors package just so that it can be standardized to be used by generic data structures and implementations?
I can write this up as a separate proposal if you want, but I wanted to mention it here because I feel like jumping straight to Lazy + Lazy2 might be a bit quick given that that problem seems, to me at least, to need solving more generally.
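To make the Result idea concrete, here is a small sketch of how a single-value helper could carry an error if such a struct existed. Result is declared locally as written in the comment above; lazy1, errMissing, and setting are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Result is the struct suggested above, declared locally for the sketch.
type Result[T any] struct {
	Val T
	Err error
}

// lazy1 is a hypothetical single-value helper in the style of Lazy1.
func lazy1[T any](f func() T) func() T {
	var (
		once sync.Once
		v    T
	)
	return func() T {
		once.Do(func() { v = f() })
		return v
	}
}

// errMissing is a made-up error for the example.
var errMissing = errors.New("setting not found")

// calls counts how often the initializer actually runs.
var calls int

// setting packs the value and the error into one Result, so no
// two-result variant of the helper is needed.
var setting = lazy1(func() Result[string] {
	calls++
	return Result[string]{Err: errMissing}
})

func main() {
	r := setting()
	fmt.Println(r.Val == "", r.Err)
}
```

The cost shows up at the call site: callers unpack r.Val and r.Err by hand instead of using the usual v, err := f() form.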
At this point I don't think it will help to file a proposal about tuples. It's too soon to be redesigning generics. We need to use them for longer first.
The OnceValueError version does have the downside that there is no name for the function in the stack trace if it crashes.
I believe that should be "OnceFunc2", not "OnceValueError".
cancel: sync.Lazy(func() { close(done) }),
I'm sure I could get used to this, but at first glance it's kind of weird. Using sync.Once is clear: the function is run once. Here we have a function that should be run once, but we're calling it a lazy function. To me a lazy function is something that computes a value when that value is needed. But close doesn't compute a value at all. It's strange to call it lazily.
Instead of sync.Lazy / sync.Lazy1 / sync.Lazy2, the names could be sync.OnceFunc / sync.OnceValue / sync.OncePair. It also helps discoverability as you type sync.Once in your IDE and the Func/Value/Pair autocompletions pop up.
@rsc thanks for the analysis. Personally I don't find the examples compelling -- the OnceFunc versions are certainly clever, but they do not seem clearer or less prone to misuse to me. Yes, some internal state is hidden inside a closure, but that function value itself is now a var or struct field whereas previously it would have been an immutable function or method.
It might be better for debuggability to adopt an idiom like:
var findGOROOT = sync.OnceFunc2(findGOROOTUncached)
func findGOROOTUncached() (string, error) { ... }
Well, if we did that, wouldn't it negate most of the claims about misuse resistance and boilerplate avoidance that OnceFunc has in the first place? (Because now you can call the wrong function -- the one you are supposed to call is not the func, but the var! -- and you are back to declaring one var and one func per usage.)
Also, whenever the OnceFunc argument is not a function literal, as in the above case or as in
var fixedHuffman = sync.OnceFunc1(newFixedHuffmanDecoder)
the type at play isn't mentioned at the call site. This is a nice property of
var fixedHuffmanDecoder sync.OnceValue[*huffmanDecoder]
Well, if we did that, ...
As I noted later, we can also fix the compiler's naming heuristic.
the type at play isn't mentioned at the call site.
I am assuming that
var fixedHuffman = sync.OnceFunc1(newFixedHuffmanDecoder)
would appear next to the definition of func newFixedHuffmanDecoder.
But if you really wanted to see the type on that line, you could write
var fixedHuffman = sync.OnceFunc1[*huffmanDecoder](newFixedHuffmanDecoder)
Thanks for the thorough analysis @rsc! That is very helpful.
And I appreciate your observation, which I hadn't considered:
Lazy ends up codifying the pattern in a way that you can't avoid, ensuring clean uses without requiring everyone to learn and write the boilerplate:
I think it's easy for us experienced Go programmers to assume people know the patterns. Making the pattern "use this function" greatly simplifies things.
I also appreciate the suggestion of the OnceFunc that returns a closure that doesn't return any values. That's a nice touch.
WRT naming I share the concerns raised by @hherman1 and @ianlancetaylor. I think the name Once is more precise and less overloaded than Lazy. I propose a variation on @carlmjohnson's suggestion:
func OnceFunc(func()) func()
func OnceValue[T any](func() T) func() T
func OnceValues[T1, T2 any](func() (T1, T2)) func() (T1, T2)
Why are we making the pair version generic over both return values? It begs the question why stop at 2, and not have a three return variant. Whereas I think OnceErr makes it more clear why we stopped at two.
@hherman1 I can think of two reasons to use T2 instead of error:
(T, error) are among the most common, the other very common return value pairing is (T, bool). Supporting an arbitrary type makes this possible.
I forgot about the bool variant, ok I’m convinced. But I wish there were a better name than OnceValues 🤔 Values plural sounds like a slice not a pair… but I can’t think of any.
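The (T, bool) pairing is easy to picture with a map lookup. Below is a sketch under the assumption of a local onceValues helper that is generic over both results; defaults, lookups, and newSetting are made-up names for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// onceValues is a hypothetical local version of the two-result helper,
// generic over both results so the second need not be an error.
func onceValues[T1, T2 any](f func() (T1, T2)) func() (T1, T2) {
	var (
		once sync.Once
		v1   T1
		v2   T2
	)
	return func() (T1, T2) {
		once.Do(func() { v1, v2 = f() })
		return v1, v2
	}
}

// defaults is made-up data for the example.
var defaults = map[string]string{"shell": "/bin/sh"}

// lookups counts how often the map is actually consulted.
var lookups int

// newSetting returns a getter that consults defaults at most once for key.
func newSetting(key string) func() (string, bool) {
	return onceValues(func() (string, bool) {
		lookups++
		v, ok := defaults[key]
		return v, ok
	})
}

func main() {
	shell := newSetting("shell")
	v, ok := shell()
	fmt.Println(v, ok) // /bin/sh true

	editor := newSetting("editor")
	v, ok = editor()
	fmt.Printf("%q %v\n", v, ok) // "" false
}
```

A helper fixed to (T, error) could not wrap this lookup without converting the bool into an error first.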
WRT naming I share the concerns raised by @hherman1 and @ianlancetaylor. I think the name Once is more precise and less overloaded than Lazy. I propose a variation on @carlmjohnson's suggestion:
func OnceFunc(func()) func()
func OnceValue[T any](func() T) func() T
func OnceValues[T1, T2 any](func() (T1, T2)) func() (T1, T2)
Choosing OnceValues for the third choice doesn't leave much room for other plural forms and is also pretty subtle. We might want to pick a name for the two value function that has a more natural progression to higher counts if we need it. Even if we don't expect to add it to the standard library, some people may need it locally and leaving them a choice of names that fit well with the standard library names might be nice.
I kind of like OnceValues precisely because it closes the door. But OnceFunc, OnceValue, OnceValue2 sound fine too.
I doubt it's possible to find names that everyone will like. I'm personally partial to OnceFunc, Lazy, and Lazy2, with comments referencing the others from each's documentation, but I'd much prefer finding a way to not need Lazy2 at all.
func OnceFunc[T any](fn func() (T, error)) func() (T, error)
I just want it.
After much discussion, it sounds like people are generally happy with:
func OnceFunc(f func()) func()
func OnceValue[T any](f func() T) func() T
func OnceValue2[T1, T2 any](f func() (T1, T2)) func() (T1, T2)
Do I have that right?
I am fine with that. I think it's worth letting the bikeshed debate go on slightly longer than usual because what we do in the standard library will set a standard that packages outside the standard library will follow also… But the -2 convention is probably as good as any.
The ratio of API surface area to value provided seems off. Perhaps I'm underestimating how often this will get used.
This is a proposal for adding a generic OnceFunc function to the sync package in the standard library.
In my team's codebase we recently added this function to our private syncutil package:
(I put this in the (temporary) module github.com/adg/sync, if you want to try it out.)
This makes a common use of sync.Once, lazy initialization with error handling, more ergonomic.
For example, this Server struct that wants to lazily initialize its database connection may use sync.Once:
While with OnceFunc a lot of the fuss goes away:
Playground links: before and after.
If there is interest in this, then I suppose it should first live in x/exp (as with the slices and maps packages) so that we can play with it.
This seems to me like a great example of how generics can be used in the standard library. I wasn't able to find an overall tracking bug for putting generics in the standard library, otherwise I'd have referenced it here.