golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.38k stars 17.59k forks source link

sync: add OnceFunc, OnceValue, OnceValues #56102

Closed adg closed 1 year ago

adg commented 2 years ago

This is a proposal for adding a generic OnceFunc function to the sync package in the standard library.

In my team's codebase we recently added this function to our private syncutil package:

// OnceFunc returns a function that invokes fn only once and returns the values
// returned by fn. The returned function may be called concurrently.
func OnceFunc[T any](fn func() (T, error)) func() (T, error)

(I put this in the (temporary) module github.com/adg/sync, if you want to try it out.)

This makes a common use of sync.Once, lazy initialization with error handling, more ergonomic.

For example, this Server struct that wants to lazily initialize its database connection may use sync.Once:

type Server struct {
    dbPath string

    dbInit sync.Once
    dbVal  *sql.DB
    dbErr  error
}

func NewServer(dbPath string) *Server {
    return &Server{
        dbPath: dbPath,
    }
}

func (s *Server) db() (*sql.DB, error) {
    s.dbInit.Do(func() {
        s.dbVal, s.dbErr = sql.Open("sqlite", s.dbPath)
    })
    return s.dbVal, s.dbErr
}

func (s *Server) DoSomething() error {
    db, err := s.db()
    if err != nil {
        return err
    }
    _ = db // do something with db
    return nil
}

While with OnceFunc a lot of the fuss goes away:

type Server struct {
    db func() (*sql.DB, error)
}

func NewServer(dbPath string) *Server {
    return &Server{
        db: sync.OnceFunc(func() (*sql.DB, error) {
            return sql.Open("sqlite", dbPath)
        }),
    }
}

func (s *Server) DoSomething() error {
    db, err := s.db()
    if err != nil {
        return err
    }
    _ = db // do something with db
    return nil
}

Playground links: before and after.

If there is interest in this, then I suppose it should first live in x/exp (as with the slices and maps packages) so that we can play with it.

This seems to me like a great example of how generics can be used in the standard library. I wasn't able to find an overall tracking bug for putting generics in the standard library, otherwise I'd have referenced it here.

DmitriyMV commented 1 year ago

For what its worth, I don't think that sync.OnceValueError struct... and sync.OnceFunc func... should be competing. The problem, for me, with OnceFunc is that you can't initialize it lazily. That means every complex type which have it is forced to have some sort of constructor. This is not a problem by itself, but goes a bit strange with the fact, that every type of the sync package except Locker support and encourage zero value.

It definitely feels weird that you will have both sync.Once which do not need any sort of initialization, and sync.OnceFunc which does.

DmitriyMV commented 1 year ago

Also I don't think that we should encourage global function variables. Those do not look well in docs currently.

The main benefit of the sync.Once is that it explicitly disallows such usages. Sure you can do

var Function = func() func() (int, error) {
    var (
        once   sync.Once
        result int
        err    error
    )

    return func() (int, error) {
        once.Do(func() {
            result, err = doStuff()
        })

        return result, err
    }
}()

But that just looks wrong.

adg commented 1 year ago

@DmitriyMV Just as you wouldn't do this with sync.Once I don't think you would use the proposed new thing to create an exported function variable. That's not what this is for. Yes, it could be abused, but so can many things.

adg commented 1 year ago

@rsc

After much discussion, it sounds like people are generally happy with..

I'm not too fond of the 2 suffix on aesthetic grounds; generally if there's a word for a plural I'll chose it over a number suffix. But I'll go with the whatever the general conesnus is here.

komuw commented 1 year ago

There's too much bike shedding going on, which to me is a sign that we should probably put this in x/exp/ for a release cycle or two.

seebs commented 1 year ago

I sort of feel like the right solution to "should we have an error return or not" might be "it would be nice if it were agnostic about the number of returns, not just the type of one specific return". So, ignoring that there's no such syntax right now, something like

func OnceFunc[T ...any](fn func() T) func() T

where T is not restricted to being one type, but could be any sequence of types, so you could call this with a func() (int, error) and get back a func() (int, error), etcetera.

That is almost certainly way too large a change, but I do think that it provides a much more ergonomic interface than OnceFunc, OnceValue, OnceValues, etcetera, and it lets us leave the question of whether we want an error return to individual call sites.

zephyrtronium commented 1 year ago

@seebs That is #56462.

ianlancetaylor commented 1 year ago

@komuw If there is bikeshedding about the functionality, then x/exp is appropriate. But if we are just discussing the name, we don't need to go through x/exp. People will adapt to whatever name is chosen. We should of course try to choose the best name, but getting a better name is not something that x/exp will help with.

rsc commented 1 year ago

Given that there's no clear way to add variadics to generics to help here, it seems like we should move forward with the non-variadic forms. @adg said he prefers OnceValues for the 2-element form, and we can always add OnceValues3 if it comes to that. So it sounds like we have converged on:

func OnceFunc(f func()) func()
func OnceValue[T any](f func() T) func() T
func OnceValues[T1, T2 any](f func() (T1, T2)) func() (T1, T2)
rsc commented 1 year ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

icholy commented 1 year ago

I think @DmitriyMV made a very valid point about these apis being unusable for types which have useful zero values.

ianlancetaylor commented 1 year ago

The point is true, but the types also seem clearly useful in practice. Is there an alternative that permits a valid zero value? I don't think that that lack in itself is sufficient to reject the proposal, given that it has clear valid uses.

cespare commented 1 year ago

@ianlancetaylor yes: the non-func-based alternatives which have been the main point of contention of this whole discussion.

ianlancetaylor commented 1 year ago

I think there have been solid arguments against those alternatives, showing that they are less usable in practice, and add less to the existing sync.Once.

earthboundkid commented 1 year ago

I think @DmitriyMV made a very valid point about these apis being unusable for types which have useful zero values.

In those cases, sync.Once will still exist and be just as useable as before.

bcmills commented 1 year ago

It seems like kind of a shame to define three different identifiers for what is essentially just one function.

Type sets with union elements seem like a cleaner API to me: https://go.dev/play/p/NJmK7wdeFUN

Unfortunately, that approach is currently a bit awkward because of some limitation in our type inference algorithm. If I try to use the union form without an explicit instantiation (https://go.dev/play/p/Bhnjc288shH), I get a compile error with cannot infer T.

ianlancetaylor commented 1 year ago

Yes, there is no current type inference rule that would permit the compiler to infer the type argument for T in that example. I don't know that I would describe fixing that as a minor improvement; I don't see how to do it without considerable power to type inference, which so far we're trying to avoid.

bcmills commented 1 year ago

I don't see how to do it without considerable power to type inference,

In this case I don't see how to do it either, because I've never really understood the “core type” mechanism in the current edition of the spec.

However, with a conventional approach to type inference it would be a fairly trivial part of the “constraint type inference” phase: for each union element in the constraint type, for each type element within the union, attempt to unify the argument type with the type element. If all elements of a union that succeed result in a common set of type-parameter substitutions, then the union results in those substitutions.

(In this particular case, “all elements of a union that succeed” is guaranteed to be at most one element, since the element types are mutually exclusive.)

ianlancetaylor commented 1 year ago

A "core type" is basically just an interface with a single (non-interface) embedded type. That type is the core type.

I don't see how we can assume that only one element from a union will match. The union could be map[int]T | map[T]string in which case map[int]string will match twice. But in any case this is not the issue this discuss possible approaches.

bcmills commented 1 year ago

in any case this is not the issue this discuss possible approaches.

That's fair, but the part that is relevant to this issue is: I think it is premature to define multiple OnceFunc variations before we have adequately evaluated the solutions that rely on stronger type inference, and I don't think we have done so yet.

adg commented 1 year ago

@bcmills that's certainly an interesting proposal. I like that there's only one exported function. Even having to specify the type twice, it's not so awkward. But having deeper type inference would make it more ergonomic.

However, it does have downsides:

So I'm not convinced that, even with stronger type inference, the Onceable approach is objectively better.

hherman1 commented 1 year ago

I have reservations about the onceable solution, it seems a bit harsh on the eyes. But I don’t think this point:

the two-value form requires either an error or bool return value, whereas the proposed OnceValues can be any arbitrary return type.

is very compelling if we don’t have specific examples of other widespread use cases here.

bcmills commented 1 year ago

@adg

the two-value form requires either an error or bool return value, whereas the proposed OnceValues can be any arbitrary return type.

That is true, although the same can be said for needing to return any arbitrary number of results. If func() (T1, T2) is important for any arbitrary T2, why is func() (T1, T2, T3) not important?

Two observations:

  1. The vast majority of sync.Once uses initialize one value, and if there are more values involved those can be packed into one struct. So one type parameter is clearly needed for an ergonomic API, but it isn't obvious that having more than one really makes that much of a difference.
  2. The (T, bool) and (T, error) signatures are very well represented in the standard library, so including those provides a clear ergonomic benefit.

it's still two exported identifiers (the OnceFunc function and Onceable interface type), and this only replaces the proposed OnceValue and OnceValues, so we're at a net zero added/removed exported names.

The Onceable interface type is not needed: the constraint can be expressed inline. (https://go.dev/play/p/eyhjkB0MC_d) So that's a net reduction of one exported name, at least.

Moreover, note that the type T doesn't actually participate in the signature of OnceFunc beyond giving structure to the type parameter F. If we further improved type constraints to allow existential types (a well-understood feature in type theory), then we could actually unify OnceFunc and OnceValue, and additionally scale OnceFunc out to arbitrarily many return values.

bcmills commented 1 year ago

Actually, come to think of it, maybe that's the better solution here today too: give up a little bit of compile-time type safety for a huge ergonomic improvement, and use reflect under the hood to handle arbitrary function return types (https://go.dev/play/p/PxdlRBkKreO)!

// OnceFunc returns a function that executes f only the first time it is called,
// returning the results of the first call for each subsequent call.
//
// F must be a function type that accepts no arguments.
func OnceFunc[F any](f F) F {
    …
}
bcmills commented 1 year ago

I guess the obvious problem with using reflect under the hood is that sync cannot import reflect. 😅

(But that's arguably an implementation detail, independent of the function signature!)

adg commented 1 year ago

@bcmills I quite like the func OnceFunc[F any](f F) F function signature. Very cool and neat. It is a shame to lose the compile-time argument check, but that's not such a big deal to me. However I implemented it and unfortunately the reflection-based approach is orders of magnitude slower:

$ go test -bench=.
goos: darwin
goarch: arm64
pkg: github.com/adg/sync
BenchmarkOnceFunc/Func-10       541174450            2.036 ns/op
BenchmarkOnceFunc/Value-10      584306736            2.047 ns/op
BenchmarkOnceFunc/Values-10     318922713            3.729 ns/op
BenchmarkOnceFuncReflect/Func-10            12632359            94.37 ns/op
BenchmarkOnceFuncReflect/Value-10           11179082           106.3 ns/op
BenchmarkOnceFuncReflect/Values-10           9399732           127.5 ns/op

Maybe this isn't a dealbreaker? Not sure. With the reflection-based approach we'd at least need to call it out in the docs and suggest that people who want to call these in tight loops use the original sync.Once instead.

adg commented 1 year ago

That is true, although the same can be said for needing to return any arbitrary number of results. If func() (T1, T2) is important for any arbitrary T2, why is func() (T1, T2, T3) not important?

In both cases it is a line-drawing exercise. In once case we draw the line at 2 return values. In the other case we draw the line at restricting the second return value type to error/bool. It maybe that error/bool is all we need, but we must decide that, so that's why I listed it there.

The Onceable interface type is not needed: the constraint can be expressed inline.

This is true. Rather than counting exported names, the spirit of my comment is more this: when we compare these API surfaces:

func OnceValue[T any](f func() T) func() T
func OnceValues[T1, T2 any](f func() (T1, T2)) func() (T1, T2)

vs

type Onceable[T any] interface {
    func() T | func() (T, bool) | func() (T, error)
}

func OnceFunc[T any, F Onceable[T]](f F) F {

or

func OnceFunc[T any, F func() T | func() (T, bool) | func() (T, error)](f F) F

with all the rest being equal, which is the clearer API? Doesn't seem like a clear win to me either way.

CAFxX commented 1 year ago

Since we're discussing the addition of new APIs, any chance to consider solving the issue of passed functions being kept alive after the once result has been resolved (so I guess this would be an argument in favor of the closure approach).

rsc commented 1 year ago

It doesn't seem like it makes sense to hold up this API for potential future generics changes. It's also worth pointing out that the generic version doesn't handle OnceFunc (which takes a func() with no return value), and it can't be easily added because there would be a dangling unused T.

If we figure out how to make a completely generic OnceFunc we can do that as part of sync/v2 along with a typed generic sync.Map. :-)

rsc commented 1 year ago

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

cespare commented 1 year ago

Just to be clear, what has been accepted is the three additions (OnceFunc, OnceValue, OnceValues) mentioned here, correct? Not the other ideas mentioned after that in this thread?

andig commented 1 year ago

Would the proposed OnceFunc syntax allow a function that returns no value at all? That would allow to skip the separate declaration of a once var in cases where you're only interested in doing things once but not in the result.

cespare commented 1 year ago

@andig sounds like you're referring to one of @bcmills's ideas, whereas the actual proposal that has been accepted (as best I can tell) is this one:

func OnceFunc(f func()) func()
func OnceValue[T any](f func() T) func() T
func OnceValues[T1, T2 any](f func() (T1, T2)) func() (T1, T2)

(But it would be helpful if @rsc could clarify.)

ianlancetaylor commented 1 year ago

@cespare Yes: the three non-generic functions were accepted.

changkun commented 1 year ago
func OnceValue[T any](f func() T) func() T
func OnceValues[T1, T2 any](f func() (T1, T2)) func() (T1, T2)

It is a bit surprising that these two versions of function are accepted. Are we going to get this in?

ianlancetaylor commented 1 year ago

@changkun When you say that it is surprising, do you disagree with the discussion above? Which part is surprising? Thanks.

changkun commented 1 year ago

Indeed. The "surprising" is a nicer wording to disagree with this design decision. It is unclear why this proposal is motivated enough to implement in "sync" instead of "golang.org/x/sync" because 1) it is relatively simple to implement; 2) the API design is not optimal (one purpose three functions), OnceValue and OnceValues are partially redundant with OnceFunc; I would favor this design https://github.com/golang/go/issues/56102#issuecomment-1311857716 (also not perfect, ignoring performance discussion) compare to the accepted one as the original motivation of this proposal was to add a single function to the package. The reason we derived three in the discussion was mainly caused by language constraints. 3) everything already in "golang.org/x/sync" should also deserve a chance to get in if this proposal is in.

ianlancetaylor commented 1 year ago
  1. Yes, but as @rsc noted in https://github.com/golang/go/issues/56102#issuecomment-1293818668 there are real cases where this would be easier to use and understand than the current sync.Once approach.
  2. The generic version that @bcmills suggested is good but it doesn't work usefully today, there is no proposal or plan or even thought about how it might work, and it doesn't and never will support the case of a function that doesn't return any value.
  3. I encourage you to write proposals for anything in x/sync that you should think should be promoted to the standard library.
icholy commented 1 year ago

The generic version that @bcmills suggested is good but it doesn't work usefully today, there is no proposal or plan or even thought about how it might work, and it doesn't and never will support the case of a function that doesn't return any value.

@ianlancetaylor I believe @bcmills linked a working implementation which supports functions that don't return any values: https://go.dev/play/p/PxdlRBkKreO

cespare commented 1 year ago

@ianlancetaylor to clarify some of what you wrote:

@cespare Yes: the three non-generic functions were accepted.

I believe you are referring to the three functions in this comment, but two of them use type parameters, so it's a little confusing to call them "non-generic". (I guess they're less generic than the ideas that came after.)

2. The generic version that @bcmills suggested is good but it doesn't work usefully today, there is no proposal or plan or even thought about how it might work, and it doesn't and never will support the case of a function that doesn't return any value.

I think you must be referring to the idea that @bcmills was floating in this comment which requires more powerful type parameters than the language has. But @bcmills also had a reflect-based proposal, which is the one that @changkun linked to, which does work today and supports the case of a function that doesn't return any value. (I think its main shortcomings are lack of type safety and slowness of implementation.)

ianlancetaylor commented 1 year ago

I don't think that we are going to use the reflect version. Sorry for not mentioning it. Also sorry for confusion about using "generic".

Are you all saying that we should not accept this proposal after all? Why is this just coming up now?

cespare commented 1 year ago

Are you all saying that we should not accept this proposal after all?

I'm not saying that (though it sounds like @changkun is). After losing the func debate early on, my second-favorite proposal is the one that @rsc said is accepted (i.e., the three-function, not-reflect-based version). I'd like to see it land ASAP.

I just found the conversation (specifically the messages today from @rsc and @ianlancetaylor) to be quite confusing, since they seemed to be ignoring the most recent idea (@bcmills's reflect-based one) which had been the current topic of discussion when @rsc marked the proposal as accepted.

aclements commented 1 year ago

I may have missed it, but I think we haven't said what happens if the function f panics. I can think of three options, but only the first actually seems reasonable:

  1. If f panics, OnceFunc, etc, stores the panic value and the result function panics with that value every time it is called. This seems like the only reasonable option.
  2. If f panics, OnceFunc, etc, don't store the result at all, but call f again if the result function is called again. That seems inconsistent with the intent of OnceFunc, inconsistent with Once.Do, and also impossible to implement on top of Once.Do because it will only invoke the function once, even if it panics.
  3. If f panics, the first call to the result function panics, and subsequent calls return the zero value. This is the easiest to implement, but seems inconsistent with the intent of OnceFunc.
bcmills commented 1 year ago

@aclements, there is also a fourth option:

  1. If f panics, the first call to the result function panics, and subsequent calls hang forever (because the first call has not returned).

That might give a somewhat better debugging experience, because the value from the first panic does not need to be recover'd, and thus cannot corrupt the stack trace in the common case that the panic is not recovered further up.

However, it also increases the risk of deadlock if the panic is swallowed (such as by a fmt call or net/http handler).

aclements commented 1 year ago

@bcmills, that's a good point about the stack trace. We could arrange that the first panic has the full stack trace by immediately re-panicking from within the recovering defer. You would get only a partial stack trace on subsequent calls. That feels like a pretty reasonable compromise to me that doesn't have the danger of deadlock.

rsc commented 1 year ago

It seems like the answer should be "do what sync.Once.Do does", although I don't know what that is. Edit: I see, it just doesn't call f again. That's more problematic when there are values involved. I agree that #1 is the right option.

rsc commented 1 year ago

I've retitled the proposal to indicate what was accepted. I apologize for not doing that yesterday.

gopherbot commented 1 year ago

Change https://go.dev/cl/451356 mentions this issue: sync: implement OnceFunc, OnceValue, and OnceValues

aclements commented 1 year ago

I went ahead and prototyped this, including the panic handling we were just discussing. It seems to work quite nicely!

It did reveal some interesting limitations of inlining, partly related to the panic handling. This results in ~7x worse performance than sync.Once, though sync.Once has essentially optimal performance, so it's a high bar to hit. I've detailed the issues in my commit message. I don't think this affects the proposal at all, but we might want to look into fixing these issues in the inliner. I've been thinking anyway that the inliner could use some focus for 1.21 as part of ongoing PGO work.

ianlancetaylor commented 1 year ago

I tend to agree with @bcmills that we don't need any special panic handling here. If the function panics, it panics. We shouldn't try to preserve the panic for later calls, we should deadlock.

To put it another way: I don't think there is any real use case that will be served by arranging to panic on each call to sync.OnceFunc. So I don't think we should go out of our way to implement that.