Closed earthboundkid closed 1 year ago
Duplicate of #14423
@seankhliao, good find, but that issue was frozen before the modern proposal process existed. Either that issue should be reopened and put through the proposal process or this one should be unclosed, but I don't think it's fair to call it a duplicate when the old one was never actually evaluated.
The idea was clearly evaluated in the previous issue and declined. The decisions we made before the proposal process are just as valid.
I don't think it's fair to call 5 comments "clearly evaluated." The reception was mixed. Abbgrade was for it. Minux was against it. Bradfitz was neutral to positive on the idea if there was more data.
It ends with @griesemer saying,
This is not an issue, this is a feature request. Please discuss this first on one of the popular Go forums (mailing list, etc.).
I don't think he would have said "go discuss it somewhere else" if that discussion was precluded from having an effect because the issue was permanently closed once and for all. I think the idea was "go discuss it more and if it comes up again we can take another look." Now we have a formal process, so it's time to take a look. :-)
It was quite clear that it doesn't belong in strings, and the natural place for it now, slices, has already had a similar idea declined in #52006.
It doesn't work in slices because it would need a T comparable, which is confusing, or to be a find func, which as you note was already declined. Just because it could be a generic doesn't mean it should be. :-) I've been using my personal stringutils.First for years and for me it's above the bar to get it into the standard library. Maybe I'm wrong, but I think it's worth having a discussion.
I agree that the earlier issue didn't get a full proposal review. We can do it again.
That said, finding some more examples would help justify adding this.
And, in general, the strings and bytes packages are parallel. What would this look like in bytes, and would anybody use that variant?
That said, finding some more examples would help justify adding this.
I've been using a version of First for at least three years that I can recall, and I'm up to 19 uses in a 13,000 line project. It's pretty routinely useful for me. (It might go back further, and I've just forgotten the history of it.)
Going back to the examples from archive/tar above, I think there's a readability gain in hdr.Name = strings.First(gnuLongName, hdr.Name) and name = strings.First(realName, "GlobalHead.0.0"), because you can quickly tell what the preferred value is and what the fallback default is, whereas in the old code the first example was set with if gnuLongName != "" and the second was set with if realName == "".
What would this look like in bytes, and would anybody use that variant?
I suppose it should be First(...[]byte) []byte, but I agree that it is unlikely to be used much, since the main use is to set a default for a string value.
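To make the two shapes under discussion concrete, here is a minimal sketch. The names First and FirstBytes are hypothetical stand-ins for the proposed strings and bytes functions, not existing standard library APIs, and treating an empty byte slice as "blank" is an assumption.

```go
package main

import "fmt"

// First returns the first non-empty string in vals, or "" if there is none.
// Hypothetical stand-in for the proposed strings.First.
func First(vals ...string) string {
	for _, v := range vals {
		if v != "" {
			return v
		}
	}
	return ""
}

// FirstBytes is an assumed bytes counterpart: it returns the first slice
// with a non-zero length, or nil if every argument is empty.
func FirstBytes(vals ...[]byte) []byte {
	for _, v := range vals {
		if len(v) > 0 {
			return v
		}
	}
	return nil
}

func main() {
	realName := ""
	// The preferred value comes first, the fallback default last.
	fmt.Println(First(realName, "GlobalHead.0.0")) // GlobalHead.0.0
	fmt.Println(First("custom", "GlobalHead.0.0")) // custom
	fmt.Printf("%s\n", FirstBytes(nil, []byte("fallback")))
}
```

Reading left to right, the arguments are in order of preference, which is the readability gain described above.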
I’m doing some very basic searching on SourceGraph to find versions of this in the wild.
"profile"
for blank.default
template helper that is an any interface version of this""
into <none>
"FAIL"
for blank
Okay, that’s as much looking at search results as I feel like doing now. If anyone can do a more semantic search over a larger corpus, I would be interested to see the results. One thing that surprised me was how often a repo would have multiple versions of it. Also the env var default thing comes up a lot.
Edit: Couldn't help myself, and I found another one in Istio 😆 Gotta force myself to close the tab before I go crazy.
That's great data, thanks.
This can be written for any comparable type using generics today
func First[T comparable](vs ...T) T {
	var zero T
	for _, v := range vs {
		if v != zero {
			return v
		}
	}
	return zero
}
The type constraint could be loosened to any if #26842 gets accepted.
While often for strings, I've written similar for all kinds of types, though I don't think I've ever needed anything other than 2 values at a time.
It's quite common in dealing with configuration where the zero value models an absence to be replaced by a default.
I’ve had a toy repo with generic First for several years, but I’ve found that in practice I only ever use strings.
As for variadic vs a pair, most instances are just pairs, but I think the Go optimizer now optimizes the slice away, so you may as well have a variadic version for the occasional times when you need more than two.
It's quite common in dealing with configuration where the zero value models an absence to be replaced by a default.
Neither comparable nor any tightly constrains to types where var zero T is a robust sentinel value for inferring absence. I think this is a problem for a generic First: it's not foolproof enough.
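A small sketch of the concern, reusing the generic First quoted earlier in the thread. The variable names and the scenarios (a port of 0, an explicit false) are illustrative assumptions, chosen because the zero value is a legitimate setting in each case.

```go
package main

import "fmt"

// First is the generic helper from earlier in the thread.
func First[T comparable](vs ...T) T {
	var zero T
	for _, v := range vs {
		if v != zero {
			return v
		}
	}
	return zero
}

func main() {
	// A deliberately requested port 0 cannot be told apart from "unset".
	requested := 0
	fmt.Println(First(requested, 8080)) // 8080

	// An explicit false is silently replaced by a true default.
	verbose := false
	fmt.Println(First(verbose, true)) // true

	// With pointers, nil models absence and a pointer to 0 is a real value.
	zero, def := 0, 8080
	fmt.Println(*First(&zero, &def)) // 0
}
```

For strings the empty value is usually a safe "absent" marker, which may be why the string-only version draws fewer objections.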
This is clearly a useful operation, perhaps even useful enough to have in the standard library.
But is First the right name? Is it the name used anywhere else with this meaning?
If I saw strings.First(x, y, z) I'd probably expect that it returned x (and wonder what the point was).
In text/template (and also in Lisp and Scheme, where I took it from), the name for this operation is or.
It's also similar to the min/max builtins proposed in #59488, except that the item is selected in a less mathematical and more Go-specific way.
Even if it's mostly used for strings, it really feels not string-related to me and not a good fit for the strings package.
However, I think I would use it quite a bit for not-strings if it existed. There's a certain kind of operation I write regularly in Go which would be written using a ternary in another language. Something like (this is grabbed from some real code):
port := h.GRPCPort
if port == 0 {
	port = 8500
}
With a ternary expression, you might write something like
port := h.GRPCPort == 0 ? 8500 : h.GRPCPort
With a slices-based function, you could do
port := slices.Or(h.GRPCPort, 8500)
I think or works well in Lisp and text/template but slices.Or seems a bit mysterious. But maybe it could work.
A longer, but more self-evident name is slices.FirstNonZero.
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
However, I think I would use it quite a bit for not-strings if it existed. There's a certain kind of operation I write regularly in Go which would be written using a ternary in another language. Something like (this is grabbed from some real code):
port := h.GRPCPort
if port == 0 {
	port = 8500
}
That's #37165, which also uses a default port as an example. :-)
slices.FirstNonZero would presumably take an actual slice instead of a variadic argument, which is less ergonomic.
I'm fine with the name strings.FirstNonZero though.
I don't think this is necessarily a great idea, but just to consider it, you could have package bools with Or[T comparable](...T) T and Cond[T any](cond bool, ifVal, elseVal T) T. I think that having Cond probably changes the feel of the language too much though, because you'd be using a function call in a lot of places that use x = a; if cond { x = b } now.
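As a rough sketch of what those two signatures would do, written here as plain functions in package main (the bools package is hypothetical and does not exist). One thing the sketch makes visible: both are ordinary calls, so every argument is evaluated before the function runs, unlike an if statement.

```go
package main

import "fmt"

// Or returns the first non-zero value, or the zero T if there is none.
// Sketch of the hypothetical bools.Or.
func Or[T comparable](vals ...T) T {
	var zero T
	for _, v := range vals {
		if v != zero {
			return v
		}
	}
	return zero
}

// Cond returns ifVal when cond is true and elseVal otherwise.
// Sketch of the hypothetical bools.Cond.
func Cond[T any](cond bool, ifVal, elseVal T) T {
	if cond {
		return ifVal
	}
	return elseVal
}

func main() {
	grpcPort := 0
	fmt.Println(Or(grpcPort, 8500))                  // 8500
	fmt.Println(Cond(grpcPort == 0, 8500, grpcPort)) // 8500

	// No short-circuiting: the unused branch is still evaluated.
	calls := 0
	expensive := func() int { calls++; return 42 }
	_ = Cond(true, 1, expensive())
	fmt.Println(calls) // 1
}
```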
Even if it's mostly used for strings, it really feels not string-related to me and not a good fit for the strings package.
I guess the follow-up is, what package would it fit in? Looking through https://github.com/golang/go/issues/60204#issuecomment-1550320945 I think it is pretty surprising that this is so frequently about environment variables (or at least configuration variables with similar usage patterns).
I wonder if there might be a more Glasgow (not New Jersey)-style package for aggregating configuration from program constants, env variables, json/yaml/toml etc., and I think this functionality would probably be natural in that style.
OTOH, applying the New Jersey philosophy, maybe it's not objectionable that people are re-implementing this functionality ad-hoc - it doesn't seem error-prone, and people can be as narrow or as abstract as they want.
One place where comparable is insufficient for this use case is callbacks.
I've written a lot of code like this:
func New(cfg *Cfg) *Thing {
	t := &Thing{
		foo: cfg.Foo,
	}
	if t.foo == nil {
		t.foo = fooDefault
	}
	return t
}
If there were an or that handled zero-comparable types (either a builtin or language change that allows it to be written with generics) that would just be:
func New(cfg *Cfg) *Thing {
	return &Thing{
		foo: or(cfg.Foo, fooDefault),
	}
}
A more general scenario: use an Nth-like function instead of First:
func NthOrZero[T any](elems []T, n int) T {
	if n < 0 || n >= len(elems) {
		var empty T
		return empty
	}
	return elems[n]
}
and for an ok check:
func Nth[T any](elems []T, n int) (T, bool) {
	if n < 0 || n >= len(elems) {
		var empty T
		return empty, false
	}
	return elems[n], true
}
or if we had a Maybe or Optional type:
func Nth[T any](elems []T, n int) Optional[T] {
	if n < 0 || n >= len(elems) {
		return None[T]()
	}
	return Some(elems[n])
}
But is First the right name? Is it the name used anywhere else with this meaning? […]
In text/template (and also in Lisp and Scheme, where I took it from), the name for this operation is or.
I'm reminded of the SQL function COALESCE.
Returns the first non-NULL value in the list, or NULL if there are no non-NULL values. At least one parameter must be passed.
I've found First is quite a common operation beyond strings. In particular, selecting the first error between operation(s) and cleanup (if any). I have used n-ary versions for different types (errors, strings), but 2 is most common. Similar issue for default fallback for ints and other types.
Writing this now, I'd use a generic version similar to @jimmyfrasche 's example above.
I use errors.Join for that purpose. It’s a little different because it returns a multierror when necessary, but in the basic case you can treat it like “first error or nil”.
I use errors.Join for that purpose.
Typically the situations I encounter result in any errors beyond the First being irrelevant or a distraction - hence I don't use Join. Depending on the circumstance, either could be appropriate.
Throwing a vote in for @ianlancetaylor's Default. Possibly a variadic version if you want but preferably not.
Especially because of having errors.Join, or ignoring secondary errors as previous reply, I think Default with two arguments is a fine name and signature. Thinking of 90% of use cases.
Occasionally I do want the first non-nil, or first non-zero, value of a variadic list of inputs, but far less often, and if I did I'd want it called FirstNonZero or FirstNonNil, even if it did the exact same thing as Default, as it better expresses intent, especially if the former were variadic and the latter was only Default(a, b).
Galaxy brain: if I really want a variadic version I'd call some generic Reduce function and pass Default as the reducer function.
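For what the Reduce idea might look like, here is a sketch. Both Default and Reduce are hypothetical helpers written for this example; neither is a standard library function.

```go
package main

import "fmt"

// Default returns v unless it is the zero value, in which case it
// returns fallback. Hypothetical two-argument helper.
func Default[T comparable](v, fallback T) T {
	var zero T
	if v != zero {
		return v
	}
	return fallback
}

// Reduce folds s into a single value, starting from init.
// Hypothetical generic helper.
func Reduce[T, A any](s []T, init A, f func(A, T) A) A {
	acc := init
	for _, v := range s {
		acc = f(acc, v)
	}
	return acc
}

func main() {
	candidates := []string{"", "", "from-env", "from-file"}
	// Once the accumulator is non-zero, Default keeps returning it.
	fmt.Println(Reduce(candidates, "", Default[string])) // from-env
}
```

Unlike a dedicated variadic function, this version always walks the whole slice instead of returning at the first non-zero element.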
I'm liking slices.Coalesce[T any](...T) T.
slices.FirstNonZero would presumably take an actual slice instead of a variadic argument, which is less ergonomic.
I think you really want this function to take a variadic argument, though I guess that would make it slightly unusual among the other members of slices.
@cespare without #26842 it would have to be slices.Coalesce[T comparable](...T) T, so you couldn't use it for functions.
On the strings vs. generics question, we have evidence from code search that people are writing and using the strings version of this. The generic version might also be worth having around, but we don't really have evidence for that yet. I also think people aren't necessarily going to think to look in slices for a "first non zero" function or whatever it's called. To me the preferable thing would be to just add strings.Coalesce for now and circle back to the broader question later, maybe after #26842.
In Java this was called FirstNonNull(), and that is a semantically useful name. First() is a terrible name because the semantics are incorrect. To be semantically informative and fulfill the principle of least surprise, it would be called FirstNonEmpty() if it were for string only, FirstNonZeroValue() for comparable (though that makes its usefulness suspect), and FirstNonNil() for pointer types. Like others have said, a generic version is fraught with subtle issues as well.
This is trivial code that does not need to be cluttering up the standard library. There are lots of other more useful things to spend time on, like proper Enums or even a proper set slice implementation that enforces uniqueness.
This is trivial code that does not need to be cluttering up the standard library.
I think that people use the strings version of this often enough that it meets the bar for just having in the standard library instead of having three or however many copies in Istio.
The generic version is a harder sell because it doesn’t really fit into slices and it definitely doesn’t deserve its own package.
It would certainly be nice to do something here.
strings.Coalesce would be handy in some situations but you'd still need it for other types (though probably not the implied bytes.Coalesce).
An operator, like ?? in the related #37165, would cover all the cases but adding an operator is a large change.
slices.Coalesce written today would be limited to comparable types and func is one of the times where this comes up. With #26842, it could be written in a fully general manner but it would indeed be an odd one out in slices.
A coalesce builtin, defined like the new min/max, wouldn't need to worry about fitting in any package, would have a lower bar to clear than ?? (though still a high bar), and could be fully generic without having to wait on other language changes.
Just a note that if we add an operator, it seems to me that the operator should be ||. Which we already have. Not that I'm arguing in favor of an operator, I just think that if we go down the operator path there's no reason to introduce a new one. The meaning here is just a slight extension to what || already does.
Let's see how it feels to call it cmp.Or. Right now we have code like this in the go command:
GO386 = envOr("GO386", buildcfg.GO386)
GOAMD64 = envOr("GOAMD64", fmt.Sprintf("%s%d", "v", buildcfg.GOAMD64))
GOMIPS = envOr("GOMIPS", buildcfg.GOMIPS)
These would become:
GO386 = cmp.Or(Getenv("GO386"), buildcfg.GO386)
GOAMD64 = cmp.Or(Getenv("GOAMD64"), fmt.Sprintf("%s%d", "v", buildcfg.GOAMD64))
GOMIPS = cmp.Or(Getenv("GOMIPS"), buildcfg.GOMIPS)
The signature would be
// Or returns the first non-zero element of list, or else returns the zero T.
func Or[T comparable](list ...T) T
It's worth noting that the operation is called "or" in Lisp, Python, and many other languages, and conceptually it is returning one or the other of these values. cmp.Or() won't type-check but cmp.Or[Foo]() returns a zero Foo.
Thoughts?
The limitation to comparable is too unfortunate. I'd be fine with it in cmp if it worked over non-comparable types as well.
The Getenv case seems like it's more common than others. Maybe there should be an os.GetenvOr even in the face of any other changes?
@jimmyfrasche With what the language supports today I don't see a way to write a generic version that supports slices, maps, functions, or channels. Do you have any thoughts on how that could work? We could add maps.Or and slices.Or if that seems useful.
cmp.Or works for me.
I have had a version of it that uses reflection in my toolbox for a while, so it can skip zero length slices and maps, but it's much slower than a normal comparison and I can never bring myself to use it.
Proposal renamed to be about cmp.Or.
The major use for non-comparable types is funcs which can't really be handled generically.
If there's a runtime "is this all bits zero" predicate you can hook into to work around the lack of a general way to test zero-ness that would be fine by me since this function is 90% of the reason I'd want such a thing. If that's the route, exposing it as a cmp.Zero[T any](v T) bool would get the other 10%.
Zero-coalescing for functions seems to be widely applicable in net/http* packages and crypto/tls, which have a variety of configurable functions with default behaviors. That said, my own primary use would be for maps, especially maps of maps, where it would simplify creating the map for the first insert. I can think of three times I've done this in the last month, though unfortunately not in public code.
comparable
limitation, and short-circuiting is uncommonly but still occasionally useful. Given a zero-coalescing operator, I can't think of any reason to use cmp.Or. Is choosing the latter a reason to reject the former?
Here's some actual code I came across today
if event.Location.City != "" {
	p["city"] = event.Location.City
} else {
	p["city"] = nil
}
where p has type map[string]any.
By using the || operator, it could have been written as
p["city"] = event.Location.City || nil
@gazerro that would not work with any of the proposals as the types do not match.
@jimmyfrasche the || operator has been proposed but its semantics have not been explicitly defined. It has certainly been implied that the operands should have the same type, and the expression has the type of the operands.
However, to allow for the case where an expression with the || operator is assigned to a value of type any or passed as an argument to a parameter of type any, as in:
var x any = a || b
only in this case, it could be specified that the types of a and b may not be the same, but they must be assignable to the type of x.
@gazerro that would be very different from how other binary operators, including today's ||, work. https://go.dev/play/p/zXDM4a0KYsl
@jimmyfrasche absolutely, but there are many special cases in the spec. I think it's just a matter of considering whether it's worth it for this particular use case
At that point, aren't you really asking for a ternary conditional operator? This is just a restricted form where you want to retain part of the predicate in the consequent case.
@ianlancetaylor There are at least two ways to write a generic is-zero predicate in the language currently.
The simple way is reflect.ValueOf(v).IsZero()
The less simple way is
func Zero[T any](v T) bool {
	bp := (*byte)(unsafe.Pointer(&v))
	sz := unsafe.Sizeof(v)
	for _, v := range unsafe.Slice(bp, sz) {
		if v != 0 {
			return false
		}
	}
	return true
}
I did not benchmark but I'm sure that's faster than reflect and could be made faster still by special casing common sizes, preferring to check a word at a time, etc. And a magic runtime function with custom assembly per arch and treating it as a compiler intrinsic would go even further, surely.
Exporting Zero in cmp or elsewhere is perhaps a discussion for another thread but I think it's feasible to use it to implement the more general cmp.Or[T any].
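To illustrate the idea, here is a sketch of an any-constrained Or built on the reflect-based predicate mentioned above rather than the unsafe one. The names isZero and OrAny are hypothetical, and this is not how cmp.Or is implemented. Taking the address of v before reflecting is a choice made here so that a nil interface value is handled instead of panicking.

```go
package main

import (
	"fmt"
	"reflect"
)

// isZero reports whether v is the zero value of T, using reflection.
// Reflecting through &v keeps the static type, so nil interfaces work.
func isZero[T any](v T) bool {
	return reflect.ValueOf(&v).Elem().IsZero()
}

// OrAny returns the first non-zero value, or the zero T if none.
// It accepts funcs, maps, and slices, which comparable excludes.
func OrAny[T any](vals ...T) T {
	for _, v := range vals {
		if !isZero(v) {
			return v
		}
	}
	var zero T
	return zero
}

func main() {
	var custom func() string
	fallback := func() string { return "default" }
	fmt.Println(OrAny(custom, fallback)()) // default

	// A nil map is replaced by a fresh one before the first insert.
	var m map[string]int
	m = OrAny(m, map[string]int{})
	m["k"]++
	fmt.Println(m) // map[k:1]
}
```

With this predicate an empty but non-nil slice or map counts as non-zero, and the reflection cost is the trade-off raised earlier in the thread.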
An extremely common string operation is testing if a string is blank and if so replacing it with a default value. I propose adding First(...string) string to package strings (and probably an equivalent to bytes for parity, although it is less useful).
Here are three example simplifications from just archive/tar because it shows up first alphabetically when I searched the standard library:
archive/tar diff
```diff
diff --git a/src/archive/tar/reader.go b/src/archive/tar/reader.go
index cfa50446ed..bc3489227f 100644
--- a/src/archive/tar/reader.go
+++ b/src/archive/tar/reader.go
@@ -136,12 +136,8 @@ func (tr *Reader) next() (*Header, error) {
 		if err := mergePAX(hdr, paxHdrs); err != nil {
 			return nil, err
 		}
-		if gnuLongName != "" {
-			hdr.Name = gnuLongName
-		}
-		if gnuLongLink != "" {
-			hdr.Linkname = gnuLongLink
-		}
+		hdr.Name = strings.First(gnuLongName, hdr.Name)
+		hdr.Linkname = strings.First(gnuLongLink, hdr.Linkname)
 		if hdr.Typeflag == TypeRegA {
 			if strings.HasSuffix(hdr.Name, "/") {
 				hdr.Typeflag = TypeDir // Legacy archives use trailing slash for directories
@@ -235,13 +231,8 @@ func (tr *Reader) readGNUSparsePAXHeaders(hdr *Header) (sparseDatas, error) {
 	hdr.Format.mayOnlyBe(FormatPAX)
 
 	// Update hdr from GNU sparse PAX headers.
-	if name := hdr.PAXRecords[paxGNUSparseName]; name != "" {
-		hdr.Name = name
-	}
-	size := hdr.PAXRecords[paxGNUSparseSize]
-	if size == "" {
-		size = hdr.PAXRecords[paxGNUSparseRealSize]
-	}
+	hdr.Name = strings.First(hdr.PAXRecords[paxGNUSparseName], hdr.Name)
+	size := strings.First(hdr.PAXRecords[paxGNUSparseSize], hdr.PAXRecords[paxGNUSparseRealSize])
 	if size != "" {
 		n, err := strconv.ParseInt(size, 10, 64)
 		if err != nil {
diff --git a/src/archive/tar/writer.go b/src/archive/tar/writer.go
index 1c95f0738a..e9c635a02e 100644
--- a/src/archive/tar/writer.go
+++ b/src/archive/tar/writer.go
@@ -188,10 +188,7 @@ func (tw *Writer) writePAXHeader(hdr *Header, paxHdrs map[string]string) error {
 	var name string
 	var flag byte
 	if isGlobal {
-		name = realName
-		if name == "" {
-			name = "GlobalHead.0.0"
-		}
+		name = strings.First(realName, "GlobalHead.0.0")
 		flag = TypeXGlobalHeader
 	} else {
 		dir, file := path.Split(realName)
```