golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.88k stars 17.52k forks source link

reflect: back out multiple keys in key:value pair in struct tag #40281

Closed scorsi closed 3 years ago

scorsi commented 4 years ago

Hello πŸ‘‹

I'm actually boring to set the same value for multiple key in my struct tag. Maybe I missed something but here is the boring thing:

type MyStruct struct {
  Field1 string `json:"field_1,omitempty" bson:"field_1,omitempty" xml:"field_1,omitempty" form:"field_1,omitempty" other:"value"`
}

What I'm proposing is:

type MyStruct struct {
  Field1 string `json,bson,xml,form:"field_1,omitempty" other:"value"`
}

Thanks πŸ‘

ianlancetaylor commented 4 years ago

Just noting that this would mean that struct tag keys cannot contain a comma character. The current documentation at https://golang.org/pkg/reflect/#StructTag is "Each key is a non-empty string consisting of non-control characters other than space (U+0020 ' '), quote (U+0022 '"'), and colon (U+003A ':')." So this would be a change. Of course, it's fairly unlikely that anybody is using a tag key with a comma.

scorsi commented 4 years ago

It could be a breaking change of course.

rsc commented 4 years ago

I have a large corpus of Go code that I can scan at some point to see if anyone uses , in tag names in practice. Will try to do that before next week.

rsc commented 4 years ago

What if we use a space-separated list instead of a comma-separated one?

type MyStruct struct {
  Field1 string `json bson xml form:"field_1,omitempty" other:"value"`
}

The spaces form is more analogous to what happens in Go argument lists: in x int, y int, z int, you can remove the repeated int, leaving x, y, z int. If we do the same here, then in json:"x" bson:"x" xml:"x", you can remove the repeated :"x", leaving json bson xml:"x". If we inserted commas during this step, you'd need to explain where they came from (they weren't there before!).

The space form is also quite a bit less of a wall of text and thus easier to skim. Compare:

Field1 string `json bson xml form:"field_1,omitempty" other:"value"`
Field1 string `json,bson,xml,form:"field_1,omitempty" other:"value"`

And unlike comma, spaces cannot appear in valid tags today, so we would not be redefining technically valid syntax to mean something slightly different.

cristaloleg commented 4 years ago

Space-separated is much better and it indeed looks like Go argument list.

Regarding proposal: I haven't use more than 1 file tag (json or yaml) before, but last week have played with different config files and leaving all this verbose repetitions (field1 for JSON, also for YAML, also for TOML, also for ...) will make multi-tags much easier to use.

scorsi commented 4 years ago

I must admit that space separated could be a good idea which won’t break anything.

Personally I feel it less comprehensible than comma separated and in go we have comma between arguments (we don’t write a b c int so comma is more go in fact lol ok i did understand why space is more go in fact) but a choice have to be made and don’t belongs to me :)

Even with space separated, the main issue is resolved so it’s ok to me ! ;)

rsc commented 4 years ago

OK, so with the syntax I suggested last Thursday (space-separated instead of comma-separated keys), there is no concern about backwards compatibility. There hasn't been much discussion otherwise.

I did make some progress analyzing tags in my Go corpus last week, and I found that 0.06% of tags have commas in keys (outside quoted values). All of those are mistakes. So we could plausibly use comma and not break things, but the space is more in keeping with the current syntax anyway.

Based on the discussion above, this seems like a likely accept (with spaces, not commas).

To be clear, the new syntax would be:

type MyStruct struct {
  Field1 string `json bson xml form:"field_1,omitempty" other:"value"`
}
rsc commented 4 years ago

No change in consensus, so accepted.

bynov commented 4 years ago

Hi! I would like to work on that feature. Is it possible? :)

rsc commented 4 years ago

Sure, go ahead.

gopherbot commented 4 years ago

Change https://golang.org/cl/248341 mentions this issue: reflect: added multiple keys support for struct tags

deven96 commented 4 years ago

Hello @bynov . Are you still working on this?

bynov commented 4 years ago

Hi, @deven96 I have already sent PR on Gerrit. It's in review now :)

deven96 commented 4 years ago

Lol... darn it Goodluck mate @bynov

scorsi commented 3 years ago

Hello, into which version of go this feature will be available ?

cristaloleg commented 3 years ago

@scorsi probably Feb 2021, when Go 1.16 will be released.

gopherbot commented 3 years ago

Change https://golang.org/cl/274448 mentions this issue: reflect: document multiple keys in struct tags

fatih commented 3 years ago

I'm trying to add this new syntax to one of my tools gomodifytags which is used in many editors (i.e: VSCode, Vim, etc..). It allows one to add or remove struct tags in an easy way. You can find more details here: https://github.com/fatih/gomodifytags

One particular edge case I've found is, assuming the user wants to add a new tag to an existing field for the same value.

type T struct {
    Server string `json:"server"`
}

Adding the new key in the old format would produce:

type T struct {
    Server string `json:"server" xml:"server"`
}

But in the new format, the tool need to output it in:

type T struct {
    Server string `json xml:"server"`
}

So the problem I see here is, how should I, as a tool owner know which format is the correct one for the user? Outputting in the new format would mean that we should assume the application is built with the latest (i.e: 1.16) Go version. But if they run it in an older version, the new format won't work and it'll break their applications.

One can say that outputting in the old format should work and hence the tool always should output in the old format for backwards compatibility. If I choose this approach, then some other use cases arise. Assume the user already uses the new format:

type T struct {
    Server string `json xml:"server"`
}

If the user adds a new tag, because the tool outputs in the old format, suddenly it changes to something that the user didn't intended to use:

type T struct {
    Server string `json:"server" xml:"server" bson:"server"`
}

I think this is the only approach I can use for now. But then, I wonder how I can make it easier for the user to use this new syntax. I feel like there is a compromise I need to consider and whether this change would cause disruptions in the future because not a lot of services might run with the old versions. It looks like a ticking bomb waiting to explode and I hope we can reconsider the syntax due the reasons listed here.

uluyol commented 3 years ago

@fatih Isn't this what the go language version in the go.mod file is for? Can't you just use the new style if version β‰₯ 1.16, and otherwise stick with the old one?

marwan-at-work commented 3 years ago

@uluyol I used to think the same thing but I don't believe that's true. A go.mod file declaring 1.16 can still be compiled by 1.15 or lower.

ianlancetaylor commented 3 years ago

If the go.mod files says go 1.16, that means that the files in the module may use constructs introduced in language version 1.16. A person using a Go 1.15 or earlier compiler can still attempt to compile the module. That is not prohibited. But if an error occurs while compiling, the go tool will report "note: module requires Go 1.16".

So in general it is OK for a transformation tool to use Go 1.16 language constructs if go 1.16 appears in the go.mod file.

mvdan commented 3 years ago

I fully agree with @ianlancetaylor; a tool should use the go.mod language version for this decision. In this case it's slightly trickier, as an older version of Go would fail only when using vet/test to parse the struct field tag, and not at compile time, as I'm pretty sure struct field tags are not fully parsed at compile time.

But I still think that is okay. If a module declares Go 1.16 as the language version, and a downstream user insists on using Go 1.14 or 1.15, I argue it's up to them to be extra careful that their software still works even after it builds. A Go program/module building on an older version of Go does not mean that it will work fine. Fully vetting/testing it will the older Go version is the only way to be sure.

mvdan commented 3 years ago

I argue it's up to them to be extra careful that their software still works even after it builds

I should clarify, however, that I think this might end up being a problem with people installing Go programs via go get/install following README instructions. Those people are likely not Go experts, and they might be using the Go version that came with their system, often an older version. If their program install works, they would have no reason to think that the program might not work properly.

That said, I still think that is a problem with Go builds in general; this change just adds another way to end up in this weird spot.

rogpeppe commented 3 years ago

But if an error occurs while compiling, the go tool will report "note: module requires Go 1.16".

This heuristic works well for language changes that break existing syntax, but not so well for changes like this one that change behaviour but don't result in errors when using an old compiler version.

For example, say some third party package changes to use this feature, and I (still using go 1.15) update to use the new version. My code is now broken with no indication of the fact. go vet would pick this up, but it's uncommon for people to run go vet on third-party code (doing so very often spews out a raft of false positives).

When using an older Go compiler, it's common to import modules that declare newer Go versions, because any newly created module declares the current compiler version even if isn't using new language features, so seeing that happen isn't an instant red flag (aside: go list -f '{{.Path}} {{.GoVersion}}' -m all isn't hard to type, but isn't obvious either, if you want to check this).

I worry that in larger ecosystems, this change might end up with significant non-obvious brokenness that only emerges after things break in production.

I don't currently see an easy way around it.

ianlancetaylor commented 3 years ago

This heuristic works well for language changes that break existing syntax, but not so well for changes like this one that change behaviour but don't result in errors when using an old compiler version.

Yes. This is discussed in detail at https://go.googlesource.com/proposal/+/refs/heads/master/design/28221-go2-transitions.md .

This isn't quite a redefinition, as the new code generates a runtime error with pre-1.16 versions of the standard library, but you're right that it's problematic in that it doesn't generate a compile time error.

So I think I agree with you: although as I said above using the new syntax is in some sense permitted for Go 1.16 or later, full safety suggests not using the new syntax until Go 1.15 and earlier are no longer supported.

rogpeppe commented 3 years ago

This isn't quite a redefinition, as the new code generates a runtime error with pre-1.16 versions of the standard library

I don't see a runtime error when running this code under Go 1.15, so it seems like it actually is a redefinition.

ianlancetaylor commented 3 years ago

You're right. I was mistaken.

gopherbot commented 3 years ago

Change https://golang.org/cl/277092 mentions this issue: go/analysis/passes/structtag: recognize multiple keys per tag

gopherbot commented 3 years ago

Change https://golang.org/cl/277076 mentions this issue: cmd/vet: vendor in x/tools, update structtag vet check

leitzler commented 3 years ago

@ianlancetaylor just to sort out any confusion, is the decision here to include this change in Go 1.16 even though it indeed is a redefinition?

As I read https://go.googlesource.com/proposal/+/master/design/28221-go2-transitions.md#language-redefinitions it looks like redefinitions should be avoided:

I think the only feasible safe approach is to not permit language redefinitions.

I do understand that there might be exceptions for critical changes, but do this feature really falls into that category?

rsc commented 3 years ago

To be clear, the redefinition is that this program prints an empty string in older versions of Go (and is rejected by go vet in those earlier versions):

package main

import (
    "fmt"
    "reflect"
)

type T struct {
    Server string `json xml bson:"other"`
}

func main() {
    var t T
    fmt.Printf("json tag: %q\n", reflect.TypeOf(t).Field(0).Tag.Get("json"))
}

And now in a newer version, it will print json tag: "other".

So the unexpected behavior is that if people adopt this syntax when using Go 1.16, but then they publish a library and Go 1.15 users use it, then it won't work the same in the earlier versions, which will be blind to the tag.

We are not redefining existing programs (no existing programs use spaces in tag name syntax and pass go vet). We are however defining a new syntax that will behave silently differently in older versions of Go. The question is whether that's OK.

rsc commented 3 years ago

It seems like there are three paths forward. (1) Kill this feature forever. (2) Keep the feature as is, with the rocky transition. And (3) do a more staged transition like the //go:build lines. That staged transition looks like:

Then if someone adopts the new syntax ahead of their importers adopting a version of Go that understands it, the importers will get a clear failure message instead of silently different behavior. (And when Go 1.17 is out, Go 1.15 and earlier are officially unsupported.)

It does seem like maybe (2) is a bit too subtle for the case where you're depending on someone else's code and they've updated more quickly than you realized (or you've updated more slowly than they realized). So maybe we should do (1) or (3).

Thoughts?

rsc commented 3 years ago

Reopening to discuss transition.

myitcv commented 3 years ago

At least based on the discussion in #tools on Slack, 2 has seemed problematic to many, because it's subtle and "silent".

In terms of 1 vs 3, the question I had is what were the arguments in favour of this in the first place? i.e. is there a large corpus of code that would be demonstrably improved by this change, in such a way as to outweigh the cost of implementation/transition?

rogpeppe commented 3 years ago

I vote for 3, but there's still a problem for 1.15 users if a library upgrades directly from 1.15 to 1.17. It definitely narrows the window though.

leitzler commented 3 years ago

I can't see how this change is worth doing (3) so I would say (1) unless someone brings additional data.

marwan-at-work commented 3 years ago

I'd vote option 1 as well.

I don't believe the cost (breakage in tools, upgrade paths, and teaching newcomers about two different syntaxes inside struct tags) is worth the benefits.

And with tools like Fatih's gomodifytags a lot of this can be automated anyway.

OneOfOne commented 3 years ago

I personally vote for using ",", makes it more explicit and easier to read, otherwise #3.

scorsi commented 3 years ago

Why if instead of space we use coma as I suggested at the beginning ? Does it will fix that transition feature ? I understand the issue but killing that feature forever is a little bit too disproportionate no ? There is always drawbacks but we have to look into benefits and here there are. Just to sum one very useful: duplicating code is bad and error prone, I think we are all agreed, when creating structs for web app communicating in different formats and to different DB, you are duplicating a lot those struct tags : json, bson, xml, csv, pg, ...

leitzler commented 3 years ago

Does it will fix that transition feature ?

Unfortunately not, it would be a slightly worse situation since go vet won't detect the invalid tag in Go <= 1.15.

mdempsky commented 3 years ago

no existing programs use spaces in tag name syntax and pass go vet

FWIW, I argued in #3059 that fixing a cmd/compile type-checking mistake should be safe because the affected programs already failed go vet, and @ianlancetaylor and @randall77 disagreed and argue the fix should be gated by -lang.

In that issue, go/types (and thus go vet) already rejects programs that never use a variable except for assignments, but cmd/compile mistakenly treats variables as used if they're captured by reference by a function literal. Notably, in the case of #3059, the go/types type-checking failure cascades and affects downstream go vet / go test usage too; see https://github.com/golang/go/issues/3059#issuecomment-674334132 for an example where example.com/a exercises the cmd/compile issue, and it causes go vet example.com/b and go test example.com/b to fail.

I think we should be consistent about whether "passes go vet" affects the applicability of Go 1 compatibility.

--

I'll also say I think it's an error that the Go 1.N toolchain will silently accept modules that specify Go 1.(N+1). That precludes us from using -lang to redefine existing syntax except through the whole staged transition process, which just delays rolling out new features to end users.

E.g., if Go 1.15 didn't silently compile packages that declare go 1.16 in their go.mod, then one path forward here would be to make the compiler responsible for expanding `foo bar:"xyzzy"` into `foo:"xyzzy" bar:"xyzzy"` when -lang=go1.16 is specified. (Maybe that's still not desirable for some reason; e.g., how should this affect type identity?) Existing packages that might use spaces would continue to behave the same as always.

But we can't do that in Go 1.16, because the Go 1.15 toolchain will be supported for another 6 months. And even after that, some users are likely to continue using older, unsupported Go toolchains, and will risk silently miscompiling any packages that start using this feature after Go 1.17.

Edit: Apparently there is a note: module requires Go XXX message reported by cmd/go, but it's only printed if cmd/compile fails or prints any output.

ktye commented 3 years ago

colon instead of space would also work: `json:bson:"abc"` like in a chained assignment.

rsc commented 3 years ago

I'll also say I think it's an error that the Go 1.N toolchain will silently accept modules that specify Go 1.(N+1). That precludes us from using -lang to redefine existing syntax except through the whole staged transition process, which just delays rolling out new features to end users.

Noted, and I agreed with your position at first, but @ianlancetaylor convinced me otherwise when we made the decision.

Because we make the go version get set automatically during "go mod init" (which is good!), many packages overstate their strict Go version requirements, and unlike normal package requirements, we can't just push the build to the newer version of Go automatically. Rejecting Go 1.(N+1) during a Go 1.N build would needlessly break packages that end up with indirect dependencies with overstated Go 1.(N+1) requirements. This is a pretty likely situation and it would harm the ecosystem.

The way to avoid the eager rejections is to commit that we will not silently change the meaning of an existing program. We can redefine existing building programs to non-building programs, which won't affect older versions. And we can redefine existing non-building programs to become building programs, which will fail to compile on older versions and then trigger the "note: module requires Go 1.N". But we cannot redefine existing building programs to have a different meaning. We've committed to that restriction.

That is, we have taken "using -lang to redefine existing syntax" - in the sense of changing a program from one working meaning to another - off the table entirely.

3059 was not "redefine existing syntax" though. It was "break working program", which is exactly what the Go version in the module file (passed to the -lang flag) is for. So that came out right.

rsc commented 3 years ago

The discussion here seems to be leaning toward (1) remove the feature entirely for Go 1.16.

The rationale is that it's a fairly minor improvement, we can't ship it compatibly in Go 1.16, and it doesn't really seem worth the pain of a multistep transition.

JiamingLinn commented 3 years ago

Could it support multiline? Like this:

type MyStruct struct {
  Field1 string `json:"field_1,omitempty"`
                `bson:"field_1,omitempty"`
                `xml:"field_1,omitempty"`
                `form:"field_1,omitempty"`
                `other:"value"`
}
ianlancetaylor commented 3 years ago

@JiamingLinn See #15893, #38641, #42182.

leitzler commented 3 years ago

@batara666 please be respectful in your comments as per CoC: https://golang.org/conduct. Saying that something is "ugly" really doesn't add anything constructive.

rsc commented 3 years ago

Based on the discussion above, this seems like a likely accept.

rsc commented 3 years ago

Sorry for confusion - by accept I really meant roll back. Will update header.

batara666 commented 3 years ago
type MyStruct struct {
    Field1 string `json bson xml form:"field_1,omitempty" other:"value"`
}

Alright, we accept this version xoxo πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰