golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.98k stars 17.67k forks source link

cmd/go: stamp the pseudo-version in builds generated by `go build` #50603

Open 4ad opened 2 years ago

4ad commented 2 years ago

NOTE: The accepted proposal is https://github.com/golang/go/issues/50603#issuecomment-2181188811.


cmd/go embeds dependency version information in binaries, which is very useful. From Go 1.18 onwards, cmd/go also embeds VCS information in binaries, which makes it even more useful than it was before.

As #37475 mentions, people place version information in binaries using -ldflags='-X foo=bar', which requires an additional build wrapper. The new VCS stamping feature of cmd/go should alleviate the need for external wrapper, but I am afraid it comes short.

The version information, in the sense of Go's pseudo version is not recorded for the main module when doing go build:

: emerald:ver; go build
: emerald:ver; go version -m hello | grep 'mod.*hello'
    mod mgk.ro/hello    (devel) 
: emerald:ver; 

The version is recorded as expected when doing go install:

: emerald:ver; go install robpike.io/ivy@latest
go: downloading robpike.io/ivy v0.1.124
: emerald:ver; go version -m `which ivy` | grep 'mod.*ivy'
    mod robpike.io/ivy  v0.1.12 h1:qI7dnEiXhorB+za07W6qX3sG+IvBK4EUl38vUHAf53Q=
: emerald:ver; 
: emerald:ver; 

I am afraid this limitation of cmd/go will continue to force people to use external build wrappers that set -ldflags, which is rather unfortunate.

I am not the first to want main module version information in binaries, this has been already asked for in various issues, for example in #29814, which was closed as a duplicate of #37475, but it really wasn't a duplicate, as #37475 is about VCS information, and #29814 is about semantic versioning. Other examples of people asking for this feature are mvdan/sh#519 and https://github.com/golang/go/issues/29228#issuecomment-449554128 where various workarounds were proposed.

Speaking of workarounds, the only workaround that I know that currently works would be to create a local module proxy and pass GOPROXY to go install, but that is an extremely high-overhead workaround, and go install is not a replacement for go build anyway, since go install comes with some rather severe limitations regarding how vendoring works and what you can put in go.mod, and go install doesn't support controlling GOBIN when cross-compiling.

I realize that Git tags are a local concept, and by doing the "wrong" git operations one could come up with a different pseudo-version for the same source code. I am afraid I don't have any solution or suggestion regarding this git misfeature, except to note that even in this case the hash information is recorded correctly, and in every case by the virtue of having access to the local source code the programmer can always do some local operation that has the potential to cause a version mislabeling. Git is just more prome to do this by accident, but the ability is there, always.

I don't have any stats to back this up, but from my experience most corporate source code is built by go build, not go install, and it would be great if somehow Go's notion of versioning would be stamped by go build.

CC @bcmills @mvdan @rsc

mvdan commented 1 year ago

Here's something I think we never covered - would this stamping from VCS info always happen, or only when -buildvcs is enabled, be it via -buildvcs=true or when the default -buildvcs=auto finds the git tool and .git data to be present? I ask this in the context of https://github.com/golang/go/issues/53976#issuecomment-1711440904.

ericrrath commented 11 months ago

Expanding on the comment by @maja42, another use-case is vulnerability scanners. I think all these scanners work by examining the artifact, e.g. a golang binary, and extracting enough information to derive a CPE for each distinct application/library/module, and then looking up any vulnerabilities for those CPEs in whatever vuln database is used. If the scanner can't derive a CPE for a given component, then it can't lookup vulns for that component.

Trivy currently uses debug/buildinfo to derive CPEs for the dependencies, but does not look at the "main" module, possibly because it rarely sees usable values for the "main" module (pure speculation on my part).

If the debug/buildinfo fields are the canonical place to put this kind of name-and-version information, then populating it in more cases would allow the scanners to report more vulns.

samthanawalla commented 4 months ago

Thank you for all your ideas. This is definitely an important feature. After considering all of your ideas and discussing with @matloob and @rsc, we have landed upon this format:

Only in the presence of local VCS information: runtime/debug.BuildInfo.Main.Version for a go build will not be set to (devel) anymore but instead it will have the following version:

v[tag]+[optional dirty]

When the current commit matches a tagged version.

[optional dirty]

dirty: there are uncommitted changes in the build. A missing dirty label indicates that the build is ‘clean’ and has no uncommitted changes.

Example:

[Pseudo version]+[optional dirty]

When the current commit does not match a tagged version.

Example:

If you have any feedback, please let us know! We will try to get this in 1.24.

apparentlymart commented 4 months ago

The proposed format makes sense to me when either the work tree is dirty or the currently-selected commit doesn't exactly match a tag.

It's not clear to me from the most recent comment whether there would also be a special case to use the regular version number alone when the current commit exactly matches a tag. Ideally I'd like for that case to be indistinguishable from the treatment of libraries, but I don't know if there are hurdles that make that infeasible.

If the outcome were to use the proposed format in the dirty or not-tagged case but to use the tag-derived version number when at tag exactly matches then I think that would be sufficient for the codebase I help maintain in my day job to drop its own redundant version number tracking and use the toolchain-generated version information exclusively, which would be great because we'd then get more reliable information for non-release builds (that are always just called "v1.2.3-dev" for us today).

apparentlymart commented 4 months ago

One potential caveat:

Some version constraint systems use lexical ordering to decide which is the newest between two prerelease versions that have the same base version. In that case v1.2.3-devel-anything would sort after v1.2.3-alpha-anything or v1.2.3-beta-anything.

I don't think that's a blocking concern -- it's questionable whether it's ever meaningful to sort development builds relative to release builds anyway -- but wanted to raise it anyway since it was a hazard I've run into in the past in a different context.

seankhliao commented 4 months ago
hyangah commented 4 months ago

From https://github.com/golang/go/issues/50603#issuecomment-1069531971 I thought we'd use the exact tag as the version, with extra build setting info (e.g. VCS info or lack of it, install-vs-build, etc). I am still in favor of the idea because:

More specific example:

Let's assume we found a vulnerability in a binary built from a clean checkout of v1.2.3, and it's fixed in v1.2.4. All release note, bug triaging, etc are done with this short tag v1.2.3 most likely, but the released binary has v1.2.3-devel-20240620130020-daa7c0413123 since go build was used. That can be confusing.

When creating a CVE, we should use v1.2.3-devel-20240620130020-daa7c0413123 as the vulnerability introduction version, but use v1.2.4-devel-2024... as the fix version. This is less user-friendly. Moreover, I heard version range comparison involving pseudo versions can be complex. (@golang/vulndb, @neild)

The version in the build info is also used by the Go telemetry. To collect the telemetry, the version string should be explicitly listed in the configuration. I don't think the binary with version v1.2.3 and the binary with v1.2.3-devel-20240620130020-daa7c0413123 are different enough to warrant separated tracking.

zpavlinovic commented 4 months ago
  • What happens when there's no tag at the current commit?

I guess v0.0.0 will be used?

  • The separator used is -, but semver uses . to separate build identifier components, any specific reason for using -?

Could you expand on that? - is currently used in pseudo versions.

sudo-bmitch commented 4 months ago
  • The separator used is -, but semver uses . to separate build identifier components, any specific reason for using -?

Could you expand on that? - is currently used in pseudo versions.

It should be one dash followed by dot separated components: https://semver.org/#spec-item-9

seankhliao commented 4 months ago

The current pseudoversions use different base versions based on existing tags in the current of parent commits:

  • vX.0.0-yyyymmddhhmmss-abcdefabcdef is used when there is no known base version. As with all versions, the major version X must match the module’s major version suffix.
  • vX.Y.Z-pre.0.yyyymmddhhmmss-abcdefabcdef is used when the base version is a pre-release version like vX.Y.Z-pre.
  • vX.Y.(Z+1)-0.yyyymmddhhmmss-abcdefabcdef is used when the base version is a release version like vX.Y.Z. For example, if the base version is v1.2.3, a pseudo-version might be v1.2.4-0.20191109021931-daa7c04131f5.

The dot separator (.)0. is used to ensure these sort before any user provided build identifiers.

Given that we already have existing pseudoversions that can identify commits, I'm not sure why we need a separate format for local builds, beyond marking it as either a local clean or dirty build (maybe build metadata +clean vs +dirty?)

samthanawalla commented 4 months ago

It's not clear to me from the most recent comment whether there would also be a special case to use the regular version number alone when the current commit exactly matches a tag.

I see your point. It would be useful to drop the pseudo version if the current commit matches a tag. My understanding is that they may need to be separate. A local build might inherently be different than one from one that is go install'd and such we don't want to confuse the two. I will discuss more with @rsc and @matloob and circle back.

v1.2.3-devel-anything would sort after v1.2.3-alpha-anything

Interesting point. Like you said, I don't think it would be meaningful to sort a local build against other pre-release versions, but let's add a -0 to have it be less than any pre-releases. So we have v1.2.3-0-devel.

What happens when there's no tag at the current commit?

vX.0.0 if there is no such tag.

Sorting before seems a bit weird, especially if there are local changes layered on top of an existing tagged/release version?

Good point! Let's have it sorted after. Upon further investigation, copying the psuedo-version behavior from go install makes sense here which bumps the version if the commit is newer than the last release. https://cs.opensource.google/go/go/+/master:src/cmd/vendor/golang.org/x/mod/module/pseudo.go

So we will perform this operation: vX.Y.(Z+1) if a tag exists.

The separator used is -, but semver uses . to separate build identifier components, any specific reason for using -?

This we're not completely sure on. But I suppose a combination of '.' and '-' makes more sense.

Taking into account the above points, maybe instead we should do:

v[X.Y.(Z+1)]-0.[timestamp].devel-[commit]+[optional dirty]

Given that we already have existing pseudoversions that can identify commits, I'm not sure why we need a separate format for local builds

I agree with you. I think it makes sense to closely follow the existing structure from: https://cs.opensource.google/go/go/+/master:src/cmd/vendor/golang.org/x/mod/module/pseudo.go

@hyangah I suppose it comes down to the intent of how these pseudo-versions will be used. It's not clear to me if there should be a clear separation between a local build and one based on a go install. If we drop the pseudo version when the current commit matches the tagged version, then that might clear up some confusion. Let me discuss this more in depth with @rsc and @matloob and I will update when I have a clearer answer.

Thanks for the help guys.

I will update my original comment to reflect these changes.

mvdan commented 4 months ago

I want to reiterate what @hyangah said - as of 2022, and when the proposal was accepted, we had already reached consensus that a local build of a commit on a semver tag should result in a module version reflecting just that semver string. See https://github.com/golang/go/issues/50603#issuecomment-1070789161 for example. I don't think new information has come to light since then, so I'd be very confused if we suddenly decided to implement something else.

zpavlinovic commented 4 months ago

Further, if a proposed pseudo-version of a clean local commit with a semver tag is syntactically different from the semver tag, then how does that affect version.Compare? It should return 0 imo, but I have no idea how one would actually implement that.

matloob commented 4 months ago

@hyangah @mvdan About stamping the tagged version without any suffixes when doing a go build in a clean repo:

I've been thinking about this and I can't prove to myself that we'd get the same build running go build package from within the module and from running go install package@version. It seems like at the least we'd have to fall back to adding a suffix if there's a replacement or exclude because in those cases go install would refuse to do the build.

I'm also not totally sure about how module pruning affects a module loaded from go install package@version vs go build. Looking at the comment @mvdan linked it looks that Bryan thought this was okay, but I would think that going through go install package@version vs go build the lazy loading horizon would be at a different level because with go install package@version, the module containing the package isn't treated as the main module? I'll try to think about this some more.

It's likely there's no issue there and I'm overthinking it. It might also be that these issues don't matter because the version is not meant treated as 100% accurate.

samthanawalla commented 4 months ago

@zpavlinovic how does that affect version.Compare?

I believe version.Compare only compares go versions. I think you meant semantically comparing module versions?

If we cannot guarantee the same behavior of go install and go build with @matloob's concerns then it might make sense to add a +build suffix? (similar idea to what Sean had suggested with +local) i.e. v1.2.4+build This would compare equally to v1.2.4 but imply that they were built differently.

mvdan commented 4 months ago

It might also be that these issues don't matter because the version is not meant treated as 100% accurate.

That is my understanding from the previous consensus. go install pkg@version via a GOPROXY blindly trusts the info from the proxy to produce a module version. Similarly, with GOPROXY=off, we blindly trust the info from the original VCS state, which can easily be altered in many cases. Blindly trusting the local VCS state does not seem any worse in this respect - if anything, it's consistent.

I don't have a strong opinion on a suffix like +build or +local to distinguish local builds, but I worry it could easily confuse users and cause issues for module developers who now may need to deal with this suffix inconsistency. If the produced build is otherwise equivalent to go install pkg@version with GOPROXY=off, then I definitely think the resulting module version should be the same.

It's also worth noting that one could often use the presence and contents of the stamped -buildvcs metadata to detect that a build was made locally and a module version was derived from it. If we only stamped a VCS-derived module version for a local build when -buildvcs was enabled, then a developer could use the presence of the VCS information to tell that a module version was derived from the local VCS state. That is what I would find most intuitive as both a user and developer.

rsc commented 4 months ago

I too am confused why we would define a new pseudo-version syntax instead of using the existing algorithm and derivation code. Let's talk more about this next time we meet, @matloob and @samthanawalla.

samthanawalla commented 4 months ago

@rsc @matloob and I discussed this today.

I have revised my original comment to reflect the decision we came to. See #50603 (comment)

When the current commit matches a tagged version we will use v[tag].

When the current commit does not match a tagged version or there are uncommitted changes, we'll use the existing pseudo version format. along with an optional dirty tag.

(We dropped the devel and the +build)

DavidGamba commented 4 months ago

Are uncommitted changes only considered for the directory where the go.mod lives? I work in monorepo environments and rarely ever achieve a full clean repo.

samthanawalla commented 4 months ago

@DavidGamba Are uncommitted changes only considered for the directory where the go.mod lives?

We were planning to rely on what the VCS state is, I.e. git status. I don't think it makes sense to only consider the directory where go.mod lives because it introduces additional states to consider which I think may make the version info more confusing. But I could be wrong.

However as a compromise, would v[tag]+dirty work for your use case?

If your current commit matches a tagged version but you have uncommitted changes, you would get v[tag]+dirty instead of [pseudoversion]+dirty.

DavidGamba commented 4 months ago

I would prefer to be able to push a binary as is from my machine if there are no changes in the the given go module. Other than a go.work file and possible env vars there shouldn't be any dirty files affecting the go binary itself so marking the binary as dirty when there are no changes in the module itself seems overly cautious.

it introduces additional states to consider

I am not sure I know enough about Go to know anything else other than the go.work and env vars that introduce additional states. I would love to learn a bit more about what else affects a go module.

At the end of the day, having the version stamped will be a win even if I can only get the non-dirty version out of CI or a clean clone.

samthanawalla commented 4 months ago

By additional states I meant to say version control states.

I do get your point but a tag is more a property of the repo and not just the main module. Changes to the repo as a whole will reflect accordingly in the stamped version.

While it may be overly cautious, I don't think we will support this use case as of now. But that could change in the future if necessary.

Updated #50603 comment to include v[tag]+dirty use case.

matloob commented 4 months ago

@DavidGamba I would be interested in understanding your use case better. Our expectation is that those planning to do a build of a go program at a given version would check out the appropriate version and do a clean build. Our plan is to reuse the vcs.modified field in the buildinfo in determining whether a build is clean. That would help avoid doing extra work for what we believe is an uncommon case. But if it isn't an uncommon case we'd like to know.

zpavlinovic commented 4 months ago

vcs.modified seems to be set here. My understanding of the code is that this value is false by default unless one of known versioning systems is used. For those systems, status command is used to get the value for modified using the current working directory.

I would prefer to be able to push a binary as is from my machine if there are no changes in the the given go module. Other than a go.work file and possible env vars there shouldn't be any dirty files affecting the go binary itself so marking the binary as dirty when there are no changes in the module itself seems overly cautious.

The more and more I think about this, the more I get the feeling that having a completely precise solution is either extremely hard or impossible.

If a module has a replace directive pointing to a local directory outside of the module, then changing that replacing directory content could result in a different binary. To make things more complicated, this replacing directory might be outside of a repo where the module is.

It seems to me that the current approach is making a decent compromise. It will cover the more common case where the replacing module is in the same repo. Of course, it might add +dirty if part of a repo completely unrelated to module is being changed.

Regarding monorepo, it seems that the current approach will always have vcs.modified set to true. An example is svn (it does not have a Status method so computing modified will be skipped). It would be good to verify that.

DavidGamba commented 4 months ago

My use case is not a major use case since official builds will come from CI. It is just in those cases where you want to push something out that doesn't have a pipeline yet. Cloning a clean repo for those is not a major inconvenience and worst case the binary will just have the +dirty label.

gopherbot commented 3 months ago

Change https://go.dev/cl/596035 mentions this issue: cmd/go: stamp the version for binaries built with go build.

thepudds commented 3 months ago

Would someone mind summarizing exactly where this new version info will show up in runtime/debug.BuildInfo?

Is it runtime/debug.BuildInfo.Main.Version?

(Sorry if this is spelled out above – I did some re-skimming and some Ctrl-F, and I saw a couple of suggestions, I’m not sure I saw a definitive statement).

samthanawalla commented 3 months ago

@thepudds Is it runtime/debug.BuildInfo.Main.Version?

Yes! No worries, there's a lot of comments.

Updated #50603 comment to reflect this.

samthanawalla commented 2 months ago

https://go.dev/cl/596035 is merged which adds version stamping for git only. This will come out in 1.24.

I have not looked into the other VCS's hg(Mercurial), svn(Subversion), bzr(GNU Bazaar), or fossil and unsure whether they will be added by 1.24.

I will leave this issue open for now.

mvdan commented 2 months ago

Thanks @samthanawalla, I just gave it a try and it seems to be working as advertised!

I'm still slightly surprised at the appearance of +dirty when we already track this information via vcs.modified. Why duplicate the information when it is already there? It doesn't give me 100% assurance that the lack of +dirty means the version is to be trusted anyway, as a local git tag can be modified, as discussed before. Personally I would omit this feature - it wasn't in the original design, and I still don't understand why it has been added.

I also notice that the version stamping does not happen with -buildvcs=false. That's probably reasonable, but we should document this properly. I had asked this question some time back in https://github.com/golang/go/issues/50603#issuecomment-1711441610 but the proposed design wasn't clear before we had an implementation.

samthanawalla commented 2 months ago

@mvdan Awesome, glad to hear!

While vcs.modified technically captures this information, the decision to surface it within the version is driven by making it immediately visible to those who primarily focus on the version. It's not necessarily motivated by trust but rather transparency and better 'book keeping'. It more directly conveys the relationship between tag information and how VCS changes can influence the build.

Does +dirty create any problems for tooling? My understanding is that it can easily be chopped off for those who don't need it but is useful for those who may intentionally or unintentionally ignore vcs.modified and only look at the version.

If there are specific use cases where +dirty causes issues for tooling then we can still change that as there is plenty of time till the 1.24 release. However we believe the enhanced visibility it provides into the build's status outweighs any potential drawbacks.

Of course, we're open to hearing further feedback and specific examples :)

I also notice that the version stamping does not happen with -buildvcs=false

Yes correct. Sounds good I will be sure to document that aspect as well.

gopherbot commented 2 months ago

Change https://go.dev/cl/605615 mentions this issue: doc/next: document version stamping for go builds within a Git Repo

mvdan commented 2 months ago

I don't feel strongly about the presence of a +dirty suffix, nor do I have a specific use case that gets broken by it. Like you say, it can be chopped off fairly easily. Although I could say almost the same about adding it myself from vcs.modified - this is more a matter of what the default behavior should be. I don't think choosing to be explicit and verbose by default is wrong per se.

However, I do find it an odd default because I often end up with +dirty builds when the binary is identical. For example, I regularly capture cpu or memory pprof files, and I sometimes leave them around for a while as they are harmless. Because I don't explicitly gitignore them, they make my local binaries +dirty, even though they don't participate in the build process at all.

Another similar scenario that I'm running into is building binaries for many platforms at once. I tend to build the binaries in the local directory, inside the module, to then upload them as release archives. And once again they are not gitignored - which is fine, as I delete them soon after, and they don't participate in the build. But once again, +dirty shows up for any build after the first one:

$ GOOS=linux GOARCH=amd64 go build -o binary-linux-amd64 ./cmd/cue
$ GOOS=linux GOARCH=386 go build -o binary-linux-386 ./cmd/cue
$ ./binary-linux-amd64 version
cue version v0.11.0-0.dev.0.20240819151828-81d6f8bfcd61

go version devel go1.24-527610763b 2024-08-15 23:43:00 +0000
      -buildmode exe
       -compiler gc
  DefaultGODEBUG asynctimerchan=1,gotypesalias=0,httpservecontentkeepheaders=1,tls3des=1,tlskyber=0,x509keypairleaf=0,x509negativeserial=1
     CGO_ENABLED 1
          GOARCH amd64
            GOOS linux
         GOAMD64 v3
             vcs git
    vcs.revision 81d6f8bfcd61eeff2ec60ac71c8cc9867e6853e9
        vcs.time 2024-08-19T15:18:28Z
    vcs.modified false
cue.lang.version v0.11.0
$ ./binary-linux-386 version
cue version v0.11.0-0.dev.0.20240819151828-81d6f8bfcd61+dirty

go version devel go1.24-527610763b 2024-08-15 23:43:00 +0000
      -buildmode exe
       -compiler gc
  DefaultGODEBUG asynctimerchan=1,gotypesalias=0,httpservecontentkeepheaders=1,tls3des=1,tlskyber=0,x509keypairleaf=0,x509negativeserial=1
     CGO_ENABLED 0
          GOARCH 386
            GOOS linux
           GO386 sse2
             vcs git
    vcs.revision 81d6f8bfcd61eeff2ec60ac71c8cc9867e6853e9
        vcs.time 2024-08-19T15:18:28Z
    vcs.modified true
cue.lang.version v0.11.0

Arguably this is all an issue affecting vcs.modified as well, which can be seen by cue version above printing the BuildInfo setting lines as well. I would agree with that - if any files make VCS "dirty", but go list can tell that they do not participate in the build (not known source files, not used for go:embed, etc), then they shouldn't make built binaries "dirty".

Even though this was already an issue for vcs.modified, I honestly didn't really care too much before this point. Now that it's causing me +dirty suffixes in locally built module versions more often than not, it is causing me some issues. So I'd be fine with keeping +dirty suffixes as long as them and vcs.modified did not care about the dirtiness of files which do not participate in the Go build.

Speaking of... what if I had a new uncommitted Go file which did participate in the Go build, but was gitignored, so it wouldn't show up in git status? While this is technically not "VCS dirty", I would argue it should still mark the binary version and vcs.modified as dirty, because the Go tool can tell that it built a source file which is not tracked by VCS at all. Does this happen today?

hyangah commented 2 months ago

@samthanawalla Thanks for this feature! This will help simplify the future release workflow of vscode go release greatly.

One more tuning request: I noticed that some projects use the tools.go hack to pick govulncheck or gopls versions. (note: imo, listing tools like gopls to tools.go is always a mistake, but listing tools like govulncheck in tools.go is understandable)

However, tools built this way can be very different from the tools built with go install <tool>@<version> or from a clean checkout of the tool's source code.

For example,

$ cat go.mod
module xxx

go 1.21.0

require (
    golang.org/x/tools/gopls v0.16.2-pre.1
    golang.org/x/vuln v1.1.0
)
...
$  gotip build golang.org/x/tools/gopls

This gopls binary is very different from gopls@v0.16.2-pre.1.

$ gotip install golang.org/x/tools/gopls@v0.16.2-pre.1
$ gotip version -m ~/go/bin/gopls > install.txt
$ gotip version -m ./gopls > unclean.txt
$ diff install.txt unclean.txt
1c1
< /Users/hakim/go/bin/gopls: devel go1.24-f38d42f2c4 Wed Aug 21 01:11:27 2024 +0000
---
> ./gopls: devel go1.24-f38d42f2c4 Wed Aug 21 01:11:27 2024 +0000
7,8c7,8
<   dep golang.org/x/mod    v0.19.0 h1:fEdghXQSo20giMthA7cd28ZC+jts4amQ3YMXiP5oMQ8=
<   dep golang.org/x/sync   v0.7.0  h1:YsImfSBoP9QPYL0xyKJPq0gcaJdG3rInoqxTWbfQu9M=
---
>   dep golang.org/x/mod    v0.20.0 h1:utOm6MM3R3dnawAiJgn0y+xvuYRsm1RKM/4giyfDgV0=
>   dep golang.org/x/sync   v0.8.0  h1:3NFvSEYkUoMifnESzZl15y791HH1qU2xm6eCJU5ZPXQ=
12c12
<   dep golang.org/x/vuln   v1.0.4  h1:SP0mPeg2PmGCu03V+61EcQiOjmpri2XijexKdzv8Z1I=
---
>   dep golang.org/x/vuln   v1.1.0  h1:ECEdI+aEtjpF90eqEcDL5Q11DWSZAw5PJQWlp0+gWqc=
18c18
<   build   DefaultGODEBUG=asynctimerchan=1,gotypesalias=0,httplaxcontentlength=1,httpmuxgo121=1,httpservecontentkeepheaders=1,panicnil=1,randseednop=0,tls10server=1,tls3des=1,tlskyber=0,tlsrsakex=1,tlsunsafeekm=1,winreadlinkvolume=0,winsymlink=0,x509keypairleaf=0,x509negativeserial=1
---
>   build   DefaultGODEBUG=asynctimerchan=1,gotypesalias=0,httplaxcontentlength=1,httpmuxgo121=1,httpservecontentkeepheaders=1,randseednop=0,tls10server=1,tls3des=1,tlskyber=0,tlsrsakex=1,tlsunsafeekm=1,winreadlinkvolume=0,winsymlink=0,x509keypairleaf=0,x509negativeserial=1

Once the tool dependency proposal #48429 is implemented, some users may attempt to pick a specific version of govulncheck (or gopls) and such unexpected and untested versions of tools may appear more often. Currently golang.org/x/telemetry uses the main module's version string as the tool's version. The crash reports and counters from such untested versions are not distinguishable, and can add much noise.

Is it possible to use a different version string (+dirty) or add an extra build setting info if a tool is built this way? That will help us filter out telemetry from unexpected builds of tools.

matloob commented 2 months ago

@hyangah I had a similar (though slightly different concern) that the build with go build would get a different build list from the one using go install. I think once a build is being done outside of the go install pkg@version context, the version of the installed binary's module tells us a lot less about the dependencies used to build the binary.

What do you think about adding a buildinfo field for the module path of the main module (or whether the build was done from a workspace, or from a pkg@version context) and telemetry could treat the binary different depending on which it is?

gopherbot commented 2 months ago

Change https://go.dev/cl/595376 mentions this issue: extension/tools/release: add build-vscgo subcommand

samthanawalla commented 2 months ago

@mvdan I see your point. Multiple sequential builds will create a +dirty tag because of the binary that gets created which does not seem ideal. I will see about making an exception for that.

As far as the dirtiness of files which do not participate in the Go build affecting +dirty, we discussed that above: https://github.com/golang/go/issues/50603#issuecomment-2195420749

We think the most common case is a clean build from CI. Otherwise can clone & build from a clean repo.

Yes an uncommitted file that is gitignored will produce a build without a +dirty tag. This is a drawback of the current implementation. However if this is an uncommon situation, then perhaps we may need to compromise somewhere.

gopherbot commented 2 months ago

Change https://go.dev/cl/609155 mentions this issue: cmd/go: add Mercurial based version stamping for binaries

mvdan commented 2 months ago

@samthanawalla @matloob I feel like whether the builds happen locally or in CI is a red herring. Whether I do the builds locally or in CI, placing the binaries inside the same cloned repository seems perfectly fine to me. They don't affect the build in any way, nor do they touch any committed files.

git describe --always --dirty agrees with me here, for what it's worth; it only adds a -dirty suffix if I modify or delete any tracked files, or if I git add any new files. Any new untracked files, even if they are not gitignored, do not cause a -dirty suffix. So our notion of "dirty" seems to misalign with git's, and git is easily the most common VCS system people are using.

And as explained before, I think it's a mistake for go build to not mark a binary as dirty if a gitignored Go file is present in the build. I just tested this, and the resulting build was not marked as "dirty", even though it definitely built different source code than what is committed.

I realise my arguments are against vcs.modified; your addition of +dirty simply follows vcs.modified. I think both follow a flawed notion of "dirtyness" that is not useful to Go developers and can be confusing to Go users. I think we should adjust it so that:

1) A build is dirty when there are new and VCS-ignored files which contribute to the build (e.g: Go files part of a built Go package, or any embedded files, etc). 2) A build is not dirty when there are new and non-VCS-ignored files which do not contribute to the build (e.g: built Go binaries, or pprof files, etc).

That is, Go builds should only be marked as "dirty" if we can tell that they used a different set of source files compared to what is committed. That is honestly what I thought vcs.modified already was, and I think it would be a much better definition than what we currently have. As currently implemented, I'm afraid that I'll have to strip or ignore vcs.modified build settings and +dirty version suffixes, because they'll often confuse users for no benefit.

samthanawalla commented 2 months ago

@mvdan Conclusion from the contributor bi-monthly tools meeting: While we'd ideally like a completely precise solution for determining dirtiness, it looks like that would require some extensive modifications to the build system (we don't track all files that contribute to the build right now.) This might also introduce significant overhead and slow down builds. Given the potential complexity and impact, let's move this discussion to a dedicated proposal for refining vcs.modified.

gopherbot commented 2 months ago

Change https://go.dev/cl/611916 mentions this issue: cmd/go: prevent git from fetching during local only mode

folays commented 1 month ago

Sorry to piggy back on this issue : The embedding of VCS information in the binary doesn't work when built from a git WORKTREE.

It would be somewhat convenient, a worktree seems to have a /.git file (not folder) containing : gitdir: /path/to/a/subfolder/of/the/.git/folder/

Would be awesome if it were to be handled :) Thanks !

mvdan commented 1 month ago

@folays that is https://github.com/golang/go/issues/58218.