Open 4ad opened 2 years ago
At least speaking personally, for cases like https://github.com/mvdan/sh/issues/519, my intent is to show something like devel ${GIT_SHA} when someone does a local Go build out of a git checkout. If someone is manually cloning and building, as opposed to the advertised and easier go install url@latest, I imagine they know what a git hash is. So what 1.18 is currently shipping with is enough for my needs.
It's true that something like a proper module version might be more useful; a git commit hash doesn't give any hint as to how old a version is, whereas a semver version prefix or a timestamp can give a starting point. So, in principle, I agree with you: 1.18 is a big step forward, but it's still unfortunate that the main module version remains as (devel) for local builds.
However, in practice, I still agree with Jay's comment in https://github.com/golang/go/issues/29228#issuecomment-755332642; we shouldn't make such a "locally inferred version" look like a normal version, because it's reasonably likely to be wrong or cause confusion with users.
in every case by the virtue of having access to the local source code the programmer can always do some local operation that has the potential to cause a version mislabeling.
Could you give some examples? I can only think of very unlikely scenarios, such as manually corrupting the module download cache after downloading some dependencies. That cache is read-only by default, and go mod verify exists to double-check the contents too.
With the main module in a git checkout, I can think of multiple scenarios which seem more likely:
I think that, if we are to implement something like this, the versions must be somehow different from the canonical and unique versions that get computed from fully published commits and tags. This would make it very clear that the versions are inferred from local state, and not guaranteed to be correct. As a simplistic example, imagine that tagging v1.2.3 locally results in a build whose main module version is devel v1.2.3, but when pushed and go installed, gets the version v1.2.3.
we shouldn't make such a "locally inferred version" look like a normal version, because it's reasonably likely to be wrong or cause confusion with users.
To add a more concrete example: if we made the change proposed here, and locally inferred versions looked like fully published versions, I would have a harder time trusting the output of shfmt -version when my users report bugs. I would have to update the issue template to also ask: did you build from a modified git checkout?
Could you give some examples? I can only think of very unlikely scenarios, such as manually corrupting the module download cache after downloading some dependencies. That cache is read-only by default, and go mod verify exists to double-check the contents too.
I was thinking of the case where, since Go itself doesn't expose its own concept of a version to the program, the users themselves are forced to create their own concepts of a version, either through things like VERSION files, or through some build wrappers. By definition, any such concept is under the user's control, and the user can and will make mistakes. In fact, from experience, users naively try to use git tags for this, which then fails for precisely the reasons you just explained.
Let me rephrase my point. Go can't enforce any useful properties for the user's notion of a version because it doesn't know about it, and as such if we make userVersion==moduleVersion, the fact that Go can't enforce any properties is neither better nor worse for the user. The user is on the hook for doing the right thing in both cases. In one case the user must properly maintain their VERSION, and in the other case the user must properly maintain their git checkouts.
The user does gain something in the latter case though. They don't have to create build wrappers.
With the main module in a git checkout, I can think of multiple scenarios [which might fail ... ] I think that, if we are to implement something like this, the versions must be somehow different from the canonical and unique versions that get computed from fully published commits and tags. [...] As a simplistic example, imagine that tagging v1.2.3 locally results in a build whose main module version is devel v1.2.3, but when pushed and go installed, gets the version v1.2.3.
I very much agree with this, with one caveat. If the locally checked-out version is identical to a published release, I would expect the version to match the release. If the locally checked-out version cannot be guaranteed to match any release, then yes, it should be published with something like devel v1.2.3 (which matches what Go does, but why not v1.2.3-devel or v1.2.3-unknown, which is semver-compatible?).
Unfortunately, I can't imagine how this would work without internet access, and quite often a prerequisite of automated systems running go build is to not go to the Internet.
Hold on, another thought. If we always add the commit hash, and some other metadata to the main module version for local builds, essentially always making them a fully qualified Go pseudo-version, then they will always be different from the published version, so there's no potential for confusion there.
Even better, in semver terms these builds will sort before the published version, which is probably what people want.
For this, what I said earlier about
If the locally checked-out version is identical to a published release, I would expect the version to match the release.
can no longer be true, but perhaps that is ok as long as we come up with a documented and stable convention that describes versioning for local builds (as opposed to just dumping a "devel" in the metadata field).
The main caveat here, I think, is unpublished tags. If I create a local, unpublished tag for, say, v1.1000.0, then my pseudo-versions will be v1.1000.0-0.2022…, but everyone else's pseudo-versions may be on an arbitrarily lower version (say, v0.8.3-0.2022…).
That may or may not be a significant issue, though: if we always use a pseudo-version, we'll at least have the commit hash as a common point of reference even if the base versions differ.
@4ad right, a local build can't always know what is or isn't published, as requiring a network roundtrip takes us back to square one.
Your idea of trying to stick to semver, and always using some form of pseudo-version which includes a hash, sounds good. With one caveat, though: the commit hash isn't enough to make the version unambiguous, because I can have infinite kinds of uncommitted changes that do not change the HEAD commit hash.
@bcmills good point about tags still messing with pseudo-versions, but at least if we always include a timestamp and some form of unique hash, then I think we're good. With the caveat above about uncommitted changes :)
We do have another hash available to us, though, which changes whenever any input Go code changes: the build IDs used for the build cache. I seem to recall that one such ID is embedded into binaries, too.
Not ideal, as such a hash also includes build parameters like GOOS or -tags, which don't normally affect versions. But at least it would fix the problem with uncommitted files in VCS.
Yes, uncommitted changes should be explicit in the pseudo-version, but I think we can suffix +, just as we do with Go itself, no?
The new buildinfo already records whether the workspace is clean with vcs.modified=true|false. We could use one of +local (for clean builds) or +dirty (uncommitted changes, implies local) as the semver build id, attached to a pseudoversion which should make the situation clear enough?
So main will always have a version like
vX.Y.Z-timestamp-commit+local
vX.Y.Z-timestamp-commit+dirty
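To make the suggestion concrete, here is a minimal sketch of how a program could derive such a suffix today from the vcs.modified setting that Go 1.18 stamps into binaries. The `+local` and `+dirty` names come from the comment above and are only a proposal; `suffixFor` and `setting` are hypothetical helper names, not part of any Go API.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// suffixFor maps the value of the vcs.modified build setting to the build
// metadata suggested in the thread. Both suffixes are proposals from this
// discussion; cmd/go does not emit them.
func suffixFor(vcsModified string) string {
	switch vcsModified {
	case "true":
		return "+dirty" // uncommitted changes in the working tree
	case "false":
		return "+local" // clean checkout, but still a local build
	default:
		return "" // no VCS stamp, so nothing to say
	}
}

// setting returns the value of a build setting, or "" if it is absent.
func setting(info *debug.BuildInfo, key string) string {
	for _, s := range info.Settings {
		if s.Key == key {
			return s.Value
		}
	}
	return ""
}

func main() {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no build info embedded in this binary")
		return
	}
	fmt.Printf("revision=%q suffix=%q\n",
		setting(info, "vcs.revision"), suffixFor(setting(info, "vcs.modified")))
}
```

When the binary is built without VCS stamping, both values come back empty, which is why the sketch treats a missing setting as "no suffix" rather than guessing.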
What is the main motivation of encoding the local version in pseudo-version style rather than keeping those extra info (timestamp?) as extra metadata fields - if it's not guaranteed that they are always available in the origin or proxies?
It seems like the vcs.time is already there in the build info metadata.
BTW, I feel like the main module's version isn't sufficient to describe a tool's behavior in certain cases - go version used to go build, third-party tools dependencies, and go build's behavior change (go.work left over somewhere accidentally?) can affect a tool's behavior. So when triaging issues, I hope we develop best practice using go version -m or richer build info dump rather than relying on the main module version string.
What is the main motivation of encoding the local version in pseudo-version style
The main motivation is that go install does it, and many people expect to have a notion of a program's version available and want it, and because they don't have it with go build, they rely on build wrappers or other workarounds, which are undesirable in the broader Go ecosystem.
encoding the local version in pseudo-version style rather than keeping those extra info (timestamp?) as extra metadata fields
Emphasis mine.
It's not "rather than"; it doesn't replace the existing metadata fields. If you want to read the metadata, you should read it from those fields instead of parsing the pseudo-version. However, that metadata is useful in disambiguating builds produced by go build from published releases. Presumably we could come up with some other kind of metadata for the same purpose, but since pseudo-versions are a de facto standard in the Go ecosystem, why not reuse them?
I feel like the main module's version isn't sufficient to describe a tool's behavior in certain cases - go version used to go build, third-party tools dependencies, and go build's behavior change (go.work left over somewhere accidentally?) can affect a tool's behavior.
This sounds like an argument to always use the build ID as the version suffix instead of the VCS hash.
So when triaging issues, I hope we develop best practice using go version -m or richer build info dump rather than relying on the main module version string.
I hope so too, but again, I think that discussion is out of scope for this thread, which is more about bringing go build in line with go install and providing a solution for users that avoids build wrappers.
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
As a kind of experience report, I've been using custom build scripts for years, solely to embed version information of the main module into the executable. When building, I attach the following information:
type BuildInfo struct {
	// Major, Minor, Patch contain the semantic version components.
	// In case of an unofficial build, this version is from a previous commit on the same branch.
	Major, Minor, Patch int
	// If true, a version-tagged commit was built.
	OfficialBuild bool
	// CommitHash is the full hash of the git commit.
	// Note that the actually built source-code is different if LocalChanges is true.
	CommitHash string
	// CommitTimestamp is the timestamp of the git commit.
	// Note that the actually built source-code is different if LocalChanges is true.
	CommitTimestamp time.Time
	// BuildTimestamp contains the timestamp when the project was built.
	BuildTimestamp time.Time
	// LocalChanges specifies if the project directory differed from the commit due to local (uncommitted) changes.
	LocalChanges bool
}
With this kind of information, I'm able to build version-strings however I like. Usually I try to match go's pseudo-version strings, but when I need to stay compatible with the version-scheme from other, non-go projects, I can do so as well.
// VersionString returns the project's semantic version number without a leading 'v'.
// Patterns:
//   <major>.<minor>.<patch>
//     Built from an officially tagged commit (without local changes)
//   0.0.0-<buildTime>
//     Unofficial build. There are no tagged commits yet.
//   <major>.<minor>.<patch*>-<buildTime>
//     Unofficial build. The commit was not tagged, or there were local changes.
//     The used patch-version is +1 compared to the last tagged commit.
// The build timestamp (yyyymmddhhmmss) is the UTC time when the application was built / ctgover generated the version number.
func (b *BuildInfo) VersionString() string {
	preRelease := ""
	patch := b.Patch // the receiver is b
	if !b.OfficialBuild {
		// Unofficial builds get a timestamp pre-release and a bumped patch version.
		buildTime := b.BuildTimestamp.Format("20060102150405")
		preRelease = "-" + buildTime
		patch++
	}
	return fmt.Sprintf("%d.%d.%d%s", b.Major, b.Minor, patch, preRelease)
}
The version is printed when the application is started with a -version command-line flag. But I guess go-tools could/should show the same version string.
I don't have anything against adding a build ID, but it wouldn't really solve my problem. My use cases are:
The previously raised issue that git-tags might only be local was never really an issue in my experience. Version-tags aren't made lightheartedly and are always immediately pushed to the server in the projects I work on.
I really hope we can get this kind of information into a Go executable one day, because I would finally be able to get rid of all my build and tool scripts.
It sounds like @mvdan and @bcmills have some hesitation around the fact that these pseudo-versions would not correspond to any publicly available version, even though they look like those. That does seem like a reason not to do this.
We now have Git VCS info separately in the builds (as of Go 1.18; try go version -m). Do we need to add a second way to record that information?
It sounds like @mvdan and @bcmills have some hesitation around the fact that these pseudo-versions would not correspond to any publicly available version, even though they look like those.
We can make it unambiguously distinct, for example instead of v1.2.4-0.20191109021931-daa7c04131f5 we could use v1.2.4-0.unpublished.20191109021931-daa7c04131f5 or something like that.
We now have Git VCS info separately in the builds (as of Go 1.18; try go version -m). Do we need to add a second way to record that information?
No, we certainly only need one way to encode VCS info. The suggestion to put VCS info in the metadata field of the pseudo-version was to match the go install behavior, but putting something else there, for example the build id, is probably better. The build id also works, and is meaningful when you might not have VCS info, like from a tarball, which is a pretty common case where you'd use go build for. Scratch that idea, without VCS we can't detect the version either.
We can make it unambiguously distinct, for example instead of v1.2.4-0.20191109021931-daa7c04131f5 we could use v1.2.4-0.unpublished.20191109021931-daa7c04131f5 or something like that.
With replace statements in go.mod and the new workspace mode in Go 1.18 it is possible to build Go programs that include local versions of modules besides the main module. For those modules the go tool also lists (devel) for their version.
The new build vcs metadata only helps identify the main module. Adding vcs metadata for all local modules seems valuable and not currently supported. The format suggested above by @4ad would be more informative than (devel). Maybe also including a dirty flag if there are uncommitted local edits.
The new build vcs metadata only helps identify the main module. Adding vcs metadata for all local modules seems valuable and not currently supported.
This may be true, but the concern above seems to be adding vcs metadata that looks like a pseudo-version. It need not, and it probably should not. We can always add that separately; maybe that should be a different proposal. (I think this is the first comment to bring up VCS info for replaced modules that point to other local repos.)
A thought: if we're only concerned about having a reliable way to always get some useful version for the main module, I think it could be an API of its own, like debug.MainVersion() string. It could first try to get the main module version from https://pkg.go.dev/runtime/debug#ReadBuildInfo, otherwise fall back to VCS information, and otherwise fall back to something that should always work, such as the binary's build ID.
I personally will be implementing logic like that to replace -ldflags=-X=main.version=... in my projects, where I use a default of var version = "(devel)". And I think it should be useful to other projects as well, at least as a good starting point.
Another option, if we want this to also work for library modules, would be debug.OwnVersion, which would do the equivalent but for the module containing the package that's making the function call. Perhaps that would cover @ChrisHines's needs. I maintain some libraries and I admit that reliably knowing my own version could help in terms of logging or capturing debug information.
If the above sounds interesting, I'll happily develop the idea further and create a new proposal. I realise it's not the same as this proposal, but I also think it could be a different solution to the same end-user problem :)
@mvdan I would very much like to see something like that to replace the boilerplate build lines that exist in code at work.
We use the semantic version of the binaries in order to compute API compatibility between different binaries. I am afraid that if your debug.MainVersion() doesn't always return some string that is compatible with semver, we will still have to resort to -ldflags=-X=....
Now, one might object to using the binary version in this way and perhaps recommend using a separately maintained API version instead that is separate from the binary version. I would tend to agree except this is outside my control. I do not have the operational liberty to change this.
@4ad the "VCS fallback" mentioned in my proposed API could still resemble a pseudo-version, in the sense that it could give you some semver information related to the last known compatible VCS tag. The reason I think it's less likely to cause confusion with real and published pseudo-versions is that the API docs could explicitly warn users against assuming that the version is a valid module version.
Put another way, my worry with the original proposal here is that, currently, the module versions embedded into binaries are documented and likely assumed to be valid and published. Changing that could be confusing or silently break programs, whereas a new API can avoid the "module version" terminology altogether, and isn't changing existing behavior that could break any programs.
I see, yes. That would work for us, provided there's a way to retrieve it from outside the binary (i.e. without running the binary).
provided there's a way to retrieve it from outside the binary
Do you mean via a cmd/go command that takes a path to a binary, or via a Go API that takes the path?
I'd be ok with either. I would prefer it to be in cmd/go, so I wouldn't have to write another tool, but as long as there's an API I can use, I'm happy.
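For reference, the information that is already stamped can be read from outside a binary, without running it, using the standard debug/buildinfo package (Go 1.18 and later), which is the library counterpart of go version -m. The sketch below is only an illustration of that existing API; `describe` is a hypothetical helper name, and what appears in Main.Version still depends on how the binary was built.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// describe reads the embedded build information from the Go binary at path
// and summarizes the main module's path and version.
func describe(path string) (string, error) {
	info, err := buildinfo.ReadFile(path)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s %s (built with %s)",
		info.Main.Path, info.Main.Version, info.GoVersion), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: describe <path-to-go-binary>")
		return
	}
	summary, err := describe(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(summary)
}
```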
If I check out a tagged version of source code, let's say rsc.io/quote@v1.5.2, and go build from the cloned, unmodified repo, what will the pseudo-version be like? v1.5.2, v1.5.3-..., v1.5.2-...?
If that is not v1.5.2, do we want the go command to report an error if someone tries to go install with the special pseudo-version? Or, should the go module proxy serve data as if it's like a normal pseudo-version?
It would be a "fake local" version similar to a pseudo-version; we haven't defined what the format of that would be yet. I think we're all in agreement that it shouldn't look like a real pseudo-version, meaning that the format should be distinctly different, such as by containing a special suffix. That would then allow go install or the module proxy to outright reject using those module versions, because they're not valid module versions.
@hyangah, we do have some logic today to convert versions with +build metadata to canonical pseudo-versions.¹ I suspect we would reuse that same logic, so a checkout from v1.5.2 would probably show as a v1.5.3-0.… pseudo-version.
¹https://cs.opensource.google/go/go/+/master:src/cmd/go/internal/modfetch/coderepo.go;l=482-492;drc=fa4d9b8e2bc2612960c80474fca83a4c85a974eb
Currently BuildInfo.Main.Version can be trusted since it is only set for pristine builds pulled from a repo (otherwise (devel)). Currently these pristine builds more or less require module proxy infrastructure (a significant barrier for many, especially private developers/repos).
I'd prefer to keep a single definition for the version stored in BuildInfo.Main.Version (known version pulled from a repo). I think it would be better to provide an easy way to build a pristine private module without needing module proxy infrastructure. E.g.:
go install -local mymod/cmd/foo@v1.0.0
This would keep the version definition the same and enable many developers who develop private modules locally to output pristine module builds. It does make it easier for someone to "fake" a version with a local tag, but this is already possible for sufficiently motivated developers. I'd prefer to optimise for ease of use.
@bcmills can you summarize the arguments for and against doing this?
The main argument against, at least as far as I understand, is that we might stamp the build with a tag that does not mean the same thing locally that it means in the published repo.
But I suppose that's also true of tags that could be moved in a private upstream repo (and fetched directly); as long as we also stamp the commit hash and/or module checksum for the main module, it's not necessarily a major impediment.
The argument for is, more or less, that many binary maintainers will build their binaries from within their repo (for example, to pick up local replace directives), and that giving those binaries a version that sorts in with the versions stamped when installed outside of a module makes it easier for maintainers to identify exactly when a problem may have been introduced or resolved.
Talked to @bcmills and @matloob. It sounds like this is OK as long as it does not slow down builds too much.
FWIW, I think it would be very helpful to do something here. Based on my experience and what I've seen from others, it is not rare for teams working on closed-source Go to do their official builds via 'git clone' followed by 'go build' (or similar), without ever taking the time to make their own code "go get'able" (or to at least start out that way until the Go code base grows; I think this is especially true in multi-language environments).
It would be nice to converge on how this would be formatted. I pulled together some related comment snippets from above (from a very quick re-read/skim, so apologies if I missed something, or if too snippetized):
From @4ad:
If we always add the commit hash, and some other metadata to the main module version for local builds, essentially always making them a fully qualified Go pseudo-version, then they will always be different from the published version, so there's no potential for confusion there.
and:
why not v1.2.3-devel or v1.2.3-unknown, which is semver-compatible?).
and:
we could use v1.2.4-0.unpublished.20191109021931-daa7c04131f5 or something like that.
Later from @seankhliao:
we could use one of +local (for clean builds) or +dirty (uncommitted changes, implies local) as the semver build id, attached to a pseudoversion which should make the situation clear enough?
So main will always have a version like
vX.Y.Z-timestamp-commit+local vX.Y.Z-timestamp-commit+dirty
From @chrishines:
Maybe also including a dirty flag if there are uncommitted local edits.
From @bcmills:
a checkout from v1.5.2 would probably show as a v1.5.3-0.… pseudo-version.
Does anyone object to adding this?
Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group
I don't oppose this, though I'd like to understand what version format is exactly being proposed, like @thepudds mentioned.
Giving this some more thought, I think we should stamp exactly the version that would be resolved if the repository were published upstream as-is.
The VCS stamp should already provide the commit hash and indicate whether the working tree is dirty, so if the meaning of that version changes it's more-or-less exactly as if the repo were published, the package built with GOPRIVATE set, and then the tag were moved to refer to some other commit.
Why not just add vcs.tag to buildinfo and let people use it as they see fit?
@kgersen please read the thread before commenting on it.
@bcmills I've thought about it again and I've come around to seeing that we already can't always rely on stamped module versions when trusted module proxies aren't involved, as VCS tags can change under the hood. So I am fine with just stamping what's available locally no matter what, as long as we also stamp VCS information.
I think my only slight worry then would be: programs wishing to show their information should always print the VCS info (commit, date, dirty, etc) alongside the module version, because not doing so could potentially lead to confusing edge cases for authors such as tags having been added or modified locally. We should clarify that in the documentation.
Thinking out loud, we also don't need to worry about "what if VCS information isn't stamped?", because if that's the case, then we're not stamping the locally-inferred module version either.
@bcmills Re: https://github.com/golang/go/issues/50603#issuecomment-1069531971: Given the recent change that makes the vcs version stamping optional (-buildvcs=auto) depending on the presence of the vcs cli tool, if this feature is implemented in that way, I think it's better to add an extra low-cost tag that indicates the binary was built with go build.
I was initially excited by this proposal and thought this would help #46880 by providing a convenient way to build an unstable version of gopls. Then I realized that the version string will be a v0.0.0- prefixed pseudo-version because our dev branch won't have any sensible tag. In fact, this kind of repo setup (release in a separate branch) makes our automated gopls upgrade logic slightly more complicated. A reliable way to add an extra tag or suffix that clearly indicates the binary was built in a different way will help us handle the version comparison a bit better.
No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group
Any chance we can get this in 1.19?
@amnonbc the 1.19 freeze began over a month ago, so I'd say it's very late at this point, even if someone had already begun work on it - which doesn't appear to be the case.
It would be nice if it would solve these edge cases:
git tag -sm"v1.2.3" v1.2.3
go install .
app version // outputs (devel) expecting v1.2.3
git tag -sm"v1.2.3" v1.2.3
git push && git push --tags
go install example.com/app@latest
app version // outputs v1.2.3
any chance we can get this in 1.20? Or failing that, in 1.21?
@amnonbc there is no need to ask every cycle. If there was progress, you would see it here. The only other way to make it happen is to volunteer yourself and learn the codebase well.
@amnonbc there is no need to ask every cycle. If there was progress, you would see it here. The only other way to make it happen is to volunteer yourself and learn the codebase well.
Fair enough. I hope to have some time to look at this over the Xmas holiday.
I think my only slight worry then would be: programs wishing to show their information should always print the VCS info (commit, date, dirty, etc) alongside the module version, because not doing so could potentially lead to confusing edge cases for authors such as tags having been added or modified locally.
For me, the VCS info, especially the tag, plus the commit, is the single piece that I'm looking for in this feature. Whether the tag is recorded as VCS info or as a pseudo-version wouldn't matter as long as I can retrieve it to remove the need for ldflags. That in turn simplifies the build process and, e.g., allows using a static build config with tools like gokrazy.
NOTE: The accepted proposal is https://github.com/golang/go/issues/50603#issuecomment-2181188811.
cmd/go embeds dependency version information in binaries, which is very useful. From Go 1.18 onwards, cmd/go also embeds VCS information in binaries, which makes it even more useful than it was before.
As #37475 mentions, people place version information in binaries using -ldflags='-X foo=bar', which requires an additional build wrapper. The new VCS stamping feature of cmd/go should alleviate the need for an external wrapper, but I am afraid it falls short.
The version information, in the sense of Go's pseudo-version, is not recorded for the main module when doing go build:
The version is recorded as expected when doing go install:
I am afraid this limitation of cmd/go will continue to force people to use external build wrappers that set -ldflags, which is rather unfortunate.
I am not the first to want main module version information in binaries; this has already been asked for in various issues, for example in #29814, which was closed as a duplicate of #37475, but it really wasn't a duplicate, as #37475 is about VCS information, and #29814 is about semantic versioning. Other examples of people asking for this feature are mvdan/sh#519 and https://github.com/golang/go/issues/29228#issuecomment-449554128 where various workarounds were proposed.
Speaking of workarounds, the only workaround that I know that currently works would be to create a local module proxy and pass GOPROXY to go install, but that is an extremely high-overhead workaround, and go install is not a replacement for go build anyway, since go install comes with some rather severe limitations regarding how vendoring works and what you can put in go.mod, and go install doesn't support controlling GOBIN when cross-compiling.
I realize that Git tags are a local concept, and by doing the "wrong" git operations one could come up with a different pseudo-version for the same source code. I am afraid I don't have any solution or suggestion regarding this git misfeature, except to note that even in this case the hash information is recorded correctly, and in every case by the virtue of having access to the local source code the programmer can always do some local operation that has the potential to cause a version mislabeling. Git is just more prone to do this by accident, but the ability is there, always.
I don't have any stats to back this up, but from my experience most corporate source code is built by go build, not go install, and it would be great if somehow Go's notion of versioning would be stamped by go build.
CC @bcmills @mvdan @rsc