golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: cmd/go: make major versions optional in import paths #44550

Closed pkieltyka closed 3 years ago

pkieltyka commented 3 years ago

Semantic Import Versioning (SIV) is a novel idea for supporting multiple versions of a package within the same program. To my knowledge and experience, it's the first example of a strategy for supporting multiple versions of a project / dependency in the same application. More importantly though, its clever introductory design allowed it to offer multi-versioned packages in a Go program while maintaining Go's compatibility guarantee.

Multi-versioned packages in a single program can be quite powerful. For instance, imagine a Web service where you'd like to maintain backwards-compatible support for your consumers: you can simply use an older import path from the same origin repository and version control, and quite elegantly continue to support those older API versions.

Although SIV may be an elegant solution in the scenario described above, it also adds unnecessary complexity, cost, and code noise, and hurts discoverability and ergonomics, for the majority of packages (public and private) that may never have a multi-version requirement (I'd argue that is most packages, and we can simply look to other ecosystems to see this is true). I am sure the Go team has heard a lot of feedback on the friction of SIV. https://twitter.com/peterbourgon/status/1236657048714182657?s=21 and https://peter.bourgon.org/blog/2020/09/14/siv-is-unsound.html offer some excellent points as well.

Clearly there is a case for SIV as an elegant solution for supporting multiple versions of a package in a single application, and there is also a strong case to make SIV optional.

It's clear to me there is a design trade-off at hand, and there is no single correct answer. As I consider the 80/20 rule in making an architectural decision between two trade-offs of capability and usability, I prefer to go with the 80% case so long as it doesn't forgo the 20% ability. Which design is the best to optimize for if we can still support both? In the case with Go today, it's not possible to opt out of SIV, or opt into SIV -- I believe both approaches can yield a happy solution. If we were starting from the beginning, I'd suggest making SIV opt-in, but maybe at this point it's better for it to be an opt-out design to maintain backwards compatibility with history.


I'd like to propose a path to make SIV opt-out at the level of an application developer consuming a package, while being backwards compatible with current packages and tools.

I'd like to use https://github.com/go-chi/chi as an example for this proposal. Chi adopted semver ahead of Go modules and SIV, and is built for developer simplicity and ergonomics: it is intended for pro Go developers, but is also meant to be familiar and accessible for developers who are new to Go. These have been my design goals for chi as an author and maintainer since it started back in 2017. My present goal is to release Chi v5 without a SIV requirement, and the only way I can do so is with the proposal below:


Proposal, by example:

github.com/go-chi/chi/go.mod:

module github.com/go-chi/chi/v5

go 1.16

then, git tag chi as v5.0.0 to make the release.

Application developers may consume the package via go get github.com/go-chi/chi@latest or with @v5 or @v5.0.0 and the expected import path will be "github.com/go-chi/chi", however "github.com/go-chi/chi/v5" import path would also be valid and usable.

In the above case, we're specifying the go.mod as expected with current behaviour with SIV from a library perspective. However, from the application perspective when fetching or consuming the library, I may opt-out of the "/v5" suffix in the import path and only adopt it in the scenario when I'd like to support "/v5" and "/v4" (or some other prior version), where I require the handling of multiple versions simultaneously in my program.

I believe the implementation of the above to be backwards compatible, as developers would continue to use "github.com/go-chi/chi/v5" with older versions of Go, where SIV is implied. Optionally, developers could make their own choice about multiple-version support for the package by handling the import paths themselves and importing "github.com/go-chi/chi" to use v5.x.x as specified by go.mod.

I believe changes to the Go toolchain for such support would be minimal and would be isolated to the components which build module lists.

Thank you for reading and considering my proposal.

Merovius commented 3 years ago

But even if A depends on B only transitively/indirectly, B will still appear in A's go.mod, right?

Does it? I'm genuinely not sure. I know there is an // indirect comment, but I thought it is only used to manually push the minimum required version above what MVS would imply. I don't think you need to mention something you don't require explicitly in any case - that would seem pointless.

My point with this point was to address the situation where two different major versions of B would be listed in a go.mod — at most one could claim the unversioned module identifier.

Right. I would argue in that case, neither should. If you are actually using two different major versions, you should have to be unambiguous about which one you are referring to, for clarity.

peterbourgon commented 3 years ago

I would argue in that case, neither should. If you are actually using two different major versions, you should have to be unambiguous about which one you are referring to, for clarity.

I wrote it this way initially, so fine by me.

thepudds commented 3 years ago

Hi @peterbourgon

New versions of Go routinely introduce features that make them non-backwards-compatible. For example, a project using package embed compiles with Go 1.16 but not with Go 1.15. So I don't see this as a problem.

There is a key difference, though, which is that Go 1.15 will likely emit an understandable compile error and most likely a note: module requires Go 1.16 message if a Go 1.15 toolchain tries to consume a Go 1.16 module that is using package embed.

Similarly to @pkieltyka's proposal, your proposal as written in your comment above I think falls into the camp where there are scenarios a Go 1.16 consumer could unexpectedly see downgraded major versions (e.g., a v1 instead of a v3) when attempting to consume a Go 1.17 module under your proposal as written there. My guess is that would be a non-starter for a proposal to be implemented.

It might be that your proposal is intentionally still incomplete, which would of course be fine, but if so, it would be worth calling out that the compatibility & transition plan is still a future exercise.

(And as I mentioned in https://github.com/golang/go/issues/44550#issuecomment-787451360 and in my follow-on comments on @pkieltyka's proposal, I don't currently see how to do it cleanly in a single step, though as I said a multi-step transition seems plausible).

peterbourgon commented 3 years ago

@thepudds

there are scenarios a Go 1.16 consumer could unexpectedly see downgraded major versions (e.g., a v1 instead of a v3) when attempting to consume a Go 1.17 module under your proposal as written there.

I don't think so. If a Go 1.16 consumer took a Go 1.17 go.mod, where an unversioned module identifier was associated with e.g. v3.0.0, it would be rejected, as Go 1.16 and earlier only permits the unversioned module identifier to be associated with major version 0 or 1.

More generally, these changes would be gated behind a Go version declaration of e.g. go 1.17 in the go.mod. If necessary, and as a sort of "last resort", earlier versions of Go could have changes cherry-picked so that they understood the minimum viable context necessary to parse and at least reject such modules.
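
A minimal sketch of the rule referred to here, that an unversioned module path may only be paired with major version 0 or 1, while a path ending in /vN must be paired with major version N. The helper name checkPathMajor and the regular expressions are assumptions made for illustration; this is not the cmd/go implementation, and it ignores gopkg.in paths and +incompatible versions.

```go
package main

import (
	"fmt"
	"regexp"
)

// pathMajorRE matches a trailing "/vN" path element with N >= 2.
var pathMajorRE = regexp.MustCompile(`/v([2-9]|[1-9][0-9]+)$`)

// versionMajorRE captures the major number of a version such as "v3.0.0".
var versionMajorRE = regexp.MustCompile(`^v(0|[1-9][0-9]*)\.[0-9]+\.[0-9]+`)

// checkPathMajor reports an error when version is not allowed for modulePath.
// Hypothetical helper for illustration only.
func checkPathMajor(modulePath, version string) error {
	v := versionMajorRE.FindStringSubmatch(version)
	if v == nil {
		return fmt.Errorf("malformed version %q", version)
	}
	major := v[1]

	// A path with a /vN suffix must carry exactly that major version.
	if p := pathMajorRE.FindStringSubmatch(modulePath); p != nil {
		if p[1] != major {
			return fmt.Errorf("%s: path requires v%s, got %s", modulePath, p[1], version)
		}
		return nil
	}

	// An unversioned path may only carry v0 or v1.
	if major != "0" && major != "1" {
		return fmt.Errorf("%s: version %s should be v0 or v1", modulePath, version)
	}
	return nil
}

func main() {
	reqs := [][2]string{
		{"github.com/peterbourgon/ff/v3", "v3.0.0"},
		{"github.com/peterbourgon/ff", "v1.0.0"},
		{"github.com/peterbourgon/ff", "v3.0.0"},
	}
	for _, r := range reqs {
		if err := checkPathMajor(r[0], r[1]); err != nil {
			fmt.Println("reject:", err)
			continue
		}
		fmt.Printf("accept: %s %s\n", r[0], r[1])
	}
}
```

Under this rule the third requirement is rejected rather than silently resolved to a different major version.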

thepudds commented 3 years ago

Hi @peterbourgon

I don't think so.

Apologies in advance if I have misunderstood, but as currently written in your main proposal comment above, it seems a future module foo could have in its go.mod:

module foo 

go 1.17

require github.com/peterbourgon/ff/v3 v3.0.0

and foo could import ff as:

import "github.com/peterbourgon/ff"

As far as I understood, that seems allowed under your current proposal writeup.

And as written, a Go 1.16 consumer of foo would see foo's go.mod as incomplete, and fall back to github.com/peterbourgon/ff@latest, which is a v1 module.

More generally, these changes would be gated behind a Go version declaration of e.g. go 1.17 in the go.mod. If necessary, and as a sort of "last resort", earlier versions of Go could have changes cherry-picked so that they understood the minimum viable context necessary to parse and at least reject such modules.

Teaching some set of toolchain versions prior to Go 1.N to reject any go 1.N or higher go.mod is one of the options I mentioned in https://github.com/golang/go/issues/44550#issuecomment-787451360 as a way to do a transition, and I think that is a plausible approach. (Note though that that is not the current implementation, and hence is worthy of including in a proposal if you are suggesting that implementation should change).

peterbourgon commented 3 years ago

@thepudds

It seems a future module foo could have in its go.mod

module foo 

go 1.17

require github.com/peterbourgon/ff/v3 v3.0.0

Yes, but that would mean it would have to import the module as github.com/peterbourgon/ff/v3, same as today. What I'm suggesting to allow would be

module foo

go 1.17

require github.com/peterbourgon/ff v3.0.0

And as written, a Go 1.16 consumer of foo would see foo's go.mod as incomplete, and fall back to github.com/peterbourgon/ff@latest, which is a v1 module.

Given the above, it would see foo's go.mod as invalid, as to Go 1.16, github.com/peterbourgon/ff must necessarily be associated with major version 0 or 1 exclusively.

Does that clear things up?

thepudds commented 3 years ago

Hi @peterbourgon

Yes, but that would mean it would have to import the module as github.com/peterbourgon/ff/v3, same as today.

If that is a requirement, I don't see that requirement in your main proposal comment above.

And sorry if I am misreading. I am not trying to be dense or obtuse. 😅

bcmills commented 3 years ago

@nemith

I would rather see it be explicit in go.mod what version you want the "unversioned" import to mean. If nothing defined in go.mod then it means the existing behavior. If you overwrite it you can make the unversioned import point to any version. … This is module local only and doesn't affect other modules.

That should be possible using a replace directive once replacement aliasing (#26904) is implemented. I plan to implement replacement aliasing for Go 1.17.

That would look like:


require (
    example.com/m/v2 v2.0.0
)

replace (
    example.com/m => example.com/m/v2
)

However, I would advise against that as a standard practice: it would result in the code in your module being broken whenever a downstream consumer of your module tries to build it without that replace directive, making the replace directive itself viral (and tedious to propagate).

bcmills commented 3 years ago

@peterbourgon

Given module A with require example.com/foo v1.0.0, and module B with require example.com/foo v2.0.0, then of course the unversioned identifier is "local" to those modules only. But if some module C imports both A and B, then we're at the situation I describe in point 6. Module C now depends on two different major versions of foo, which is still allowed, but must be disambiguated by giving at least one of them a SIV-style major version suffix.

If I understand correctly, that leads to a “cliff” in the development cycle that does not directly correspond to the major-version break: as soon as you need to import both v1 and v2, you have to change the import statements regardless.

But I don't think that sort of conditional coexistence is viable anyway. Commands like go list -m all do not examine the package import graph — what would happen if you had one dependency that had require github.com/peterbourgon/ff v1.0.0, and another dependency with require github.com/peterbourgon/ff v3.0.1, and then ran go list -m github.com/peterbourgon/ff?

(Or, if that situation is disallowed due to ambiguity — then what would happen if you already had one transitive dependency that uses github.com/peterbourgon/ff v1.0.0, and attempted to go get another transitive dependency that uses github.com/peterbourgon/ff v3.0.1?)

peterbourgon commented 3 years ago

@bcmills

As soon as you need to import both v1 and v2, you have to change the import statements . . .

Yep. No problem.

What would happen if you had one dependency that had require github.com/peterbourgon/ff v1.0.0, and another dependency with require github.com/peterbourgon/ff v3.0.1, and then ran go list -m github.com/peterbourgon/ff?

The unversioned module identifier github.com/peterbourgon/ff would be understood as an alias to a SIV-compliant versioned module identifier, and that mapping would always be defined by the acting go.mod. It's pure porcelain, and doesn't actually exist in the sense that it could be directly e.g. go listed. In your example, the go command would first try to resolve the unversioned module identifier to a versioned module identifier (and specific version), and, presumably finding no entry for it in the local module's go.mod, would fail.

Or, if that situation is disallowed due to ambiguity, then what would happen if you already had one transitive dependency that uses github.com/peterbourgon/ff v1.0.0, and attempted to go get another transitive dependency that uses github.com/peterbourgon/ff v3.0.1?

The association of the unversioned module identifier to a versioned module identifier would be "module local", so this doesn't present a problem. It may require more information to be retained from go.mods in the dependency tree.

bcmills commented 3 years ago

@pkieltyka, re https://github.com/golang/go/issues/44550#issuecomment-787455352:

If I understand correctly from your diagram, the meaning in a .go source file of the statement:

import "gh.com/org/a"

would vary based on whether the corresponding go.mod file does or does not explicitly mention gh.com/org/a..?

If so, that would violate some important invariants that have so far held for Go modules. Namely:

  1. For any source file within a module that is a dependency of another (“main”) module, the user can determine the meaning of every import statement within the source file using only that source file and the list of dependencies of the main module.

    • For example, go list -f '{{with .Module}}{{.Path}} {{.Version}}{{end}}' gh.com/org/a reliably reports the module containing the package imported by import "gh.com/org/a", regardless of where that import occurs.
  2. As a corollary of the above, the main module can influence the meaning of any import statement within any transitive dependency by adding a require directive (to upgrade the corresponding dependency) and/or a replace directive (to replace the source code and/or transitive dependencies of the selected version).

    • ...and does not need to change the import statements within the main module in order to do so.
  3. A replace directive in the main module applies equally to all dependencies of the main module.

I don't think that the convenience of omitting the /vN import-path suffix is worth the (IMO high) cost of losing those invariants.

In particular, I think it would be especially confusing if a directive in the main module of the form

replace gh.com/org/a => example.com/fork-of-a

affected some imports of the path gh.com/org/a but not others.

peterbourgon commented 3 years ago

For any source file within a module that is a dependency of another (“main”) module, the user can determine the meaning of every import statement within the source file using only that source file and the list of dependencies of the main module.

Yep, this would no longer be the case — bringing Go in line with ~every other modern language in this regard, I suppose.

The main module can influence the meaning of any import statement within any transitive dependency by adding a require directive (to upgrade the corresponding dependency) and/or a replace directive (to replace the source code and/or transitive dependencies of the selected version).

The proposal would add a new dimension of complexity to module identifiers and therefore import statements: unversioned module identifiers would become the porcelain to versioned module identifiers' plumbing. It would make sense to me, as a general rule, if porcelain identifiers were incompatible with existing operations like these, which expect plumbing identifiers.

Concretely, re: your replace statement, I agree that having it be transitive would be confusing. My intuition is that it should either be rejected outright, or apply only to the local module. If you want a specific major version of a dependency to be replaced transitively (i.e. the current behavior) my intuition is that you'd need to specify the plumbing identifier in go.mod, even if you refer to it locally by a porcelain name. Tooling could help with this, if necessary.

docmerlin commented 3 years ago

Someone earlier mentioned the costs of making breaking changes. The problem with breaking changes is their social cost, not their technical cost. The cost is that it takes people time to realize something went wrong and deal with it when upgrading etc.

SIV increases the social cost of making a breaking change... however, because versioning was built into go... the technical/effort cost to make a breaking change went down.

Before go.mod, breaking changes were rare; now they are common, but the actual harm for a breaking change... the social cost of a breaking change has gone up.

peterbourgon commented 3 years ago

Would a fully-realized proposal that made SIV optional be accepted?

Are there obvious problems with the approach described here?

mvdan commented 3 years ago

Beware that the maintainers might not reply for another week: https://groups.google.com/g/golang-dev/c/onqurcX6pV8

Would a fully-realized proposal that made SIV optional be accepted?

I think a separate and more concrete proposal would help greatly. This thread has grown to a size past what GitHub can handle, and the original proposal was lacking some detail. It's also carrying emoji reactions that don't reflect the latest iteration.

Are there obvious problems with the approach described here?

The only bit that stands out to me is the scope: it seems to me like we would get most of the benefit here by making the /vN suffix optional in Go imports alone, and keep them explicit in go.mod. go.mod files aren't meant to be written by humans directly, so if you wanted to make it easier for a user to add module A at version v3 without knowing what the right SIV suffix is, you could always make that easier via go get. There are also tools that consume and understand go.mod files, so keeping them explicit would reduce the amount of breaking changes to be rolled out.

Besides that, I generally think there's merit to this idea of allowing SIV-less Go imports as long as there's only one version of that module in the current go.mod.
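
A rough sketch of how that rule could resolve imports, assuming go.mod keeps explicit /vN module paths: an import that already names a required module keeps today's meaning, and a suffix-less import is rewritten only when exactly one required major version could provide it. The names resolveImport, basePath and hasPathPrefix are hypothetical, and none of this reflects actual cmd/go behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// suffixRE matches a trailing "/vN" path element with N >= 2.
var suffixRE = regexp.MustCompile(`/v([2-9]|[1-9][0-9]+)$`)

// basePath strips a trailing /vN element from a module path.
func basePath(modulePath string) string {
	return suffixRE.ReplaceAllString(modulePath, "")
}

// hasPathPrefix reports whether p is prefix itself or a path below it.
func hasPathPrefix(p, prefix string) bool {
	return p == prefix || strings.HasPrefix(p, prefix+"/")
}

// resolveImport maps an import path to its fully versioned form using the
// module paths required in the current go.mod (hypothetical helper).
func resolveImport(importPath string, required []string) (string, error) {
	// Imports that already name a required module keep today's meaning.
	for _, mod := range required {
		if hasPathPrefix(importPath, mod) {
			return importPath, nil
		}
	}
	// Otherwise look for required /vN modules whose base path matches.
	var candidates []string
	for _, mod := range required {
		base := basePath(mod)
		if base != mod && hasPathPrefix(importPath, base) {
			candidates = append(candidates, mod+strings.TrimPrefix(importPath, base))
		}
	}
	switch len(candidates) {
	case 0:
		return "", fmt.Errorf("no required module provides %q", importPath)
	case 1:
		return candidates[0], nil
	default:
		return "", fmt.Errorf("ambiguous import %q: could be any of %v", importPath, candidates)
	}
}

func main() {
	single := []string{"github.com/go-chi/chi/v5"}
	fmt.Println(resolveImport("github.com/go-chi/chi/middleware", single))

	both := []string{"example.com/m/v2", "example.com/m/v3"}
	fmt.Println(resolveImport("example.com/m", both))
}
```

With two major versions required, the suffix-less import is rejected, which matches the suggestion earlier in the thread that neither version should claim the unversioned name in that case.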

I also think it has drawbacks as mentioned in https://github.com/golang/go/issues/44550#issuecomment-788081993, so the decision might ultimately come down to evaluating the tradeoff. I'd lean in favor of the tradeoff being worth it, but I don't think anyone can say that the proposal would be accepted.

peterbourgon commented 3 years ago

Thanks for the reply!

it seems to me like we would get most of the benefit here by making the /vN suffix optional in Go imports alone, and keep them explicit in go.mod

I don't think it's necessary, as the version suffix would be a deterministic function of the version. Am I forgetting something?
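
To illustrate what "a deterministic function of the version" means here, a small sketch: the import-path suffix is empty for v0 and v1 and "/vN" for major version N >= 2. The function name majorSuffix is an assumption for illustration, and gopkg.in-style ".vN" suffixes and +incompatible versions are ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorSuffix derives the SIV path suffix from a semantic version string.
// Hypothetical helper for illustration only.
func majorSuffix(version string) (string, error) {
	if !strings.HasPrefix(version, "v") {
		return "", fmt.Errorf("version %q must start with \"v\"", version)
	}
	// Take everything between the leading "v" and the first dot.
	majorStr, _, _ := strings.Cut(version[1:], ".")
	major, err := strconv.ParseUint(majorStr, 10, 32)
	if err != nil {
		return "", fmt.Errorf("version %q has no numeric major component", version)
	}
	if major < 2 {
		return "", nil // v0 and v1 use the unversioned path
	}
	return "/v" + strconv.FormatUint(major, 10), nil
}

func main() {
	for _, v := range []string{"v0.9.0", "v1.4.0", "v3.0.1"} {
		suffix, err := majorSuffix(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> github.com/peterbourgon/ff%s\n", v, suffix)
	}
}
```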

I also think it has drawbacks as mentioned in . . .

I don't believe these are drawbacks, they are just properties of the proposal.

mvdan commented 3 years ago

I don't think it's necessary, as the version suffix would be a deterministic function of the version. Am I forgetting something?

You're right that keeping go.mod in its current explicit form isn't necessary, but I don't think it's worth changing its semantics. I don't see a clear benefit, given that humans should generally not be editing or writing those files directly. And I see clear drawbacks: it would break more tools and programs (those that look at go.mod files in isolation, such as pkgsite), and it would likely increase the amount of edge cases, as now we'd have to also consider the effect of optional SIV on replace directives, exclude directives, module lines, retractions, etc.

I don't believe these are drawbacks, they are just properties of the proposal.

We could argue about that, but at least we can all agree that making SIV optional will add some ambiguity and edge cases :) It's not just a clean net benefit, as otherwise it would likely have existed from the start.

peterbourgon commented 3 years ago

It's not just a clean net benefit, as otherwise it would likely have existed from the start.

Well, this is actually my question — is the concept of optional SIV acceptable to the maintainers?

mvdan commented 3 years ago

I think you'll only get that answer with a reasonably detailed proposal. This thread contains multiple descriptions of what "optional SIV" means. I get not wanting to invest more time into a proposal if it might not be accepted, but that applies to nearly all proposals - nearly half of mine have been rejections, but that still doesn't make them time wasted :)

peterbourgon commented 3 years ago

"Maybe" is a fine answer. "Categorically no" is also possible, though, and I'd like to know that much at least.

itsjamie commented 3 years ago

@mvdan I took your advice and opened #47034 to provide a formal concrete definition for discussion.

theckman commented 3 years ago

That other proposal by @itsjamie was closed: https://github.com/golang/go/issues/47034#issuecomment-874240760

I'm retracting this because an issue that I believed was true within a definition of optional SIV inside the original ticket is untrue.

If a user has defined the naked import in their go.mod, the rules as laid out here (#44550 (comment)) would resolve in a non-breaking way.

Therefore my thought that an optional mode is a necessity is unfounded.

rsc commented 3 years ago

Thanks for the respectful discussion everyone. I wasn't able to read this thread in February as it was happening—this is fundamental to large-scale open source: you can't keep up with everyone all the time!—but essentially all the points I would have raised were raised and discussed. This reply aims to restate and highlight what I think the most important parts are.

At its essence, the proposal here is to provide some way to allow a Go source file to say import "m" and have it interpreted by the toolchain as import "m/v2".

There was a lot of good discussion (and some disagreement) about exactly what would and would not constitute a breaking change to enable that, and how best to manage a transition to a world where this feature exists. I'm going to leave all that to the side and focus on whether we want to be in that world or not.

Before I get to the details of the proposed change, some background for context. For even more detail, see my post “The Principles of Versioning in Go” (also available in video form).

Dependencies and Go

One of the core motivations for starting Go back in 2007 was to handle dependencies well. What shipped in 2009 was a compiler and linker, a few libraries, and some makefiles. We had not yet made it to the software dependency part of the vision. But we did in the next couple years, and we've spent the last decade continuing to improve and refine it.

It's not possible to separate the Go language from its approach to software dependencies (as suggested here). The fact is that Go is an entire programming environment, not just a language, and handling dependencies well is one of the core reasons we created Go.

Beyond the original motivation, agreement on core concepts is fundamental to the overall health and power of the Go ecosystem. Some of those concepts are in the language, like only having UTF-8 source files. Some are in the libraries, like io.Reader and http.Handler. Some are in the Go command, like having one package per directory, the interpretation of //go:build lines, and so on. The general approach to interpreting dependency management, including resolving imports, deciding versions, and so on, is also on that list. The enforced agreement lets the ecosystem treat those things as a settled foundation and build new structures on top. The ability to have alternate foundations would make all the things built on top much less stable. In the worst case it bifurcates the worlds, like when we had dep and glide and the others and they couldn't consume each other’s dependency information.

We will continue to treat managing dependencies as a fundamental, inseparable part of the Go environment.

Precise import paths

Goinstall introduced the current URL-like import paths. At first, some people didn't like them, since they were longer than what we'd been writing until that point. We could have instead used some kind of indirection, writing uuid = github.com/google/uuid in go.mod (or somewhere else) and then writing import "uuid" in source files. Over time, I believe most of us have come to value the clarity of having the full path visible at import, and tools like goimports and gopls have relieved us from having to type that detail ourselves. The full import paths make individual source files self-contained as far as their dependencies. As a result, it is easy to know exactly what an import means, as well as easy to copy and paste code from one package to another, or from examples.

Other languages have taken different paths. The indirection I just described is almost exactly what Rust does, for example. That's fine: there is no reason all the languages in the world have to converge, and if we wanted Go to be exactly like other languages we could have not bothered to create it in the first place and kept using other languages.

It may be true that every problem in computer science can be fixed by adding a layer of indirection, but then it is also true that every layer of indirection creates a new problem. Go import paths are more useful for readers precisely because there isn't any indirection: they are what they say they are.

We have some direct experience with ambiguous import paths. In 2015 we introduced support for vendor directories, which helped us learn many important lessons, one of which was the importance of unambiguous import paths. In that vendoring setup, import "p" inside the package a/b/c/d meant a/b/c/d/vendor/p if that existed, or else a/b/c/vendor/p, a/b/vendor/p, or a/vendor/p if any of those existed, or else plain p. The result was that when you typed an import path like p, it had no clear meaning. Import paths appear in many places besides import declarations in Go source files; for example, they appear on the go command line, in test output, and the like. A source file you were debugging might say import "p", but then go doc p or go test p would fail: you really needed to say go test a/b/vendor/p. Or maybe it wouldn't fail, which is worse. Maybe you'd get the real p and be confused, not realizing the import referred to a/b/vendor/p.
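
The lookup order described above can be sketched as a function that lists every path import "p" might have meant, from the innermost vendor directory outward. The name vendorCandidates is hypothetical, and the sketch uses slash-separated paths relative to a source root rather than the real GOPATH layout; which candidate wins depends on which directories exist on disk.

```go
package main

import (
	"fmt"
	"path"
)

// vendorCandidates returns, in lookup order, the package paths that an
// import of importPath could resolve to when it appears in package dir.
// Hypothetical helper for illustration only.
func vendorCandidates(dir, importPath string) []string {
	var out []string
	// Walk from the importing directory up toward the root.
	for d := path.Clean(dir); d != "." && d != "/"; d = path.Dir(d) {
		out = append(out, path.Join(d, "vendor", importPath))
	}
	// The plain import path is the final fallback.
	return append(out, importPath)
}

func main() {
	for _, c := range vendorCandidates("a/b/c/d", "p") {
		fmt.Println(c)
	}
}
```

The same string "p" yields a different candidate list for every importing directory, which is why tooling had to carry (import path, directory) pairs to identify a package.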

Confirming that “you don't know what you got 'til it's gone,” the vendor experiment helped us relearn the importance of precise, unambiguous import paths. If an import path changes meaning depending on which directory it appears in, then you have to change all your tooling to always carry around (import path, directory) pairs to get back to a precise identifier for a package. And you have to change your programmers too: everyone has to learn that it matters what directory they are in when they utter an import path. Command lines and scripts and mental models have to change. Vendoring was just one broken thing after another.

One of the advances of modules was to return to precise import paths by eliminating this directory-sensitive import path resolution. Today, vendoring is limited to a single, top-level vendor directory, so as long as you know what the main (top-level) module is, that determines the meaning of any import path involved in any source code built for that module. It's true that if you go to a different top-level module then the meaning of import paths changes, but that is far easier to explain and also completely unavoidable if you want to be able to work on multiple projects and have the same builds as your collaborators. Within a given main module, if you see import "p" anywhere, even deep in a dependency, you can be sure that go test p tests that package, go doc p shows docs for that package, and so on. Everything is simpler because there is no ambiguity about what p means.

Multiple major versions

Another advance of modules was to allow incompatible versions of a package to be included in a single build. As a technical detail, this removes diamond dependencies and makes the version selection problem no longer NP-complete. As a practical detail, this removes significant conflicts when building large-scale software. The example in “Semantic Import Versioning” is not purely hypothetical: it is motivated by real problems we encountered in real open-source software.

Peter Bourgon argued that semantic import versioning solves a non-problem:

I have personally had what I consider to be substantial exposure to an enormous amount of Go code, due to my position in the OSS ecosystem, as well as my consulting work. That exposure includes, importantly, a huge amount of code maintained in private repositories. I have no way of knowing, but I suspect the only person in the Go community who may have seen more Go code than I have is Bill Kennedy. With that context, I can state without hesitation that the need to include two major versions of the same dependency in one compilation unit is extraordinarily rare, and in those rare circumstances that it does arise, it is almost always due to pathological conditions in the dependency graph.

This claim about needing multiple major versions being “extraordinarily rare” is easily contradicted. Kubernetes depends on multiple major versions of github.com/russross/blackfriday and of gopkg.in/yaml. Even Peter's own Go-kit depends on both go.etcd.io/etcd/client/v2 and go.etcd.io/etcd/client/v3. There are two different major versions of Etcd, and Go-kit provides adapters for both. Any given program is probably only going to need one, but the module needs to be able to provide both. Even a single binary might want to include both, selecting one or the other at run time. None of this seems “pathological.” In fact, it aligns with the example from “Semantic Import Versioning.”

If the need to include two major versions of the same dependency seems rare, the only explanation I see is that Go modules makes this case work so well that people don't notice when it happens.

Major versions in import paths

If we want both precise import paths and multiple, incompatible major versions of a package in a build, then the logical implication is that the major version must be captured in the import path. Hence semantic import versioning.

There was some discussion here about a philosophy of breaking changes and whether we want to encourage them or not. I personally agree with essentially everything Rich Hickey said in his Spec-ulation talk and that Linus Torvalds said in his email rant. I also like ulikunitz's phrasing that “Making it costly to break compatibility is a feature not a bug.” But that's not directly relevant here.

If you want both precise import paths and the ability to use multiple major versions of a package in a build, then including major versions in import paths is implied as a requirement, no matter what you believe about the wisdom of inflicting breaking changes on users, or about whether or not breaking changes are always costly. So we can set that philosophical discussion aside.

Optional version elements in import paths

Phew.

On to the idea of providing some way to allow a Go source file to say import "m" and have it interpreted by the toolchain as import "m/v2". Clearly, in terms of the discussion above, this would make import paths imprecise once again when more than one major version is present in the dependency graph, leading to all the problems we had with fine-grained vendoring in 2015. There would need to be some significant benefit to repay that cost.
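To make the ambiguity concrete, here is a minimal sketch of what a versionless lookup would have to do. The function `resolveVersionless` and the `example.com/...` module paths are hypothetical and exist only for illustration; this is not how the go command works. It treats an import path as a module root and lists every module in the build that the versionless path could mean.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isMajorSuffix reports whether s is a non-empty run of digits,
// as in the "2" of a "/v2" path element.
func isMajorSuffix(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// resolveVersionless is a hypothetical resolver (an assumption for this
// sketch). Given a versionless import path and the module paths in the
// build, it returns every module the path could refer to.
func resolveVersionless(importPath string, buildList []string) []string {
	var candidates []string
	for _, mod := range buildList {
		if mod == importPath { // the v0/v1 module
			candidates = append(candidates, mod)
			continue
		}
		prefix := importPath + "/v"
		if strings.HasPrefix(mod, prefix) && isMajorSuffix(strings.TrimPrefix(mod, prefix)) {
			candidates = append(candidates, mod)
		}
	}
	sort.Strings(candidates)
	return candidates
}

func main() {
	build := []string{"example.com/m/v2", "example.com/m/v3", "example.com/other"}
	c := resolveVersionless("example.com/m", build)
	fmt.Println(c)
	if len(c) > 1 {
		fmt.Println("ambiguous: the import path alone no longer names one package")
	}
}
```

With a single major version in the build the lookup has one answer; with two, the source file alone cannot say which one is meant.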

One possible argument is that the cost of the ambiguity is very low, because multiple major versions in a dependency graph essentially never arise in practice. The rareness argument would cut both ways, since it could easily lead to tools that work most of the time and then break in rare, unanticipated cases. But as we saw above, using multiple major versions in a given module is not rare. In fact, it happens more than even experienced Go users like Peter notice.

Another possible argument, made twice above, is that other languages don't put versions in import paths, so Go shouldn't either. This argument doesn't make sense, since Go has never aimed to be the same as other languages. It would also justify making a wide variety of unpalatable changes to Go that we clearly wouldn't. In particular, it would equally justify removing the fully-qualified import paths from source files entirely, and there's just no way we would reverse that decision: we appreciate too well the benefits they bring. Including the major version in the import path is vital to preserving the same benefits.

If there were some way to preserve all the benefits of the current system and make it easier to update a dependency from one major version to another, we would absolutely want to pursue that. If someone has a flash of insight for how, that'd be great. But making the major version implicit would abandon important benefits, so that seems untenable.

Tooling for upgrades

On the other hand, better tooling for updating import statements would absolutely preserve the benefits of the current system, in much the same way that goimports and gopls have for writing import statements in the first place. That still seems like the right approach to “make it easier to consume major version updates,” as Peter nicely stated the goal.

I wish the better tooling already existed. We have been working instead the past couple years on more fundamental tooling (and there was a pandemic), but I am hopeful that Go 1.18 could include a go fix along the lines I sketched back in 2018. If not much has changed from, say, v2 to v3, and the module author takes the time to write a final v2 that wraps v3, then go fix could update users of v2 in a safe way.

It would also be easy to write a tool (or even a shell script) that updates all the imports in a module from v2 to v3, runs all the tests, and commits the result if the tests pass. The fact that this is so easily written cuts against much more costly solutions like removing the major version from import paths and giving up precise import paths.
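As a rough sketch of the rewriting step such a tool would need, the following uses the standard `go/parser` and `go/format` packages to change import paths in a single source file. The function name `rewriteImports` and the `example.com/m` paths are made up for illustration; walking the module's files, running `go test ./...`, and committing are left out.

```go
package main

import (
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// rewriteImports parses one Go source file and rewrites every import of
// oldMod, or of a package below it, to the same package under newMod.
// It is an illustrative helper, not part of any existing tool.
func rewriteImports(src, oldMod, newMod string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, imp := range file.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return "", err
		}
		if path == oldMod || strings.HasPrefix(path, oldMod+"/") {
			// Record the old end position before the literal changes length.
			imp.EndPos = imp.End()
			imp.Path.Value = strconv.Quote(newMod + strings.TrimPrefix(path, oldMod))
		}
	}
	var out strings.Builder
	if err := format.Node(&out, fset, file); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	src := `package demo

import (
	"fmt"

	"example.com/m/v2/sub"
)

func Show() { fmt.Println(sub.Name) }
`
	out, err := rewriteImports(src, "example.com/m/v2", "example.com/m/v3")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Note that this only changes the import strings. It says nothing about whether the v3 API behaves the same way as v2.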

It's worth pointing out that the upgrade shell script is much less safe than the go fix, since it just blindly assumes that v3 and v2 are similar enough to just see if it works to drop v3 in for v2. If the breaking change in v3 is removal of some hardly-ever-used function, then that will be fine. But if the breaking change is a subtle semantic one that the tests weren't looking for, then the blind upgrade may go badly wrong once it hits production. In the case where v3 and v2 are essentially compatible, the module author publishing a final v2 whose implementation wraps v3 makes that very clear to the tooling: using that last version of v2 is exactly the same as using v3, so go fix can be confident it's not going to change the meaning of the program by inlining those wrappers. Equally important, when the breaking change is a subtle semantic one, the module authors wouldn't publish a v2 implementation wrapping v3, at least not for that part of the API. So go fix wouldn't touch those calls that would break. The upgrade shell script is more of a yolo approach: it doesn't have the same signals from the module author about what should and shouldn't be upgraded, so we can't have the same confidence in its changes.
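A minimal sketch of what "a final v2 whose implementation wraps v3" could look like follows. Both versions are simulated in one file so the example runs on its own; `GreetV2` and `GreetV3` are invented stand-ins, and in a real module they would be functions with the same name in the v2 and v3 packages.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// GreetV3 stands in for the v3 API. In this made-up example the
// breaking change from v2 is the added context parameter.
func GreetV3(ctx context.Context, name string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	return "hello, " + strings.TrimSpace(name), nil
}

// GreetV2 stands in for the function as published in the final v2
// release. It keeps the old signature, and its body is a thin wrapper
// over v3, so a tool could inline the call without changing behavior.
func GreetV2(name string) (string, error) {
	return GreetV3(context.Background(), name)
}

func main() {
	viaV2, _ := GreetV2("gopher")
	viaV3, _ := GreetV3(context.Background(), "gopher")
	fmt.Println(viaV2 == viaV3)
}
```

The wrapper is the author's signal: where it exists, the v2 call and its v3 replacement are equivalent by construction. Where the author did not publish one, a tool has no such assurance.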

A second reason against optional version elements in import paths, beyond the loss of precise import paths, is that it creates a well-lit path toward exactly this kind of unprincipled yolo upgrade. I strongly believe the more principled go fix path will serve us better. But even that belief is secondary to not losing precise import paths.

I appreciate and sympathize with thrawn01's comment about the pain of updating examples and other documentation for introducing mailgun-go v4. If the vast majority of the API is not changing, so that the only update is the import path change, then as others have said, it typically works out better to deprecate some part instead of introducing a whole new semantic version. The argument also works the other way, though: when the API is completely different, the different import path helps avoid invalidating old examples that people might have saved, not to mention their old working code. In fact, I’d expect any documentation to make clear somewhere what version of the API it is written for, and that’s fundamentally going to need updating however it is expressed.

Company-internal usage and making breaking changes

I also appreciate Peter's observation that Go usage inside a company can be quite different from public, open-source Go usage. Even so, there is far more open-source Go code than Go code in any one company, and the Go ecosystem is larger than any one company. When there is an unavoidable conflict, we are always going to prioritize the health and functioning of the overall Go ecosystem.

I don't believe there is an unavoidable conflict in this case though. Inside a company, where all the users of a given module can be found, there are a few different approaches that can be used. I want to look at these in the context of bytheway's comment:

We've faced similar challenges and decisions with internal libraries at my company. What could be a quick communication about an incompatibility and a one-line update to the go.mod to get the latest package can quickly turn into a large change set, touching many, many files.

It depends on the details of the incompatibility, but usually the best approach is:

See my article “Codebase Refactoring (with help from Go)” for a detailed presentation of these steps. If you're working in open source, you usually can't do that last step, because you can't find all the uses of the old function, so you leave it and mark it deprecated. But the general approach works well, and we do it all the time (just git grep Deprecated: in the Go repo). Inside a company, you can often tell when there are no uses of the old function left and then delete it, which is fantastic.
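The "leave it and mark it deprecated" outcome mentioned above can be sketched as follows. The functions `Sum` and `SumAll` are invented for illustration; the only real convention shown is the `Deprecated:` paragraph in the doc comment, which is what `git grep Deprecated:` finds.

```go
package main

import "fmt"

// SumAll is the new function, introduced under a new name so that
// existing callers of Sum keep compiling and keep their old behavior.
func SumAll(nums ...int) int {
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

// Sum adds two integers.
//
// Deprecated: Use SumAll instead.
func Sum(a, b int) int {
	// The old function is defined in terms of the new one, so callers
	// can move over one at a time.
	return SumAll(a, b)
}

func main() {
	fmt.Println(Sum(2, 3))       // old call site, still works
	fmt.Println(SumAll(1, 2, 3)) // migrated call site
}
```

Because the old name still exists, each call site can be updated in its own small change rather than in one large change set.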

The benefits of this approach arise from the fact that it captures semantic changes at a very fine-grained level:

Contrast that approach with redefining the name in place, bumping the major version, and updating go.mods in dependent modules:

It's true that semantic import versioning adds:

This is hardly the worst problem with this approach.

What about at small scales, where you know there's only one or two users of the function and you can talk to them ahead of time? In that case I'd argue that doing that and then just making the change is not a breaking change, provided you are absolutely sure that's everyone who is affected. If you're not sure, then renaming the function helps find the ones you missed. In contrast, bumping the major version doesn't really help identify any new affected parties. It's just a way to disclaim responsibility for the breakage.

My point here is not to say how you should or should not make changes at your company. You should do what you believe works for you. But if the scenario is to be used to justify a change that affects the entire Go ecosystem, like giving up semantically precise import paths, then it is absolutely fair to compare it to alternatives and explore how eliminating the import path rewrites would change the scenario. The answer: there is a better alternative available today the vast majority of the time, and eliminating the import path rewrite doesn't substantially improve the situation. It removes a rote mechanical step and leaves the dangerous flaws.

And of course, if you are aware of the risks of dropping the major versions but still decide that for your company repos you want to standardize on versionless imports with no possibility of multiple major versions for your own repos, then that's easy to do by pairing each require with a replace statement in go.mod, as Bryan Mills explained. That solution doesn't require us to change the standard tooling either.

A Path Forward?

Peter asked, “is the concept of optional SIV acceptable to the maintainers?” As currently envisioned, where it would mean we lose precise import paths, no, I don't think that would be acceptable: it means giving up too many of the benefits of the current system. I would rather find a less costly solution to the underlying problem that this proposal means to solve. Tooling looks to be that answer. I encourage people who are interested in this to explore writing such tools and learn what works and what does not.

I would refer anyone who wants even more detail to “The Principles of Versioning in Go” (also available in video form), especially the objections sections at the end.

thepudds commented 3 years ago

Hi Russ, thanks for the detailed comments.

For me, better tooling does help, and it will make a material difference when that better tooling lives in cmd/go, and automatic API migration helps with the cost/benefit calculus.

That said, Go code lives in a larger ecosystem of language agnostic tooling, and I would be curious to hear any additional thoughts on @rogpeppe’s comments in https://github.com/golang/go/issues/44550#issuecomment-785227107 about the practical problems arising from large noisy diffs that can potentially cause actual bugs to be overlooked.

(Side note: attempts to go “meta” in open-source conversations often go wrong. I know there is a fair amount of passion on this topic, but I would suggest to everyone that we collectively try to have a thoughtful conversation here, and perhaps from that thoughtful conversation a new solution will emerge…. Finally, on mobile, so apologies for any typos).

theckman commented 3 years ago

@rsc reading your post has me feeling pretty deflated, because there are quite a few salient points we've made that are misrepresented or seem like they may have been completely disregarded. Likewise, I think there may be personal preferences you are communicating as the desired end goal or as a truth, that I'm not sure everyone agrees with. I wonder if we need to identify those foundational misalignments, before we can make forward progress on this topic. To accomplish that, it may need to be a simple bulleted list of statements that we either agree or disagree with.

That said, in the interest of trying to move this issue forward toward a desirable outcome, here's one example:

If we want both precise import paths and multiple, incompatible major versions of a package in a build, then the logical implication is that the major version must be captured in the import path. Hence semantic import versioning.

It's not immediately clear to me who "we" is in this specific sentence, and so my immediate reaction is that it does not include me (and quite a few folks who are commenting in this thread). Based on the context you shared above about precise import paths, I didn't see any compelling indication that the tradeoff we're making is worth the minimal gain. The example of the nested vendor directories was mitigated by other means, which makes it not compelling to me.

If an import path changes meaning depending on which directory it appears in, then you have to change all your tooling to always carry around (import path, directory) pairs to get back to a precise identifier for a package. And you have to change your programmers too: everyone has to learn that it matters what directory they are in when they utter an import path. Command lines and scripts and mental models have to change.

We have to do this today, right?

A Go programmer has to know that they are using v2 of https://github.com/some/project/package, and that all of their references to it need to be named differently on the command line and in any scripts they run. They need to know to upgrade not only their Go code when upgrading a package, but also any scripts that make reference to it. Will go fix handle those too? In my opinion this is a costlier thing to learn because it is unlike the mental model people are going to originally approach the problem with, and is going to result in many disappointing surprises.

To me these problems you state sound like desirable end goals. As a human it's rare that I care about which version I am working with, I only care about the name knowing that version is declared by my project. It would be much more ideal if the tooling carried that weight for me, by maintaining that mapping of name/version based on which folder I'm in.

You as a human also have to do it anyway once you use a replace statement, and so I think there are a few things that may devalue the risk as you've highlighted it.

Go import paths are more useful for readers precisely because there isn't any indirection: they are what they say they are.

Being very candid, if this statement were true this GitHub issue would not exist, because a big portion of it would be solved if v0 and v1 were required.

If there were some way to preserve all the benefits of the current system and make it easier to update a dependency from one major version to another, we would absolutely want to pursue that. If someone has a flash of insight for how, that'd be great. But making the major version implicit would abandon important benefits, so that seems untenable.

While I would be a little frustrated with having to include the version on all of my imports, it would be much better for the Go ecosystem's longterm success if we required v0 and v1 on the import path for all imports compared to making no change. Maybe we should just lean into paying the cost of doing that.

This claim about needing multiple major versions being “extremely rare” is easily contradicted. Kubernetes depends on multiple major versions of github.com/russross/blackfriday and of gopkg.in/yaml.

I'm not sure that finding some projects that require multiple versions "easily contradicts" the statement that the need to use multiple versions is extremely rare. Yes, we can find examples of projects that are in that state, but that neither quantifies nor qualifies the statement. I'd also argue that Kubernetes is an exceptional project, and so it may not be the best avenue to make your point.

It would also be easy to write a tool (or even a shell script) that updates all the imports in a module from v2 to v3, runs all the tests, and commits the result if the tests pass. The fact that this is so easily written cuts against much more costly solutions like removing the major version from import paths and giving up precise import paths.

This completely disregards the feedback by many above, that the cost is more than just in changing the source code. There's a cost associated with the noise it creates in a diff, where the humans need to take the time to review the changes and give the :+1:. There can be a cost associated with merging that code in, so if the diff is larger and it's pending for longer you may have merge conflicts to resolve and then re-review.

Another possible argument, made twice above, is that other languages don't put versions in import paths, so Go shouldn't either.

The point I've been communicating relative to this is not that other languages do it, but that the way we do it violates the mental model of many folks, which results in quite a bit of cognitive dissonance resulting in complex socio-technical problems. An example provided was that other languages do it differently, but it wouldn't be prudent to interpret that as the argument. I hope this clarifies that.

And of course, if you are aware of the risks of dropping the major versions but still decide that for your company repos you want to standardize on versionless imports with no possibility of multiple major versions for your own repos, then that's easy to do by pairing each require with a replace statement in go.mod, as Bryan Mills explained. That solution doesn't require us to change the standard tooling either.

Are you open to replace statements propagating downward to consumers of that module? Otherwise this is not something easy to support. If we're open to making those changes (separate from this proposal), then I think that makes for a more compelling argument.

I would rather find a less costly solution to the underlying problem that this proposal means to solve. Tooling looks to be that answer. I encourage people who are interested in this to explore writing such tools and learn what works and what does not.

The core nature of this GH issue is that we've tried to explore this problem with tooling, but came to quickly realize it wasn't a technical problem to be solved but more of a social one. If it were purely technical, I think it would be easy to bolt something on to help address it.

Unfortunately, there are many socio-technical problems that cannot be solved at face value by implementing a piece of software; instead you need to solve for the contributing factors that led to the problem existing in the first place. That's why this issue was raised to request that we explore making changes to the Go toolchain itself, so that we can address those contributing factors and get a better outcome.

We will continue to treat managing dependencies as a fundamental, inseparable part of the Go environment.

I think we can absolutely agree on this. Any changes or tools that are made to support this need to exist within the toolchain, and cannot be some third-party thing we expect folks to have to learn about, download, and use. It needs to come out of the box and be part of their initial experience.

thepudds commented 3 years ago

While I would be a little frustrated with having to include the version on all of my imports, it would be much better for the Go ecosystem's longterm success if we required v0 and v1 on the import path for all imports compared to making no change. Maybe we should just lean into paying the cost of doing that.

If major versions became required for v0 and v1, I think that would be a net improvement, and it seems to me it would be possible to transition to that, although it might take multiple steps, e.g., as outlined in https://github.com/golang/go/issues/44550#issuecomment-787451360

If that happens, it would be nice if 'go get foo@latest' and 'require foo latest' could then take on the meaning of asking for the latest major version. With gopls and goimports defaulting to the latest major version, that together would help new and experienced gophers alike avoid accidentally getting the wrong major version, or having to leave their command line world to find out what the latest major version available is. Or, perhaps there could be some alternate spelling of that request.

beoran commented 3 years ago

I really appreciate the importance of exact import paths. But this shows that there is a problem with v0 and v1 import paths which currently do not need to be precise, and have a complex set of rules to resolve them.

While I understand that this was done for backwards compatibility, it creates confusion and makes the users of modules desire to have non-precise import paths for v2+ modules like we see in this feature request.

What I would do if I could start a programming language from scratch would be to require exact import paths with the semantic version even for v0. But with Go as it stands now, this would break too much existing software. We could allow a v0 and v1 suffix on import paths for consistency, but we cannot mandate it.

So I guess better tooling that can upgrade between semantic versions semi-automatically is the best we can do. Instead of a "script", the version update tool could work like Mage does and be based on a Go source file that can be arbitrarily complex. Perhaps we should first try to write such a tool and see if that solves the issue of upgrading from v1 to v2 and beyond.

On Fri, 23 Jul 2021 at 02:32, Russ Cox @.***> wrote:

Thanks for the respectful discussion everyone. I wasn't able to read this thread in February as it was happening—this is fundamental to large-scale open source: you can't keep up with everyone all the time!—but essentially all the points I would have raised were raised and discussed. This reply aims to restate and highlight what I think the most important parts are.

At its essence, the proposal here is to provide some way to allow a Go source file to say import "m" and have it interpreted by the toolchain as import "m/v2".

There was a lot of good discussion (and some disagreement) about exactly what would and would not constitute a breaking change to enable that, and how best to manage a transition to a world where this feature exists. I'm going to leave all that to the side and focus on whether we want to be in that world or not.

Before I get to the details of the proposed change, some background for context. For even more detail, see my post “The Principles of Versioning in Go https://research.swtch.com/vgo-principles” (also available in video form https://www.youtube.com/watch?v=F8nrpe0XWRg).

Dependencies and Go

One of the core motivations for starting Go back in 2007 was to handle dependencies well. What shipped in 2009 was a compiler and linker, a few libraries, and some makefiles. We had not yet made it to the software dependency part of the vision. But we did in the next couple years, and we've spent the last decade continuing to improve and refine it.

It's not possible to separate the Go language from its approach to software dependencies (as suggested here https://github.com/golang/go/issues/44550#issuecomment-784544232). The fact is that Go is an entire programming environment, not just a language, and handling dependencies well is one of the core reasons we created Go.

Beyond the original motivation, agreement on core concepts is fundamental to the overall health and power of the Go ecosystem. Some of those concepts are in the language, like only having UTF-8 source files. Some are in the libraries, like io.Reader and http.Handler. Some are in the Go command, like having one package per directory, the interpretation of //go:build lines, and so on. The general approach to interpreting dependency management, including resolving imports, deciding versions, and so on, is also on that list. The enforced agreement lets the ecosystem treat those things as a settled foundation and build new structures on top. The ability to have alternate foundations would make all the things built on top much less stable. In the worst case it bifurcates the worlds, like when we had dep and glide and the others and they couldn't consume each other’s dependency information.

We will continue to treat managing dependencies as a fundamental, inseparable part of the Go environment.

Precise import paths

Goinstall https://groups.google.com/d/msg/golang-nuts/8JFwR3ESjjI/cy7qZzN7Lw4J introduced the current URL-like import paths. At first, some people didn't like them, since they were longer than what we'd been writing until that point. We could have instead used some kind of indirection, writing uuid = github.com/google/uuid in go.mod (or somewhere else) and then writing import "uuid" in source files. Over time, I believe most of us have come to value the clarity of having the full path visible at import, and tools like goimports and gopls have relieved us from having to type that detail ourselves. The full import paths make individual source files self-contained as far as their dependencies. As a result, it is easy to know exactly what an import means, as well as easy to copy and paste code from one package to another, or from examples.

Other languages have taken different paths. The indirection I just described is almost exactly what Rust does, for example. That's fine: there is no reason all the languages in the world have to converge, and if we wanted Go to be exactly like other languages we could have not bothered to create it in the first place and kept using other languages.

It may be true that every problem in computer science can be fixed by adding a layer of indirection, but then it is also true that every layer of indirection creates a new problem. Go import paths are more useful for readers precisely because there isn't any indirection: they are what they say they are.

We have some direct experience with ambiguous import paths. In 2015 we introduced support for vendor directories https://golang.org/s/go15vendor, which helped us learn many important lessons, one of which was the importance of unambiguous import paths. In that vendoring setup, import "p" inside the package a/b/c/d meant a/b/c/d/vendor/p if that existed, or else a/b/c/vendor/p, a/b/vendor/p, or a/vendor/p if any of those existed, or else plain p. The result was that when you typed an import path like p, it had no clear meaning. Import paths appear in many places besides import declarations in Go source files; for example, they appear on the go command line, in test output, and the like. A source file you were debugging might say import "p", but then go doc p or go test p would fail: you really needed to say go test a/b/vendor/p. Or maybe it wouldn't fail, which is worse. Maybe you'd get the real p and be confused, not realizing the import referred to a/b/vendor/p.
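The lookup order described in the paragraph above can be written out as a short sketch. The function `vendorCandidates` is a made-up name and this is not the go command's actual implementation. It only enumerates, in order, the package paths that `import "p"` could have meant under the directory-sensitive vendoring rule as described here.

```go
package main

import (
	"fmt"
	"path"
)

// vendorCandidates lists, in lookup order, the packages that import "p"
// could resolve to when it appears inside the package importer, following
// the nested vendor directory rule described in the text.
func vendorCandidates(importer, p string) []string {
	var out []string
	// Walk from the importing package up through each parent directory.
	for dir := importer; dir != "." && dir != "/" && dir != ""; dir = path.Dir(dir) {
		out = append(out, path.Join(dir, "vendor", p))
	}
	// Fall back to the plain, unvendored path.
	return append(out, p)
}

func main() {
	for _, c := range vendorCandidates("a/b/c/d", "p") {
		fmt.Println(c)
	}
}
```

The result depends on the importer argument as well as on p, which is exactly the (import path, directory) pair that tooling had to carry around.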

Confirming that “you don't know what you got 'til it's gone,” the vendor experiment helped us relearn the importance of precise, unambiguous import paths. If an import path changes meaning depending on which directory it appears in, then you have to change all your tooling to always carry around (import path, directory) pairs to get back to a precise identifier for a package. And you have to change your programmers too: everyone has to learn that it matters what directory they are in when they utter an import path. Command lines and scripts and mental models have to change. Vendoring was just one broken thing after another.

One of the advances of modules was to return to precise import paths by eliminating this directory-sensitive import path resolution. Today, vendoring is limited to a single, top-level vendor directory, so as long as you know what the main (top-level) module is, that determines the meaning of any import path involved in any source code built for that module. It's true that if you go to a different top-level module then the meaning of import paths changes, but that is far easier to explain and also completely unavoidable if you want to be able to work on multiple projects and have the same builds as your collaborators. Within a given main module, if you see import "p" anywhere, even deep in a dependency, you can be sure that go test p tests that package, go doc p shows docs for that package, and so on. Everything is simpler because there is no ambiguity about what p means.

Multiple major versions

Another advance of modules was to allow incompatible versions of a package to be included in a single build. As a technical detail, this removes diamond dependencies and makes the version selection problem no longer NP-complete https://research.swtch.com/version-sat#alternatives. As a practical detail, this removes significant conflicts when building large-scale software. The example in “Semantic Import Versioning https://research.swtch.com/vgo-import” is not purely hypothetical: it is motivated by real problems we encountered in real open-source software.

Peter Bourgon argued that semantic import versioning solves a non-problem https://github.com/golang/go/issues/44550#issuecomment-784941491:

I have personally had what I consider to be substantial exposure to an enormous amount of Go code, due to my position in the OSS ecosystem, as well as my consulting work. That exposure includes, importantly, a huge amount of code maintained in private repositories. I have no way of knowing, but I suspect the only person in the Go community who may have seen more Go code than I have is Bill Kennedy. With that context, I can state without hesitation that the need to include two major versions of the same dependency in one compilation unit is extraordinarily rare, and in those rare circumstances that it does arise, it is almost always due to pathological conditions in the dependency graph.

This claim about needing multiple major versions being “extremely rare” is easily contradicted. Kubernetes depends on multiple major versions of github.com/russross/blackfriday and of gopkg.in/yaml. Even Peter's own Go-kit depends on both go.etcd.io/etcd/client/v2 and go.etcd.io/etcd/client/v3. There are two different major versions of Etcd, and Go-kit provides adapters for both. Any given program is probably only going to need one, but the module needs to be able to provide both. Even a single binary might want to include both, selecting one or the other at run time. None of this seems “pathological.” In fact, it aligns with the example from “Semantic Import Versioning.”

If the need to include two major versions of the same dependency seems rare, the only explanation I see is that Go modules makes this case work so well that people don't notice when it happens.

Major versions in import paths

If we want both precise import paths and multiple, incompatible major versions of a package in a build, then the logical implication is that the major version must be captured in the import path. Hence semantic import versioning.

There was some discussion here about a philosophy of breaking changes and whether we want to encourage them or not. I personally agree with essentially everything Rich Hickey said in his Spec-ulation talk https://youtu.be/oyLBGkS5ICk and that Linus Torvalds said in his email rant https://yarchive.net/comp/linux/gcc_vs_kernel_stability.html. I also like ulikunitz's phrasing https://github.com/golang/go/issues/44550#issuecomment-784498834 that “Making it costly to break compatibility is a feature not a bug.” But that's not directly relevant here.

If you want both precise import paths and the ability to use multiple major versions of a package in a build, then including major versions in import paths is implied as a requirement, no matter what you believe about the wisdom of inflicting breaking changes on users, or about whether https://github.com/golang/go/issues/44550#issuecomment-784534989 or not https://github.com/golang/go/issues/44550#issuecomment-784544232 breaking changes are always costly. So we can set that philosophical discussion aside.

Optional version elements in import paths

Phew.

On to the idea of providing some way to allow a Go source file to say import "m" and have it interpreted by the toolchain as import "m/v2". Clearly, in terms of the discussion above, this would make import paths imprecise once again when more than one major version is present in the dependency graph, leading to all the problems we had with fine-grained vendoring in 2015. There would need to be some significant benefit to repay that cost.

One possible argument is that the cost of the ambiguity is very low, because multiple major versions in a dependency graph essentially never arise in practice. The rareness argument would cut both ways, since it could easily lead to tools that work most of the time and then break in rare, unanticipated cases. But as we saw above, using multiple major versions in a given module is not rare. In fact, it happens more than even experienced Go users like Peter notice.

Another possible argument, made twice above, is that other languages don't put versions in import paths, so Go shouldn't either. This argument doesn't make sense, since Go has never aimed to be the same as other languages. It would also justify making a wide variety of unpalatable changes to Go that we clearly wouldn't. In particular, it would equally justify removing the fully-qualified import paths from source files entirely, and there's just no way we would reverse that decision: we appreciate too well the benefits they bring. Including the major version in the import path is vital to preserving the same benefits.

If there were some way to preserve all the benefits of the current system and make it easier to update a dependency from one major version to another, we would absolutely want to pursue that. If someone has a flash of insight for how, that'd be great. But making the major version implicit would abandon important benefits, so that seems untenable.

Tooling for upgrades

On the other hand, better tooling for updating import statements would absolutely preserve the benefits of the current system, in much the same way that goimports and gopls have for writing import statements in the first place. That still seems like the right approach to “make it easier to consume major version updates,” as Peter nicely stated the goal.

I wish the better tooling already existed. We have been working instead the past couple years on more fundamental tooling (and there was a pandemic), but I am hopeful that Go 1.18 could include a go fix along the lines I sketched back in 2018 https://research.swtch.com/vgo-import#automatic_api_updates. If not much has changed from, say, v2 to v3, and the module author takes the time to write a final v2 that wraps v3, then go fix could update users of v2 in a safe way.

It would also be easy to write a tool (or even a shell script) that updates all the imports in a module from v2 to v3, runs all the tests, and commits the result if the tests pass. The fact that this is so easily written cuts against much more costly solutions like removing the major version from import paths and giving up precise import paths.

It's worth pointing out that the upgrade shell script is much less safe than the go fix, since it just blindly assumes that v3 and v2 are similar enough to just see if it works to drop v3 in for v2. If the breaking change in v3 is removal of some hardly-ever-used function, then that will be fine. But if the breaking change is a subtle semantic one that the tests weren't looking for, then the blind upgrade may go badly wrong once it hits production. In the case where v3 and v2 are essentially compatible, the module author publishing a final v2 whose implementation wraps v3 makes that very clear to the tooling: using that last version of v2 is exactly the same as using v3, so go fix can be confident it's not going to change the meaning of the program by inlining those wrappers. Equally important, when the breaking change is a subtle semantic one, the module authors wouldn't publish a v2 implementation wrapping v3, at least not for that part of the API. So go fix wouldn't touch those calls that would break. The upgrade shell script is more of a yolo approach: it doesn't have the same signals from the module author about what should and shouldn't be upgraded, so we can't have the same confidence in its changes.

A second reason against optional version elements in import paths, beyond the loss of precise import paths, is that it creates a well-lit path toward exactly this kind of unprincipled yolo upgrade. I strongly believe the more principled go fix path will serve us better. But even that belief is secondary to not losing precise import paths.

I appreciate and sympathize with thrawn01's comment https://github.com/golang/go/issues/44550#issuecomment-784429345 about the pain of updating examples and other documentation for introducing mailgun-go v4. If the vast majority of the API is not changing, so that the only update is the import path change, then as others have said, it typically works out better to deprecate some part instead of introducing a whole new semantic version. The argument also works the other way, though: when the API is completely different, the different import path helps avoid invalidating old examples that people might have saved, not to mention their old working code. In fact, I’d expect any documentation to make clear somewhere what version of the API it is written for, and that’s fundamentally going to need updating however it is expressed.

Company-internal usage and making breaking changes

I also appreciate Peter's observation https://github.com/golang/go/issues/44550#issuecomment-784568521 that Go usage inside a company can be quite different from public, open-source Go usage. Even so, there is far more open-source Go code than Go code in any one company, and the Go ecosystem is larger than any one company. When there is an unavoidable conflict, we are always going to prioritize the health and functioning of the overall Go ecosystem.

I don't believe there is an unavoidable conflict in this case though. Inside a company, where all the users of a given module can be found, there are a few different approaches that can be used. I want to look at these in the context of bytheway's comment https://github.com/golang/go/issues/44550#issuecomment-784819674:

We've faced similar challenges and decisions with internal libraries at my company. What could be a quick communication about an incompatibility and a one-line update to the go.mod to get the latest package can quickly turn into a large change set, touching many, many files.

It depends on the details of the incompatibility, but usually the best approach is:

  • Make the behavior change in a copy of the function or method with a different name. (This is clearly not a breaking change.)
  • Then the users can update or not at their own pace. You can tell them about it, encourage it, and so on, and watch how many uses are left (or update them yourself).
  • Then, once there are no uses left of the old function, delete it. (This is also clearly not a breaking change: literally nothing breaks.)

See my article “Codebase Refactoring (with help from Go) https://talks.golang.org/2016/refactor.article” for a detailed presentation of these steps. If you're working in open source, you usually can't do that last step, because you can't find all the uses of the old function, so you leave it and mark it deprecated. But the general approach works well, and we do it all the time (just git grep Deprecated: in the Go repo). Inside a company, you can often tell when there are no uses of the old function left and then delete it, which is fantastic.

The benefits of this approach arise from the fact that it captures semantic changes at a very fine-grained level:

  • Different breaking changes are kept separate, and their upgrades can proceed independently.
  • Because names are semantically precise, unmodified code doesn't suddenly change behavior.
  • Worst case, if you miss a use when you delete the function, a build breaks, instead of silent subtle behavior changes.

Contrast that approach with redefining the name in place, bumping the major version, and updating go.mods in dependent modules:

  • If there is another, unrelated breaking change elsewhere in the module, users can't order their updates. If they are in v1 and need the change in v3, they have to work through whatever v2 did at the same time (or roll the dice).
  • The meaning of a function or method name can change, making it no longer semantically precise.
  • When things break, they break in subtle, hard-to-detect ways.

It's true that semantic import versioning adds:

  • Every file importing any package in the module has to have its import path updated, whether or not any behavior changes affect that file.

This is hardly the worst problem with this approach.

What about at small scales, where you know there's only one or two users of the function and you can talk to them ahead of time? In that case I'd argue that doing that and then just making the change is not a breaking change, provided you are absolutely sure that's everyone who is affected. If you're not sure, then renaming the function helps find the ones you missed. In contrast, bumping the major version doesn't really help identify any new affected parties. It's just a way to disclaim responsibility for the breakage.

My point here is not to say how you should or should not make changes at your company. You should do what you believe works for you. But if the scenario is to be used to justify a change that affects the entire Go ecosystem, like giving up semantically precise import paths, then it is absolutely fair to compare it to alternatives and explore how eliminating the import path rewrites would change the scenario. The answer: there is a better alternative available today the vast majority of the time, and eliminating the import path rewrite doesn't substantially improve the situation. It removes a rote mechanical step and leaves the dangerous flaws.

And of course, if you are aware of the risks of dropping the major versions but still decide that for your company repos you want to standardize on versionless imports with no possibility of multiple major versions for your own repos, then that's easy to do by pairing each require with a replace statement in go.mod, as Bryan Mills explained https://github.com/golang/go/issues/44550#issuecomment-788054415. That solution doesn't require us to change the standard tooling either.

A Path Forward?

Peter asked https://github.com/golang/go/issues/44550#issuecomment-872843430, “is the concept of optional SIV acceptable to the maintainers?” As currently envisioned, where it would mean we lose precise import paths, no, I don't think that would be acceptable: it means giving up too many of the benefits of the current system. I would rather find a less costly solution to the underlying problem that this proposal means to solve. Tooling looks to be that answer. I encourage people who are interested in this to explore writing such tools and learn what works and what does not.

I would refer anyone who wants even more detail to “The Principles of Versioning in Go https://research.swtch.com/vgo-principles” (also available in video form https://www.youtube.com/watch?v=F8nrpe0XWRg), especially the objections sections at the end.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/44550#issuecomment-885324886, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAARM6PAX4M5RWUZX4CLPVTTZC2AHANCNFSM4YC2UNFQ .

Merovius commented 3 years ago

@thepudds

I would be curious to hear any additional thoughts on @rogpeppe’s comments in #44550 (comment) about the practical problems arising from large noisy diffs that can potentially cause actual bugs to be overlooked.

My approach to this is to a) split up the automated and non-automated steps (if any) into separate commits and b) provide the command to produce the automated steps. That way the reviewer can verify that the automated steps make sense, run them themselves and verify that it leads to the same diff and then review the non-automated steps manually.

rsc commented 3 years ago

@thepudds, I would have said what @Merovius said. And if the APIs are so close that only import paths are changing, then hopefully the author will leave a forwarding wrapper behind, so that the automated half becomes go fix. You submit an automated CL followed up by a manual CL, and you spend your time reviewing the manual CL and don't worry about the noise in the automated one. I've done a lot of these kinds of pairs with rf, and they work well. And git-generate makes the automated noisy one easy to regenerate across merge conflicts and the like. We can make good tooling here and not give up the readability benefits of precise import paths.

rsc commented 3 years ago

If that happens, it would be nice if 'go get foo@latest' and 'require foo latest' could then take on the meaning of asking for the latest major version. With gopls and goimports defaulting to the latest major version, that together would help new and experienced gophers alike avoid accidentally getting the wrong major version, or having to leave their command line world to find out what the latest major version available is. Or, perhaps there could be some alternate spelling of that request.

The expected workflow is really that you shouldn't run go get foo@latest or write require foo latest in your go.mod file by hand. Instead, if you are using gopls or goimports, we want those to prefer the latest major version that provides the API calls you typed in (first preferring whatever is already mapped into the module) and have them update both the imports and the go.mod. And if you are working tool-free, then the workflow is to write the import and run go get (no arguments, or go get . if you want to be explicit) to add the necessary latest modules. In that second case, the import you wrote will decide which major version is needed. We've started using that form in our tutorials, for example https://golang.org/doc/tutorial/web-service-gin.

So those two changes you identified are both very precise low-level directions that I think we should keep precise. I would also be frustrated if I wrote an import for foo v1 and those commands insisted on giving me v2.

rsc commented 3 years ago

@theckman:

@rsc reading your post has me feeling pretty deflated, because there are quite a few salient points we've made that are misrepresented or seem like they may have been completely disregarded. Likewise, I think there may be personal preferences you are communicating as the desired end goal or as a truth, that I'm not sure everyone agrees with. I wonder if we need to identify those foundational misalignments, before we can make forward progress on this topic. To accomplish that, it may need to be a simple bulleted list of statements that we either agree or disagree with.

I apologize for the deflation. As I noted at the start of my post, it was “what I think the most important parts are.” Different people will of course have different ideas about what is salient, which is fine.

You are right that the discussion here has moved away from technical problems more to a social one. It is not - and cannot be - a goal for the design of Go modules to make everyone happy or even agree. That is clearly impossible. Instead, the goal is only to provide a good, solid base for writing Go software that works well for our intended use cases, which include large-scale software development.

If there really is, as you say, a foundational misalignment between Go modules and your expectations or wishes, then it is important to recognize that the foundation has set, and software skyscrapers have been built on top of it. The time for changing the foundation has passed.

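My long reply above explained the importance, as best I could, of:

  • semantically precise import paths, and
  • coexistence of multiple major versions of a package in a build.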

Those are the axioms of Go modules, which guided the design of the entire system. You can disagree with them, but that doesn't change their centrality and importance to the design.

It seems like you still want to debate one or both of these. But we had those discussions years ago, and there is no new information that would change the outcome, so it's not productive to reopen them and expect a different result (see in particular the "Reconsideration" section of https://web.stanford.edu/~ouster/cgi-bin/decisions.php).

When we released Go 1 in 2012, we said Go the language was frozen until we had a lot more experience using it. The core foundation of modules is just as frozen at this point. We're not going to lay a new foundation.

Instead, we will continue to work on tooling to improve important workflows, including upgrading from one major version to another. Again, I encourage people who are interested in this to explore writing such tools and learn what works and what does not.

peterbourgon commented 3 years ago

@rsc

My long reply above explained the importance, as best I could, of:

  • semantically precise import paths, and
  • coexistence of multiple major versions of a package in a build.

Those are the axioms of Go modules, which guided the design of the entire system. You can disagree with them, but that doesn't change their centrality and importance to the design.

As far as I know the current suggestion for optional SIV preserves both of these properties (in the "plumbing" layer) while also making them optional (in the "porcelain" sense). Assuming that's true, would a corresponding proposal still be rejected a priori?

rsc commented 3 years ago

I assume you are analogizing to Git, which I would hesitate to hold up as an exemplar of good UX design. I've always understood the Git "plumbing vs porcelain" distinction as invented to excuse the legacy decisions they are stuck with at the lowest levels, so that usability problems in "plumbing" can be dismissed as non-issues. That's not a terrible strategy given how dramatically Git has redefined itself from the earliest (pre-cogito) days. But it's not one we need to adopt.

Go does not have separate plumbing and porcelain layers. Instead it aims to have a single consistent design and shared vocabulary for its parts.

Even hypothetically accepting the porcelain vs plumbing premise, the arguments I laid out above (especially in the "Precise Import Paths" section) are fundamentally about user experience, not implementation, which means they would apply most directly in the porcelain layer, where you propose to remove them.

To answer the question directly: no, a proposal that breaks either or both of those properties in a user-visible way is unlikely to go anywhere, absent a very large compensating benefit.

beoran commented 3 years ago

Could I then ask what is the large benefit of allowing imprecise imports for v0 and v1? Yes, there is backwards compatibility with the pre module situation, but in my opinion, the way v0 and v1 are special exceptions leads to needless confusion. At least, could Go perhaps accept the /v0 and /v1 suffixes to allow for precise imports for those versions?

rsc commented 3 years ago

@beoran:

Could I then ask what is the large benefit of allowing imprecise imports for v0 and v1? Yes, there is backwards compatibility with the pre module situation, but in my opinion, the way v0 and v1 are special exceptions leads to needless confusion. At least, could Go perhaps accept the /v0 and /v1 suffixes to allow for precise imports for those versions?

If we were laying a new foundation today, then I think a strong case could be made for having those. But we aren't. The design of Go modules had to work within the constraints of the existing Go code in the world, and in particular not invalidate it all. The v0/v1 exception serves that purpose.

I made an analogy above to freezing the language after Go 1 "until we had a lot more experience using it." I neglected to mention that one of the key reasons we froze the language was to provide stability to users, so that they didn't have to worry about details changing from release to release, invalidating mental models, tutorials, books, course materials, and so on. Freezing the language made it clear that people could focus on things built atop the language instead.

When I say that the core foundation of Go modules is just as frozen, one of the key reasons why is to provide all of those benefits as far as modules are concerned. There will be new tools and streamlining, but we're not going to make people relearn basics like whether and when a major version is required in an import path. Instead of rebuilding that foundation, it's time to build on top of it, and to hold it steady for others who are building on top of it.

peterbourgon commented 3 years ago

@rsc

Even hypothetically accepting the porcelain vs plumbing premise, the arguments I laid out above (especially in the "Precise Import Paths" section) are fundamentally about user experience, not implementation, which means they would apply most directly in the porcelain layer, where you propose to remove them.

To answer the question directly: no, a proposal that breaks either or both of those properties in a user-visible way is unlikely to go anywhere, absent a very large compensating benefit.

I had always understood that including the major version in the module identifier was a means to an end — supporting multiple major versions in a compilation unit — and not an end in itself. Is that not the case?

If so, that's surprising. Avoiding the need to include the major version number in a module identifier whenever possible, and in as many user-visible ways as possible, is itself the very large benefit that we're chasing with all of this, and it's the user experience that benefits.

It's clear to me that you haven't come to the same conclusion. But it's unclear to me if you understand the rationale for why I and many others have; @theckman pointed at a few reasons for that in his post above. Are you open to that discussion? "No" is a fine answer, I'm just looking for guidance on how to proceed from here.

rsc commented 3 years ago

I had always understood that including the major version in the module identifier was a means to an end — supporting multiple major versions in a compilation unit — and not an end in itself. Is that not the case?

The ends are (1) semantically precise import paths (consistently applied throughout the Go user experience, not just in "plumbing"), and (2) coexistence of multiple major versions of a package in a build. I see no means to achieve those ends other than including the major version in the module and import paths. Do you?

I understand the benefits that would arise from not having the major version in the import path - not having to make import path edits during major version upgrades. But (1) is far more important, and the edits are easily automated.

It's clear to me that you haven't come to the same conclusion. But it's unclear to me if you understand the rationale for why I and many others have; @theckman pointed at a few reasons for that in his post above.

You seem to assume that because I have reached a different conclusion, I don't understand the points you have raised. I do understand them, just as I assume you understand the ones I raised. As I said to Tim, it's okay for us to disagree.

Are you open to that discussion?

The foundation is set and built upon. There would need to be a major shift in the ground to make revisiting it worthwhile. If such a shift happened, then I'd want to know and be happy to discuss. That's why I am not going to say, despite your asking multiple times, that the answer is a categorical no or that we are not open to any possible discussion. I have no crystal ball telling me how things might shift in the future. But today, I see no evidence of such a shift.

I'm just looking for guidance on how to proceed from here.

I've already given that guidance twice. Here it is again:

I encourage people who are interested in this topic to explore writing tools and learn what works and what does not.

peterbourgon commented 3 years ago

You seem to assume that because I have reached a different conclusion, I don't understand the points you have raised. I do understand them, just as I assume you understand the ones I raised. As I said to Tim, it's okay for us to disagree.

When I read statements like

I understand the benefits that would arise from not having the major version in the import path - not having to make import path edits during major version upgrades

it does suggest to me that you haven't understood the points I've raised. Having to make import path edits during major version upgrades is a minor and superficial symptom of the problems I'm trying to bring to the table.

But it's all somewhat moot, as statements like

Instead of rebuilding that foundation, it's time to build on top of it, and to hold it steady for others who are building on top of it.

are explicit and unambiguous. Thank you for that. It's now clear that the path forward, if any, is an alternative ecosystem.

flibustenet commented 3 years ago

@peterbourgon Do I understand correctly that your main point is:

The other "camp" (i.e. myself and others) observes that breaking changes are not always very costly and do not necessarily need to be avoided.

If it is, can we agree that at the beginning of Go, even before modules, it was clearly stated that a breaking change is considered costly? Before modules, the recommended way was to change the name of the lib. We did it with gopkg.in, for example. With modules we have the ability to change the name based on a git tag instead of the lib name, but the idea remains the same; it's just more comfortable. As @rsc suggests, I also do it with function names to avoid breaking compatibility; it's the same idea and it just works at scale.

cameracker commented 3 years ago

If it is, can we agree that at the beginning of Go, even before modules, it was clearly stated that a breaking change is considered costly?

It can be stated, but that doesn't make it true. It can also be an overestimation or underestimation of what happens in practice.

Changing the name of a module to address breaking changes is, to me as a package maintainer, a devastating loss of SEO and name recognition, and also user discovery. That it is even seriously recommended anywhere is bewildering to me. That it is a foundational concept behind a package manager, seems like reinventing a wheel that didn't need to be reinvented.

So to your point, I think you're right that the thought behind Go assumes those postulations are true, and making breaking changes is destructive to the ecosystem. But because the package manager is so dogmatic and opinionated about this idea, which as far as I can tell is unique to Go, it will realistically mean people choose other languages. Most people can't see the future and know they're doing something that is going to require a breaking change later. As a whole, the thing I can most easily do to spare myself undue burden as a package maintainer in Go is to just not make packages in Go at all.

👋

rsc commented 3 years ago

Changing the name of a module to address breaking changes is, to me as a package maintainer, a devastating loss of SEO and name recognition, and also user discovery. That it is even seriously recommended anywhere is bewildering to me. That it is a foundational concept behind a package manager, seems like reinventing a wheel that didn't need to be reinvented.

For the record, in case anyone is confused on this point: you don't have to change the name of your module when you make a breaking change. Instead, Go modules follows SemVer-style numbering: you create a v2.0.0, and to make it clear to tooling which is which, you add /v2 to the module and import paths. If you were already at v2, you create v3.0.0 and change /v2 to /v3. And so on. That doesn't give up SEO or name recognition or user discovery at all.
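The path rule described here can be written down as a small Go sketch. The helper `nextMajorPath` is hypothetical and simplified: it assumes the plain `/vN` suffix form and does not handle gopkg.in-style `.vN` paths.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextMajorPath returns the module path for the next major version.
// A path with no /vN suffix is treated as v0 or v1, so the next path
// ends in /v2. Assumption: gopkg.in-style ".vN" suffixes are not handled.
func nextMajorPath(modPath string) string {
	i := strings.LastIndex(modPath, "/v")
	if i >= 0 {
		// Only a numeric suffix of 2 or more counts as a major version element.
		if n, err := strconv.Atoi(modPath[i+2:]); err == nil && n >= 2 {
			return modPath[:i] + "/v" + strconv.Itoa(n+1)
		}
	}
	return modPath + "/v2"
}

func main() {
	fmt.Println(nextMajorPath("github.com/example/thing"))    // github.com/example/thing/v2
	fmt.Println(nextMajorPath("github.com/example/thing/v2")) // github.com/example/thing/v3
}
```

The part of the path before the suffix never changes, which is the point being made about name recognition.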

Xe commented 3 years ago

Does Google use Go modules within its monorepo? What are the challenges they've faced with semantic import versioning inside the monorepo, if any?

dylan-bourque commented 3 years ago

If it is, can we agree that at the beginning of Go, even before modules, it was clearly stated that a breaking change is considered costly? Before modules, the recommended way was to change the name of the lib. We did it with gopkg.in, for example. With modules we have the ability to change the name based on a git tag instead of the lib name, but the idea remains the same; it's just more comfortable. As @rsc suggests, I also do it with function names to avoid breaking compatibility; it's the same idea and it just works at scale.

@flibustenet as someone who worked in the Microsoft COM ecosystem for years, I'll say without reservation that simply creating new types/methods with new names to accommodate inevitable breaking changes is not a panacea. The end result of that level of strictness is things like ISomeInterfaceEx::DoTheThingEx2() and IMHO that degrades understanding much more than it helps.

@rsc except that all of the Go tooling treats github.com/example/thing/v2 not as "the next version" of github.com/example/thing, but as if it's completely unrelated. It's effectively the same, at a developer level, as just renaming it to github.com/example/otherthing.

cameracker commented 3 years ago

Pardon, @rsc. I do understand that the major version suffix in the path does "solve" this problem, and that the "change the name" suggestion (in not so many words) predates modules for the most part. I meant to criticize the idea, not its execution in modules.

But I know you have heard the debate before, and in the interest of us both having a pleasant Friday, I won't torment you with it <3

rsc commented 3 years ago

@Xe:

Does Google use Go modules within its monorepo? What are the challenges they've faced with semantic import versioning inside the monorepo, if any?

Google's monorepo builds with an internal version of Bazel called Blaze. It doesn't know about modules directly. We do the three-step dance all the time (introduce new thing, convert code, retire old thing).

We do import open-source Go modules into a subtree of Google's monorepo, and we have had no problems with semantic import versioning there. We keep at most one version of each major version in the repo, same as go get does. I count about a dozen different modules for which we currently have multiple major versions in the repo. And we have automation to take care of import path rewriting and other mechanical updates as needed.

rsc commented 3 years ago

@dylan-bourque:

@rsc except that all of the Go tooling treats github.com/example/thing/v2 not as "the next version" of github.com/example/thing, but as if it's completely unrelated. It's effectively the same, at a developer level, as just renaming it to github.com/example/otherthing.

The claim was "a devastating loss of SEO and name recognition, and also user discovery." None of those are true.

It's true that at the command-line tooling level, if you do go get -u github.com/russross/blackfriday it won't tell you about v2, but that's because your code is written against v1. There's no automated upgrade possible there, not today.

However, we absolutely do expose the existence of v2 in other people-friendly places. There is a banner advertising v2 at the top of https://pkg.go.dev/github.com/russross/blackfriday, for example, and I think the go.mod view in VSCode Go shows a tooltip (powered by gopls) to notify about the existence of v2 as well (if not, it will soon).

And once go fix exists we could also advertise in those places that a particular upgrade can be gofixed or not.

There is plenty of room for lots of good tooling here, which will provide a much better experience than just treating all major versions as being somehow interchangeable.

dylan-bourque commented 3 years ago

I look forward to that improved tooling, because it's been my experience so far that new major versions are far from discoverable.

rsc commented 3 years ago

@dylan-bourque Indeed, and we understand that and are working toward fixes in the right places. The v2 banner on pkg.go.dev is fairly new and so on.

Xe commented 3 years ago

And we have automation to take care of import path rewriting and other mechanical updates as needed.

Can you make this automation public? I feel that would help solve a lot of the problems that are leading to people asking for this feature to be optional in the first place.

icholy commented 3 years ago

@Xe @dylan-bourque this might hold you over until the official tooling improves: https://github.com/icholy/gomajor