src-d / guide

Aiming to be a fully transparent company. All information about source{d} and what it's like to work here.
Creative Commons Attribution Share Alike 4.0 International
293 stars 102 forks source link

Go conventions: Use Go modules for libraries #358

Closed dennwc closed 3 months ago

dennwc commented 5 years ago

Update the code convention to use Go modules for dependency management of libraries.

The reasoning is that gopkg.in served only a single purpose - to provide a way to lock a major library version (thus an API version) in the import path. This concept is exactly the same as semantic import versioning rule in Go modules.

The use of gopkg.in imports was a good solution for the time. Right now, however, Go modules are widely adopted by the community and supported by both Go 1.11 and 1.12. Our version policy for libraries is to support two latest release of Go, and both already support modules.

Thus, I think it doesn't make sense to use gopkg.in imports anymore. The number of project supporting modules is already massive and it will be even more true when 1.13 is released.

The best way to migrate our libraries is to increment the major version when adding modules support. This way old imports will be gopkg.in/src-d/lib.v1 and new imports will be github.com/src-d/lib/v2.

Note that old library versions must continue to work after the change. We don't want to break any of our users.

One may argue that this will require library clients to switch imports when updating to a new version. This is true, although the effect of this change is the same as any other major version bump. But instead of switching from gopkg.in/src-d/lib.v1 to gopkg.in/src-d/lib.v2 we will instead switch to github.com/src-d/lib/v2. This switch won't happen automatically and won't force anyone to update. Library users may update their imports at their own pace.

As noted in the epic (non-public link) most of our projects already switched to modules or are in the transition process.

mcuadros commented 5 years ago

I see several cons, the main one is forcing the users to use a package manager, but others are having chaotic import paths with the old and the news, updating documentation, changing Travis, looks a lot of work for zero benefits.

Sorry guys but I am blocking this for one reason: I strongly believe that this is not the right moment to do any migration of nothing, but especially of something that doesn't bring a big value, and will take years on be complete.

This is the moment to deliver value to our users.

dennwc commented 5 years ago

@mcuadros "Won't fix" is not the right resolution for this. It rather "not now", if I understood it correctly. So it probably makes sense to remove the label and block the PR instead. "Won't fix" is more in line of "we would never fix it".

I see several cons, the main one is forcing the users to use a package manager ...

Which one? We support two latest Go versions, and this convention was there for a while. And both those versions work well with modules. Even if the user's project doesn't use modules, they won't have to switch to it to use new packages. In other words, go get will still work with /v2 and will lock version the same way as gopkg.in, even though user don't use go.mod file.

... will take years on be complete ...

This is nowhere near a real estimate. The transition happens fast and we already see that most of our dependencies have already switched. The reason is that it backward compatible, as long as 1.11.4+ is considered.

Having said that, you are right that some of our libraries have a more broad community and support 3 Go versions, like go-git. So instead of force blocking this for every our project, I propose to evaluate it based on the Go versions the project aims to support.

For example, go-git is very popular and still supports Go 1.10 (although I don't see it mentioned explicitly in the README). So for this specific project switch to Go modules must not happen now, since go get for 1.10 will break with /v2 imports. This decision may be re-evaluated only when 1.13 becomes stable.

If we consider Babelfish Go client as an example, we don't have that many external users and we support Go 1.11 and 1.12 (as per source{d} convention). Both those versions understand /v2 imports in both "Go modules" mode and "GOPATH mode". So we won't break anyone by switching. The same may be true for other our projects.

This is the moment to deliver value to our users.

This is true, right now source{d} as a whole is focused on delivering the value through PoC and beyond. But this is a different topic. We may say "go" for the convention, but postpone the transition for existing projects (where it makes sense). This way at least new libraries will use Go modules by default (as most of the Go community does now).

creachadair commented 5 years ago

Besides agreeing with @dennwc's analysis, I have a couple of additional comments:

@mcuadros:

I see several cons, the main one is forcing the users to use a package manager

Depending on what you mean by a "package manager", we were already doing this with dep. Using Go modules actually simplifies the situation for users who do not want to use additional tools, since we have carefully preserved compatibility of import paths.

I think that we should adopt Go modules as our standard—but even if we do not, all Go users are already using Go modules provided by the Go team, and many of our own dependencies have already switched. It's worth noting that their switching didn't "force" us to adopt modules, and our switching will not force our users to do so either.

Is there a specific problem or concern with dependency management that led to your objection here?

@dennwc

Having said that, you are right that some of our libraries have a more broad community and support 3 Go versions, like go-git. So instead of force blocking this for every our project, I propose to evaluate it based on the Go versions the project aims to support.

I think this is a crucial point, and maybe we should call this out specifically in the guide: Adopting Go modules does not mean we have to migrate everything all at once, and I think we all agree that we do not want to break existing users without a very good reason.

Perhaps to address this, we should say specifically that "projects whose existing use differs from these conventions are encouraged, but not required, to migrate". Or words to that effect.

@mcuadros

This is the moment to deliver value to our users.

I completely agree. Ideally, in fact, we should never have to stop user-relevant work. One important argument in favour of this proposal is that it allows us to continue delivering value to our users, and not get stuck in three to six months when the Go tools require modules by default for all users, having to do a big cleanup.

If we change the conventions now, and migrate each project as it becomes practical to do so, we will not have to block any user-facing work.

Would clarifying the language around how & when existing use gets updated help address your concerns?

mcuadros commented 5 years ago

When I main forcing using a package manager, I mean forcing to everyone, not only us.

For example, the change of enry, ended me having wasting 1h figuring out what didn't work at my old code (two years old), causing not delivered what I wanted, since at any part of the package was clear that is using go modules. Also if you don't have go modules active, go-get of enry simply fails. Again nothing is documented about these problems.

So this small changes that you are suggesting are wasting money. Simply what I think is that adding more job to our plate getting zero benefits doesn't look right.

About the wont-fix, of course, is not now. But I rather close the issue since, with go you never know, maybe in the next versions we have a completely different package manager. For me go author has lost all my trust about package management.

creachadair commented 5 years ago

For example, the change of enry, ended me having wasting 1h figuring out what didn't work at my old code (two years old), causing not delivered what I wanted, since at any part of the package was clear that is using go modules.

Can you please point to the code that failed? I'd like to try to reproduce the errors you describe, and figure out whether this is something we did wrong. Even if you've since modified the code that was affected, can you point to the repo and commit that was active when you had this problem?

Also: Is the code itself old, or is the version of Go that you were using to build it old? (Or both?)

I think the right way for us to deal with errors is to treat them as bugs, not to make policy based on an incomplete understanding of what went wrong.

So this small changes that you are suggesting are wasting money. Simply what I think is that adding more job to our plate getting zero benefits doesn't look right.

I do not think that is a correct characterization of the issue. As far as I can see, you hit a bug that might have been our fault, or might have been an issue with modules—right now, we don't know which. Before we write this off as a waste of time, I'd like to understand what the real problem is.

Even if we decide not to use modules right now, we need to sort out this issue: With or without modules, arbitrary old code will not continue to work indefinitely across major revision (API) changes. For Babelfish and Enry, we bumped the major revision explicitly to signal that change.

bzz commented 5 years ago

Just for the context - here is a log of migrating a small piece of code to v2 without using Go modules. TL;DR: with a recent Go version it seems to work well (only an import string needs to be changed).

dennwc commented 5 years ago

I agree with @creachadair, we must first know exactly what is the issue before jumping into conclusions. I'm pretty sure we did something wrong when migrating Enry, if you are hitting any issues.

Minimal support importing libraries with modules support was backported even to 1.9.x (search "minimal module compatibility" on this page). So it must work properly even if you try to use previous major Go releases (assuming it's up-to-date) in an old project that doesn't use modules.

mcuadros commented 5 years ago

Well the error is very clearly explained by Alex here: https://gist.github.com/bzz/867c665ec886fdc08d070fec386a2999#carvet

Its a waste of time, first because already spent a lot of time about functionality is not giving us zero benefits, and already has taken a lot of time.

I do not think that is a correct characterization of the issue. As far as I can see, you hit a bug that might have been our fault or might have been an issue with modules—right now, we don't know which. Before we write this off as a waste of time, I'd like to understand what the real problem is.

In a code that should be reversed so, the bug will be gone with it. As explained by @smola in the chat, he approved this change without taking into consideration the amount of work and the problem that this change brings with it. We were doing go modules in go git for more than 8months without a problem the problem is using the import parts on it, and for to the user to activates the GOMODULES, us and the rest of our users.

Introducing in our stack something that doing the suggested install method fails, doesn't look like the right thing to do

mcuadros commented 5 years ago

@dennwc @creachadair can you explain to me the benefits you see doing this. Because I don't see any.

creachadair commented 5 years ago

@dennwc @creachadair can you explain to me the benefits you see doing this. Because I don't see any.

Without modules (or something like it), it is nearly impossible to get reproducible builds. Someone building a package gets whatever version of the package happened to be pointed to by the gopkg.in URL at the time when they fetched it. With modules, you know exactly which package versions you depend on, and can upgrade (or downgrade) as needed to get a working result.

Without accurate dependencies, even if there are no API changes between minor versions, we have no good way to find out what exact version someone has when something breaks. You can ask them to look in their GOPATH, but it might already have changed if they updated another package. With dep and modules, you know exactly which bits each package depends on, and can update them in a safe, controlled way.

This is especially important now that we are focusing on delivering customer-facing products, instead of only libraries for general use. Even within source{d} we've already run into problems where lack of dependency management makes it hard to debug version issues. That's why we started discussing modules in the first place (not just LA but the whole engineering team).

The costs of migrating to Go modules are small (and incremental) for us; the benefits are: reliability for our customers and our own teams. If a random program imports one of our libraries at an old version, it should still work even if we are using modules.

The only time an import needs to be updated is if the importer wants to upgrade to a newer version of the library. But they'd have had to do that even without modules, when the major version changes.

mcuadros commented 5 years ago

@creachadair I am not against using go modules, I am against using the paths. As I said we are using go modules on go git since 8 months

https://github.com/src-d/go-git/commit/83649a1c08e0dcc103de54652ed5b9c33d301326

Also you have some other examples how we are doing this at: https://github.com/src-d/go-mysql-server https://github.com/src-d/go-borges

BTW What was suggested by @dennwc was exactly what I believe doesn't bring value, replacing gopkg.in with go modules paths.

creachadair commented 5 years ago

@creachadair I am not against using go modules, I am against using the paths.

Can you be more specific about this? Except in the one case where we renamed the repository, all the existing import paths are expected to continue to work.

We currently have a mix of native and gopkg.in import paths across our various projects (you mentioned specifically that "having chaotic import paths with the old and the new" is something you would like to avoid). That has already caused confusion both for us and users in our Community Slack. Without modules we were stuck with that: With modules, we no longer need to worry about it. We can have consistent naming and all the legacy imports should continue to work.

So in that case, making the convention to prefer native GitHub paths does not break existing use, does not cause a lot of extra work, and reduces confusion for our users. I am having a hard time seeing any real downside to the change. It sounds like you had a very specific issue, though, so I would like if possible to hear more about it. What happened, and what errors did you see?

dennwc commented 5 years ago

If I understood correctly, the error mentioned by @mcuadros is this:

$ cd $GOPATH/src
$ go get github.com/src-d/enry/v2
package github.com/src-d/enry/v2: cannot find package "github.com/src-d/enry/v2" in any of:
        /snap/go/3739/src/github.com/src-d/enry/v2 (from $GOROOT)
        /home/dennwc/go/src/github.com/src-d/enry/v2 (from $GOPATH)

Although it works without /v2 (with no changes to the repository):

$ cd $GOPATH/src
$ go get github.com/src-d/enry
 (no error)

Note that in both cases the repository will be downloaded properly.

mcuadros commented 5 years ago

To be clear, I don't want to change the import path of any of our projects now. Since I don't see any reason for it. And I only see problems, as the one described by Alex.

If someone was having problems with gopkg.in, maybe is because the repository wasn't configured properly with the vanish import comment, because with this the error is pretty clear.

> go get github.com/src-d/go-git  
package github.com/src-d/go-git: code in directory /home/mcuadros/workspace/go/src/github.com/src-d/go-git expects import "gopkg.in/src-d/go-git.v4"
creachadair commented 5 years ago

To be clear, I don't want to change the import path of any of our projects now. Since I don't see any reason for it. And I only see problems, as the one described by Alex.

So is your main objection to the advice to migrate existing imports? In other words, if we recommend that new repositories use native GitHub imports and that existing repositories preserve compatibility with gopkg.in URLs, would that satisfy your concerns?

One important note about @bzz's gist is that in that case the (hypothetical) user had to modify their import path either way because they wanted to upgrade to a new major version. With Go 1.12.x, go get gopkg.in/src-d/enry.v1 still works just fine, and requires no change. Upgrading to v2 necessarily means they have to modify the import path regardless of what we do on our side.

dennwc commented 5 years ago

... I don't see any reason for it. And I only see problems ...

It actually adds a few benefits for us as developers. gopkg.in is an external proxy that we don't control, thus we cannot change the version that it exposes for each major version.

It's quite common to use replace ... => ../some/path when hacking locally, but this approach won't work for the CI. When our libraries use a Github import, it's easier to switch to a specific library branch when working on the PR and make CI to test this version before the new library release.

With gopkg.in we are forced to update in lock steps: release some changes to the libraries (thus wait for the PR review), then make changes to the app that uses the feature. The CI in the app repository may fail at this point, and we have to make another library release to fix it.

With modules, we can push library changes, make a PR, don't merge it yet, make PR to the application with a replace, make sure everything works, release the library and remove replace in the application. This is a way faster and prevents bugged release from being created.

One may argue that the same will work for gopkg.in import when replacing to Github imports, but this is not true in some cases, especially when testing changes for the next major release or when the library uses internal packages.