Open mattfarina opened 8 years ago
@mattfarina thank you for opening this issue and being open to thoughts from the community.
https://github.com/Masterminds/glide/issues/298 is at least part of what prompted this issue. I, and many others on that thread, would like to be able to use Glide in conjunction with GitHub Enterprise and other code hosting solutions without resorting to adding the ".git" suffix all over the place. Note that currently the only workaround is to not only add ".git" to glide.yaml, but to add it to every place in your source code where the package name appears, as well as to rename the relevant directories in GOPATH to include the ".git" suffix.
go get
does not work with nested packages on GitHub Enterprise (without specifying the ".git" suffix). Any solution that we come up with for addressing this issue is going to be a departure from go get
.
For my public projects, I will continue to maintain go get
compatibility so that users will not be forced to use Glide if they don't want to. At work, we have a clearly defined process for hosting code on GitHub Enterprise and using Glide to manage dependencies. We don't use go get
directly, so there won't be any adverse affects if we break compatibility.
I would be curious to hear if anyone would be adversely affected by breaking go get
compatibility in this way.
No. I don't want Glide to be orthogonal to Go's build process; instead it should work with & enhance it. One reason I use Makefiles to manage my projects' build processes instead of something like gb is because it replaces Go's toolchain and the surrounding ecosystem instead of enhanching it. Sure, the go
command makes some odd choices. For one, using a Makefile would soo much easier if the Go packages weren't compiled with the pack
option by default. An object file per source file would make things far simpler to manage via Make. However, I worked around that instead of invoking the Go compiler directly. It's just too much of a moving target to try and replace these fundamental tools.
To that end, if Glide were to break compatibility, I'd just have to stop using new versions of Glide. As crappy of a boyfriend it makes me, I think this is the classic Elaine issue from Seinfeld. I won't date someone who is a bad breaker upper. In this case it means I don't want to buy into a tool that I cannot easily replace if something happens to its maintainer or its direction. I love Glide today because even with all the tremendous value it adds, if @mattfarina turned out to be an alien who is returning to his home planet and taking his code with him, ultimately it won't be the end of the world. I may have to replace Glide, but I won't have to shift my understanding of my projects' entire dependency workflows.
I never include the extension and it works fine for me.
An issue brought up in other tickets... just to add context... is the Great Firewall of China that blocks some requests to sites that offer the ?go-get=1
redirect functionality. This breaks Glides ability to put something in the right place even when they can set the repo to the right one and access it.
Without suggesting how to fix this it is one that would be nice to fix because there are so many users there.
@albrow GitHub Enterprise is an easy-ish one to make better without breaking compatibility (technically). go get github.example.com/foo/bar
will work because that will invoke the ?go-get=1
behavior. It's when you get to sub-packages that a problem comes up.
In a glide.yaml
file the top level package
for an import is the repo level. glide get
could figure out this separation for you a little more intelligently.
This isn't a break in compatibility for the import
statement in code.
IMO, the case @mattfarina presented in the OP actually has very little to do with glide. This is the kind of thing that's easy to go in circles on, because the question we're asking is kinda circular:
The go
toolchain has variable answers for the first (owing to the regex-cum-?go-get=1
behavior that admits multiple schemes), and no capacity to give anything other than one, strict answer to the second.
glide, even today, has different, more variable answers for the first, and a looser, though still mostly single answer to the second. That repo
field is what blows it all wide open:
?go-get=1
redirect) can service a rooted import path from an arbitrary URL - and that one can supercede the other, or combine with each other. So, as @albrow pointed out over in #298, adding the repo
field technically means glide has already broken go get
compatibility. It's already possible to express import path->network location relationships that go get
would miss. This is not because of differences in parsing, but because go get
doesn't know about these client-side transforms. This might seem like a different kind of compatibility, but it's not - that's just the circular nature of the problem playing tricks.
Take, for example, the particular case presented in the OP:
- package: example.com/foo/bar repo: https://example.com/foo/bar.git subpackages: - baz
And then reference
example.com/foo/bar/baz
in theirimport
statements.
The only reason we're talking about this case is because the obvious transformation from repo
to package
is obvious. And yet, this:
- package: github.com/Masterminds/semver repo: github.com/sdboyer/semver
is TOTALLY go get
incompatible, but glide supported it just fine in #384 for several weeks. In the game of go get
compatibility, these are not different cases. Trying to separate these cases is a mistake, based on a superficial red herring - the obviousness of the transform for this particular case.
Moreover, the red herring is harmful - it takes us wandering into the territory of the second question I initially gave - "what import path does a given network URL correspond to?" Here's why:
example.com/foo/bar.git == example.com/foo/bar
, then example.com/foo/bar.hg == example.com/foo/bar
...so, example.com/foo/bar.git == example.com/foo/bar.hg
?? Clearly not, so, how do we know when these equivalencies apply?https://example.com/foo/bar.git
whether it internally expects to be referenced with or without the .git
. All answers to that involve code scanning, and some involve import path rewriting. Also, lots of new combinatorial failure modes.Bottom line: we can't answer any of these classes of questions, or even know how to apply configuration that a user has explicitly given, until we've inspected the source. And THAT is itself version-specific...because what if the author of example.com/foo/bar
reads this discussion, and changes its internal root import path from including .git
, to not including it, or vice-versa? Now, no matter how you declare it, it's a mismatch for osme part of the history.
So, my thought: glide has already broken go
compatibility. Embrace it, but do so by adding regex-based matchers. Something like this:
- repo-patterns
- my-ghe
- pattern: '^(?P<importroot>(?P<root>github\.example\.com)/([A-Za-z0-9][-A-Za-z0-9]*[A-Za-z0-9]/[A-Za-z0-9_.\-]+))(?P<packages>(?:/[A-Za-z0-9_.\-]+)*)$'
Total pseudocode there, but the point is, named capture groups with specific, magic names, which we could then plug in to a generic matching algorithm. People can write support for their own hosting platforms into their glide.yaml
, or into a global config.
I disagree that enabling a package to point to a different repo location is breaking compatibility. I guess I need go get compatibility
defined. To me it still adheres to it because no code is being rewritten. I don't mind that @mattfarina and Glide are simply being cheeky about what they clone and where. To me that's far more preferable than rewriting import statements.
Maybe it is just silly semantics - we do need a definition. But import rewriting, while IMO worth contemplating, should not be conflated into this property. It's a wholly different kind of operation relying on a different class of information.
Some kind of declarative solution, whether it's the regex or something else, should be sufficient for this need, and render glide no more "broken" than it already is.
Hey, I'm all against import rewriting. I thought that's what you were suggesting. My point is if Glide didn't enable importing forks into primary repo locations, import rewriting would be required. Glide avoids this by virtue of what I mentioned. To me that doesn't break go get
compatibility specifically because it avoids import rewriting. To me Glide enhances a process without breaking it. If it rewrote imports, then, IMO, it would be breaking it.
sorry, i edited that too much, i meant:
But import rewriting, while IMO worth contemplating, should not be conflated into this property. It's a wholly different kind of operation relying on a different class of information.
to be in an agree-y tone, not a disagree-y tone :)
glide has already broken go get compatibility
I was drafting a response to point this out, but it seems @sdboyer beat me to it :)
In my opinion, there is not much merit in continuing to discuss whether Glide should break compatibility with go get
in a general sense. By any useful definition I can think of, Glide has already broken it (unless you commit the vendor directory). It might be more prudent to look at these issues on a case-by-case basis.
This may be obvious, but no one has mentioned it yet. Even without specifying custom remote locations, it is trivial to set up a project which works with Glide and does not work with go get
. Imagine you have a project that depends on version 1.0 of github.com/lib/foo. The latest version (on master) is 2.0 and is not compatible with version 1.0. Sure, go get
knows how to download github.com/lib/foo and put it in the correct location in your GOPATH
, but since it always pulls from master, the code could be broken and it may not even compile. On the other hand, if you install dependencies with glide install
it will work. Practically speaking, I don't think it matters whether or not go get
can download the dependencies from a remote location. What matters is whether your code actually works.
If you care about strict compatibility with go get
, there's always the option of committing the vendor directory to version control. As I understand it, that will work with go get
regardless of the remote locations or versions that Glide uses (that's not to say it doesn't come with its own issues). With the --strip-vcs
option and other ongoing work, it appears that Glide is making this easier.
I think we need to define compatibility so let me take a shot.
go get
has a certain (and evolving) structure it supports reading and finding.go get
pulls outside packages into your GOPATH
. Once there you can manually manage the checked out version, alternate remotes and the commits they have. In essence, go get
is a project retrieval tool that does not deal with versions. Because of this we can't do an apples to apples comparison between go get
and Glide.Glide package
paths and those used in imports
code statements need to be the same format that go get
requires.
The way to look at solving for this is from a user experience point of view. What experience will give the users the fewest WTF moments and why?
The way to look at solving for this is from a user experience point of view. What experience will give the users the fewest WTF moments and why?
I wholeheartedly agree.
Glide
package
paths and those used inimports
code statements need to be the same format thatgo get
requires.
I want to ask for some clarification.
It seems to me that the only advantage of adhering to this definition of compatibility is that you can run go get
on the project. go get
will successfully download the dependencies and put them in your GOPATH
. It doesn't mean that go get
will retrieve the same version or use the same remote location that glide install
does (because it has no concept of versions or custom remote locations) and consequently it doesn't guarantee that your code will work or even compile. As you mentioned, you could take some additional steps after running go get
in order to get the code to compile/work.
Would you agree with these statements? Are there any other advantages of adhering to this definition that I did not mention?
I did have a head scratcher because I assumed the example in the opening post should work but didn't. However in some cases it looks like it works, for example github.com/bradfitz/gomemcache/memcache
, this is because github.com
is recognised as a special case.
Because of that I think the example in the opening post should work in the same way.
Maybe it can be implemented as a command flag.
@mattfarina it's been more than a month since we've heard from you on this thread.
I don't see how the definition you have proposed is helpful for developers. I would love to hear some examples or learn more about your reasoning.
I already mentioned that by design Glide will install different versions of a repository compared to go get
. In practice this means a project that uses Glide may already break if you get the dependencies with go get
, because the code will expect the versions specified in glide.yaml
and go get
will always install from master. Sure, you can go in and manually check out the correct version for each dependency, but that does not seem practical. In addition, doing so is likely to break other packages which may depend on different versions of dependencies in your $GOPATH. Given the impracticality, and in some cases impossibility (if you have projects depending on different versions) of manually checking out the expected version, I would already consider this "broken" for all intents and purposes.
Furthermore, since Glide currently allows you to specify a different remote location via the vcs
and repo
fields, it is already possible to have go get
and Glide disagree on what remote repository should be installed. The project may technically be go-gettable, but go get
will install from a remote location that is completely different from what the project expects. Installing a dependency from the wrong remote location is strictly worse than not being able to install it at all. The former takes more steps to correct (you have to first remove the incorrectly installed dependency) and is potentially very confusing and hard to track down.
I have provided two reasons why your definition of compatibility with go get
does not prevent developer headaches or wtf moments. This is why I'm claiming that definition is not useful.
https://github.com/Masterminds/glide/issues/298, which partly sparked this discussion, describes a real-world annoyance that I and others have run into. Unless you can provide counterexamples, I maintain that fixing that issue will definitively not break compatibility with go get
any more than we already have.
As I have already mentioned, if you care about go get
compatibility, we do have an option for that. You can check in the vendor directory to version control. I'm not saying it's a perfect solution, but it will maintain complete compatibility with go get
, and I would argue it is actually the only reliable way to do so. As soon as you create a tool (such as Glide) which installs something other than what go get
installs, you have already broken compatibility with go get
by any useful definition I can think of. This is sometimes okay :) As engineers we need to understand that and weigh the pros and cons.
I would appreciate hearing your thoughts.
The definition I was referring to, in terms of scope, is with the paths used in import
statements and the ability to figure out where to get the source for the remote path. This relates to the code scanning.
For example, when github.com/example/foo/bar/baz
is encountered there are special rules used to figure out where to fetch from. Or, using the go-get
redirect we can see the golang.org/x/net
and figure out where the source is due to an intermediate redirection server implementing an API. Or, git.example.com/foo/bar.git/bax
can be discovered by reading the URL against rules.
Each of these works with go get
and Glide today for package paths.
The question is, should Glide instead allow you to map:
- package: foo/bar
repo: https://git.example.com/baz/foo.git
subpackages:
- bar
go get
will not be able to figure out where foo/bar
resides remotely when found in an import
statement. This breaks the mapping of package path to remote location mapping with go get
.
When it comes to the version of the source to use there is an intentional break.
No, it should not be allowed.
I would like to add a use case here. I have to use a VSTS git repo and I would like to use glide. The issue is that those repo don't use the .git extension (and in addition the url has a space in it...): https://mycompany.visualstudio.com/a%20name%20with%20space/_git/mypackage
Today I use glide like that:
import:
- package: mypackage
subpackages:
- mysubpackage
repo: https://mycompany.visualstudio.com/a%20name%20with%20space/_git/mypackage
vcs: git
ignore:
- mypackage/mysubpackage
it does not spit out any error and does seems to install & update the dependency but its far from ideal. I don't think I should have to manually ignore all the subpackage in my package.
@Cedric-Venet is foo.visualstudio.com
that a central hosting service? If so, IMO that's actually something that we should be able to resolve with a separate import pattern, sorta orthogonal to the discussion in this issue.
@sdboyer yes the pattern seems to be [acount].visualstudio.com/[project]/_git/[repo] so adding a pattern would be great, but I am not sure it would solve my use case as go did not seems to access import with space in it and using %20 neither so even if glide where to get the dependency correctly, I could not use them without having some way to alias it.
@mattfarina I understand pretty well what your stance is. I'm moreso asking why? Given the points I have brought up, how does adhering to this definition of compatibility help developers?
Among other things, I'm asking why should we allow this:
- package: github.com/someuser/differentrepo
repo: https://git.example.com/baz/foo.git
subpackages:
- bar
And not this?
- package: foo/bar
repo: https://git.example.com/baz/foo.git
subpackages:
- bar
In
glide.yaml
files thepackage
needs to be compatible with names thatgo get
can fetch. This is true even for cases where arepo
is specified to get the package from. This helps us to maintain compatibility withgo
tools including the names referenced in the source.But, on multiple occasions people have wanted to specify something like:
And then reference
example.com/foo/bar/baz
in theirimport
statements.This will break compatibility with
go get
. This is the case you can'tgo get example.com/foo/bar/baz
.Instead you'd need to
go get example.com/foo/bar.git/baz
but that also means your import statements need to reflect that path.So, should Glide break with
go
or stay compatible? Why? What implications concern you?