Closed pib closed 8 years ago
Thanks for the PR!!! I'm sorry I'm being slow on this, but work is kicking my ass this week. I'll try to dive in either today or tomorrow though.
So I'm playing with the goat gen
command and I'm a bit confused how it's supposed to work. I basically copied the setup I have in the tutorial but never made the .go.yaml
file. When I used goat gen
it gave me the error:
Couldn't find github.com/mediocregopher/goatproject/foo in any of [/opt/go/src/pkg /home/mediocregopher/src]
Is it necessary that I already have the library installed on my box for the gen
command to add it as a dependency? If so, can that not be the case? I want to avoid having to have the user do anything involving global dependencies when using goat. It should all be completely contained inside the goat project.
The reason it looks for the package is because it looks for git or hg directories in those projects to add the specific version that is installed. I suppose it could leave the current version out if it can't find it, but it also wouldn't be able to identify multiple packages from the same repo in that case (it does that now and only adds one entry to the .yaml file).
"goat get" of the desired package will install it non locally... On Jan 14, 2014 10:28 PM, "Brian Picciano" notifications@github.com wrote:
So I'm playing with the goat gen command and I'm a bit confused how it's supposed to work. I basically copied the setup I have in the tutorialhttp://docs/tut.mdbut never made the .go.yaml file. When I used goat gen it gave me the error:
Couldn't find github.com/mediocregopher/goatproject/foo in any of [/opt/go/src/pkg /home/mediocregopher/src]
Is it necessary that I already have the library installed on my box for the gen command to add it as a dependency? If so, can that not be the case? I want to avoid having to have the user do anything involving global dependencies when using goat. It should all be completely contained inside the goat project.
— Reply to this email directly or view it on GitHubhttps://github.com/mediocregopher/goat/pull/19#issuecomment-32333949 .
I mean "goat get" will install it non-globally..
The reason it isn't able to right now is because it depends on being able to import packages (via "go/build".Import) to be able to detect which packages are nested within the same directory. Without the packages being installed, it can't really tell if they are nested under the same repository.
Perhaps a better option would be to have it try to "goat get" any dependencies that are missing? It would need to have the package path set first so it doesn't try to fetch any sub-packages which are imported.
In my previous comment I didn't really understand the error I was seeing. I have a project with the path github.com/mediocregopher/goattest
, with the .go.yaml
file. Inside that there is a sub-package, github.com/mediocregopher/goattest/foo
(this is the setup I use in the tutorial). I think the gen
command is trying to generate an entry in .go.yaml
for that sub-package when it shouldn't.
Oh! Of course. I had been holding off on sending this pull request and I had forgotten why.. whoops.
There needs to be some solution for the initial setting of the "path" setting in the .go.yaml file, to avoid issues such as that.
Perhaps "goat gen" could be a more interactive command, prompting the user for answers to such things as "what is the import path of your package" and "do you want me to automatically install imported packages into this goat environment so the proper list of top-level packages can be determined?"
What do you think? It seems to me that it would useful as a one-time interactive start, then "goat gen -update" and "goat freeze" could be more automatic once the initial setup has been done. Of course there could also be command-line flags to specify the path and that it should or shouldn't auto-fetch packages.
I think the only question which needs asking is "what is the import path of your package", because with that you can determine sub-packages and ignore them (as well as fill in the path field in .go.yaml). I'm not for sure, but I don't think it's necessary to actually download the packages in the gen
step for any technical reason, and I'd rather keep the functionality between gen
and deps
non-overlapping.
Also, is it actually necessary to have an -update
flag for gen
? I guess it would prevent someone from overwriting their previous .go.yaml if they didn't know what they were doing, but that's what version control is for, and the vast majority of the time you're going to want to replace the existing one.
Is this PR still alive @pib? It resolves a serious gap in goat functionality for large dependency graphs (namely, that any dependencies using the git/hg type have to have all their dependencies specified manually, which expands the size of .go.yaml
by an order of magnitude). I'd love to see this get integrated or, failing that, take over it myself.
For whatever my opinion is worth, here's my thoughts on how goat gen
should behave...
Scenario 1: if .go.yaml
already exists, goat gen
should expand it without modifying any of the already-specified dependencies. I think the following behavior is sane and least surprise-inducing:
.go.yaml
, add it with the default type of goget
.type: git
or type: hg
package in .go.yaml
that is not currently cloned to .goat/deps
, it should be cloned now, before proceeding to the next bullet point. I don't really like this behavior (ideally, goat gen
should only ever modify .go.yaml
- maybe have a flag to toggle that strictness), but it's the only way to resolve that package's dependencies. And since you already set the type of the package, I think it's reasonable to assume that you have also set a suitable revision, and you are fine with whatever will be cloned.type: git
or type: hg
package in .go.yaml
that is cloned to .goat/deps
, inspect its source code for dependencies, and add any missing ones to .go.yaml
with type: goget
. Same behavior as the first bullet point, except instead of inspecting your own package, you're inspecting a local version-controlled dependency..go.yaml
that have type: goget
. go get will fill in the entire dependency subtree when it is invoked, so there is no need to do any further modification here.What this means is that I can create a blank project with no source code, scribble out my top level dependencies and project path, and then invoke goat gen
. It fills in all the second-tier dependencies for me automatically, and they will all be using type: goget
. I can then inspect the newly added dependencies to decide which ones need to be fixed (change the type to git/hg) and which ones are safe to leave with go get
. I change the types for those packages and invoke goat gen
again. That adds another group of goget
dependencies for the next tier. I can keep doing this until I have specified only the subset of my dependency tree that matters to me. goat gen
would ensure that any other subtrees are being fully resolved by go get
. And since it does not modify any already-specified deps, I can invoke it iteratively over and over until I'm satisfied.
Scenario 2: If .go.yaml
does not exist, you have to specify the path of your current repo on the command line, otherwise goat gen
fails with a suitable error. Eg goat gen github.com/foo/bar
. It then scans the current package for any go source code and populates the top level imports of that source code in .go,yaml
, using the default type: goget
. This gives me a basic .go.yaml
. I can then apply the same steps I took in the first scenario to refine the dependency tree as needed.
Side note: if you specify goat gen path/to/repo
when your .go.yaml
already contains a path, what happens? The behavior here would need to be defined. I say clobber the existing .go.yaml
path. I find it difficult to believe you would invoke goat gen
with the wrong path unless you intended to change it.
I agree with @mediocregopher that I don't see a use case for a --update
flag. What would this do? A bundler update
-esque command doesn't have meaning for Go since we don't have semantic versioning constraints like Ruby does. goat uses .go.yaml
declaratively, so if you want to move a dependency forward, you should only have to edit your .go.yaml
and then run goat deps
again. If you've manually tweaked your dependencies and want to reset them to the originally specified versions in .go.yaml
, it should, likewise, be enough to just run goat deps
which would force the .goat/deps
directory back into the specified state.
No opinion on the freeze command, I'm not entirely clear on what it does... is this similar to, say, godep save
? Assuming it is, I think you would use this command if your project was using a global GOPATH and you wanted to transition to goat. Not a useless command, but the functionality is, in my opinion, a separate discussion from goat gen
. goat gen
should be responsible for adding missing entries to .go.yaml
so that it specifies a complete dependency tree. goat freeze
(or goat save
or what have you) is responsible for modifying the specified revisions of existing dependencies. They might have similar code under the hood, but I think they should be taken up independently.
A few other thoughts on goat freeze
that came to me just now, while proofreading my comment:
goat gen
(to populate .go.yaml
with dependency entries) and then goat freeze
right afterwards (to migrate the global, manually set revisions into your new .go.yaml
. I'm personally not convinced that goat freeze
should just do this whole charade for you... suppose a .go.yaml
does not exist and you invoke goat freeze
. Does it just use VCS types for every single package in your dependency tree, and set the revision to whatever you have checked out globally? (Eh, well that's not a terrible behavior, but I'm not sure if it should be the default. Personally I'd think that if goat freeze
is run on a project that doesn't contain .go.yaml
, it should exit with an error.)goat freeze
know which deps to freeze? I'm skimming the code, looks like it only freezes the ones that use type: git
and type: hg
. That's a good default in my opinion, but it means that goat freeze
does not follow up with the default behavior I described for goat gen
in an empty repo. As far as I'm concerned, this is just another reason that freeze and gen should be kept strictly separated CLI-wise.freeze
anyway? I never really understood this name. The only place I've seen it before is pip (not a pythonista, so I don't really know what it means in that context either). I will admit that I can't come up with a less ambiguous name at the moment. For example, godep has three delightfully ambiguous command names (save/update/restore), which don't really tell you anything about the commands unless you read the docs. I guess an unintuitive name is not the end of the world.I agree we should leave the freeze
discussion for another time, and just focus on populating .go.yaml
with missing dependencies.
@tummychow I like your plan for going forward. I have two reservations (neither of which I have solutions for, so I'm not really being helpful here :P)
gen
. To me that stands for "generate" which would imply creation from nothing, when that's only 50% of what it does. Maybe populate
? That may be too long though. Would be interested in if you or @pib had any other ideas on that.gen
and deps
having overlapping functionality. I don't see anyway out of it, since we do have to fetch dependencies to find out if they have any dependencies. It's also a bit weird that only git
/hg
deps would be actually fetched (if I read the proposal correctly). We might want to implicitly call deps
at the end and tell the user we're doing it so they're completely good to go once everything is said and done.Apart from that, I like the idea, although I won't have time to implement it myself for a while (still moving in to my new place, work, bleh). If you @tummychow want to take a stab at it be my guest. I have one implementation detail I'd like to go over if you do want to do it though:
While you broke the behavior up into two different scenarios, I'd like the code to basically only actually have one scenario. At the start of the call there's going to be a "state" variable of some sort, which contains all dependencies discovered and accounted for thus-far (initially empty). As you go through the code and its dependencies you'll be going back to this variable to check if the dependency has been found, and if not add it to the "state". At the end of everything this "state" variable will be used to generate the whole .go.yaml
file, which will clobber the old one. If there is a .go.yaml
file to start with, this would be used as the initial "state" variable, instead of an empty list. So the code-path would look like:
.go.yaml
already exists, use that to generate "state" and continue with thatgit
/hg
dependencies, deal with thosegoget
.go.yaml
file, overwriting the old one if it was there.deps
, to get the goget
ones if we didn't need to previouslyLet me know if that didn't make any sense, I'm not the best at explaining <_<
Sorry to be dictatorial about this, and I'm definitely open to suggestions/improvements, I just didn't want to end up with two different codepaths for this command; that would make things more complex both in code and for the user using the command.
Name's not important to me, I'm open to anything that makes a reasonable amount of sense. We can get around the issue of long names via prefix matching, so goat pop
, po
, popul
, etc would be the same as goat populate
.
And I agree there would really be only one codepath. From the user standpoint there are two different use cases, but internally, "no .go.yaml" is really just a special case for ".go.yaml that has no deps in it". We're on the same page there.
As for deps/gen overlap, I think you're right. You understood my suggestion correctly but yours is probably better. My idea was "fetch as few deps as possible to resolve a complete tree", but then you have an inconsistency, where your .go.yaml
would be represent a complete dependency tree, but your filesystem is actually incomplete and only contains some of the deps. That's kind of confusing... Calling deps at the end to fill in any gaps is a more sensible default because it ensures that your .go.yaml
actually reflects not only your code, but also the packages you have locally on disk. So yeah, I agree with you. gen should call deps at the end, maybe with flag to disable for the nitpickers.
I've got some end-of-term wrap up before my next semester starts but I'll see if I can get some code on the table for this command. @pib if you want to revive this PR, just let me know so we don't duplicate work.
Oh hey, I've had the email notifications for this thread sitting in my inbox for a couple of days now, and I just haven't had a chance to even read through the comments. The Go project I was working on when I first submitted this pull request has been done for a while and I haven't had a chance to do much Go until recently at a new job.
I definitely think this is a feature goat should have, especially as a tool for getting people who haven't been using it up and running quickly.
I'll start from the oldest comment and work my way to the most recent, starting with @mediocregopher's last comment from January :P
The main reason for downloading the packages in the gen step is so that those packages' dependencies can also be determined if they aren't using goat. I can't remember the original reason I implemented the -update flag, but I'm assuming I ran into a use-case that made me think it was a good idea :)
@tummychow Looking at scenario 1, it sounds fairly reasonable. The one thing I think I disagree with is that it's ever really safe to leave a dependency as type goget
, since that's just asking for something to break in the future. Also, I don't really like the idea of having to run goat gen
multiple times just to get the full list of dependencies. My previous suggestion of having goat gen
actually call goat get
on missing dependencies would solve that issue and allow it to gather all the dependencies and nested dependencies in a single go. Then it can also detect what type of dependency it actually is (i.e. git
or hg
), and save it as that type. I really think that using a type of goget
should be discouraged since it's basically counter to the entire point of goat in the first place.
As for scenario 2, I really like the idea of just passing in the path of the current package to goat gen
. It's a nice elegant solution to that. I also agree that having it overwrite the package path if you specify a different one sounds reasonable.
As for the --update
flag, yeah, all that did was prevent you from running goat gen
again by accident and overwriting your .go.yaml
file, but I'm not sure that's really needed anyway.
The idea behind the freeze
command is that you might know you want to install a package, but not know exactly what version of that package you need, so you can add it to your .go.yaml
file without specifying a version, and then you run goat deps
and then once you've verified that the current version works with your code, you can run goat freeze
and it will update the .go.yaml
file with the current version, "freezing" it in place so that a later change to that project doesn't break your code when you run goat deps
on a fresh checkout. This is also the reason I'm so against having goget
type dependencies be the default, since that will result in similar situations.
The name freeze
just came from pip freeze
which gives you a list of the current versions of all the currently-installed packages.
Also, yes, the goat freeze
command could come in handy when you are converting a non-goat project to a goat project, allowing you to take your whole set of currently known-good versions of dependencies and save them into your .go.yaml
so other checkouts of the same code will be sure to get the exact same versions you currently have.
Also also, the reason that freeze only works for type: git
and type: hg
is that those are the only versioned types currently supported by goat. I don't think I've ever seen many (or perhaps any) Go packages that weren't either in git or hg, so once you've checked a package out, even via goat get
, you will be able to detect the type of the package and goat freeze
will work as expected.
@mediocregopher I agree that gen
isn't the best choice. What about something like fill
, since that doesn't imply that it is just initializing or just updating.
I guess the one thing I left out is the case when a dep is neither a git
nor an hg
repo, in which case, a goget
dep would have to be used.
So, with that in mind, I like the proposal, with two proposed modifications:
goat gen/fill/whatever
command, if a dependency is not found on the path, a goat get
of that dependency is done.goat get
if needed) its type is detected and it is stored with the appropriate type and type-specific URL (I already implemented that in my pull request).I do like the idea of freeze being a separate step from all this, but perhaps adding it as a flag would make sense instead. Something like goat gen --add-versions
, and it would only add versions to those dependencies that don't already have one in .go.yaml
. Again, this can be held off for a separate proposal if that makes more sense.
I started typing this before you added that extra comment, but personally I'm not concerned about svn or bzr repos. For whatever it's worth, most of the competition only supports hg/git as well, and sometimes one of svn or bzr. There are a few managers (godep, gvp/gpm) that download entirely via go get so they have support for all four. But, as you said, the number of svn/bzr users pales in comparison to git or hg. For now we can write it off to a future feature.
My problem with resolving every package to type: git
or type: hg
(from here on out I'll just say git for brevity) is that you are now responsible for the version of every single package in your dependency tree, not just your immediate dependencies. For a large dep tree, that results in a massive .go.yaml
. It's hard to tell where to begin when faced with a tree that has too many nodes in it, especially if I'm trying to isolate separate subtrees of my dependency graph.
The problem is intractable right now because Go has no de facto metadata for transitive dependencies, so if we want a clearly defined dependency graph then the only way to get it is by listing every single version of every single node. But I don't think all dependencies are equally demanding of attention. If I'm going to be in dependency hell anyway, then let me focus my efforts on the packages that are most fragile. Some things are like rails and they'll explode on every version bump, but other things are like go-yaml and they'll probably be backwards compatible forever. I can use type: goget
for the dependencies that are more stable or less active.
For me, that is the heart of the issue. If I have a type: goget
dependency and it's too loose for me, I can always change it to type: git
and fix its version, then add each of its dependencies to the dep list as well. But if I have a lot of type: git
dependencies, it's harder to go in the other direction: collapsing a group of type: git
deps into a single type: goget
subtree will require a lot of trial and error. I'd rather collapse (almost) fully by default and let the user expand, than expand fully by default and let the user collapse. I think the former direction is easier for the user than the latter.
Finally, I think the real solution to your concerns would be an implementation of #24, which would eliminate the looseness of type: goget
. That would provide a balance between type: goget
as it is right now (resolves the entire dep tree, but have fun keeping track of which version you're using) and type: git
or type: hg
(precise version fixing, but now you have to find the dep tree yourself). If type: goget
supported ref, then you could have a fixed version of the top-level dep and still get (unversioned) resolution for the subtree. Until then, I'm prepared to compromise... either default is fine with me as long as we're open to a flag to switch to the other behavior.
Or we can take the third option - block on #24 and implement this as a followup, so that we can integrate the behavior of #24 from the ground up. But if you're waiting for me to implement #24 myself, I might be a while...
I'm all for this idea, but I am against it overwriting the .go.yaml file. That should contain only the dependencies a program needs, put there by a human being. If goat overwrites it, adding all resolved dependencies, it's going to get computerized --order of entries will change, comments will be lost, etc. So instead it needs to generate a separate file. And seeing that there is no reason to be cluttering up a projects root directory --and goat is arguably already taking up one more path than it needs by having both .go.yaml
and .goat/
there-- this separate file can go in the .goat/ directory.
@tummychow that is a good point about generally stable packages, but I'm still worried about the case when something changes and it takes you a while to track down where it broke. I left a comment on #24 that could partially cover the issue of only depending on a sub-package and not wanting all the dependencies in the entire tree of packages in that repo.
@trans I get what you mean, and I agree that it is a bit overkill to have to specify the full dependency tree when really all you want is to specify the dependencies of your project.
This is the reason for bundler's Gemfile.lock and Berkshelf's Berksfile.lock, which is locking the dependencies and all sub-dependencies to specific known-good versions without filling the user-specified Gemfile or Berksfile with things they don't directly specify. That was the general idea behind goat freeze
, but I wasn't familiar with bundler or berkshelf at that point, so I hadn't seen the idea of keeping them in a separate file that is also read by the dependency management system.
If a freeze/lock file went into the .goat
directory, would it also make sense to put the .go.yaml
file in there? The idea would be that you check in the freeze file as well, otherwise it would be pointless to generate it in the first place, so that would mean that .goat
would have to be removed from people's .gitignore
files (and .goat/deps
added instead?).
@trans you might find [goop][https://github.com/nitrous-io/goop) to be a better fit for your approach. Personally I prefer having a complete view in .go.yaml
. If I don't want to specify dependencies-of-dependencies, then I have to either depend on #24 or hope for more people to adopt a version metadata format along the lines of .go.yaml
.
Man, I need to keep up with this thread better :P
fill
. Unless there's a better suggestion let's stick with that (although we can keep saying gen
if that makes the discussion easier)goget
unless the user decides otherwise. While it would be cleaner to determine that actual type and lock the version in all cases, this does end up creating a gigantic and unmaintanable .go.yaml
file, which some people may want/need, but I'm pretty sure most are just trying to lock down one or two deps and say to-hell-with-it for the rest.freeze
comments I think I understand a bit better why it might be useful. If all the dependencies are specified as goget
, but someone wants to lock all the versions, essentially turning them into git
types. So while I agree it's useful, I don't think it warrants a whole new command (which isn't to say we can't roll it in with fill
), since like I mentioned I think it's a minority case that someone wants to lock every dependency..go.yaml
is the user facing side of goat, and .goat
is the magic black box that they never have to touch. Any changes make things more complex and cluttered, as well as probably not being backwards compatible. We should avoid modifying that setup except as an extremely last resort. I'd like to avoid adding new .go.yaml
type files to the root of the project too, if we can..go.yaml
file is less than ideal, primarily (imo) because it gets rid of comments, which are definitely useful. I'm not as concerned about any ordering the user might have put the deps in that might get overwritten, but maybe others care more. I don't really have a good solution for this, and it's probably the primary problem we need to come up with a solution for. I don't suppose there's any yaml libraries that magically preserve comments?So, after all that, here's my thinking: the fundamental problem/disagreement seems to be of whether to goget
deps that we haven't specified and let fate take them as it will, or to lock them down so we can wrangle them in, or something in between. My thinking is that we make this an interactive command, where each step could be pre-answered using command-line flags. So the process, from the user's perspective would be something like:
path
, if it wasn't provided on the cligoget
or locked to ref .go.yaml
to stdout, unless the cli flag to auto-write it is given. I'm not totally sold on this step, because if the user decides to bail on the one they generated then they are left with a .goat/deps
full of dependencies they maybe don't actually want. So maybe fill
should pull dependencies into a different directory in .goat
? Again, not sold.I think that covers everyone's use-cases. I like the idea of the interactive prompt (as long as it can be bypassed with cli flags if needed) because:
.go.yaml
, and should really really avoid it for the cli, so this is valuable.So let me know what you guys think, but that's the best solution I have so-far.
Still think freeze should be deferred to a future discussion. This one is clearly contentious enough as it is... No further comment on the subject from me.
I've got no interest in the interactive CLI part of it, but as long as I can override it, it won't get in my way. I'd personally prefer to tie interactive to a flag and have the non-interactive side be the default behavior, but there's no point getting divisive over minor CLI arguments. It's all the same to me.
As for the issue of not overwriting .go.yaml
, I think it's obvious that every time we try to avoid overwriting the existing contents, the problem gets significantly uglier. Install deps into a different directory temporarily unless the user agrees to overwrite their file? Append the new deps to the end of .go.yaml
via a custom YAML generator? Write a comments-aware YAML parser? All these "solutions" are functionally incorrect, expensive to maintain, hard to justify, and introduce all kinds of dumb edge cases. Right now goat is simple enough that I can get over its lack of tests... introduce any of these behaviors and that will rapidly go out the window.
Do the simplest thing that can possibly work: overwrite the file every time. The command is supposed to modify .go.yaml
programmatically; I don't see how you could expect comments to be preserved when the thing is going to be computer generated.
And if we're going to go with the interactive-by-default approach, we can just jam a warning into the start of the command to let you know that your existing .go.yaml
will be rewritten. That way nobody should be surprised.
The command is supposed to modify .go.yaml programmatically
Not really. The command is supposed to generate .go.yaml
programatically. What it does with that generated file is yet to be determined. But I'd like to keep behavior consistent with the other go tools, like fmt and vet, which don't write anything to disk by default, just dump to stdout (with the option to write on the cli).
Goat's lack of tests is definitely something which needs to be addressed, probably before this is.
That being said, after talking with a friend of mine who also uses goat regularly, he pointed out that we'd have to put deps that are being pulled in by fill
in a temporary directory regardless of whether .go.yaml
gets written in the end, since the user could potentially ctrl-C in the middle, leaving their environment "dirty". So with that in mind, I think my idea to ask the user whether to dump to stdout or write would be the way to go, since we have to do the temporary directory thing no matter what.
These changes make it easier for folks to start using goat and to update dependencies without having to hand-edit their .go.yaml file.