mtibben / gogpm

Barebones dependency manager for Go.
BSD 3-Clause "New" or "Revised" License

`gogpm install` should be faster when no action is required #5

Open mtibben opened 10 years ago

mtibben commented 10 years ago

gogpm install is slow because it needs to discover the VCS of each listed package by making HTTP requests.

gogpm currently uses the VCS discovery logic from go get, which leads to this issue.

But could we determine the VCS without needing to touch the network?

gpm "guesses" the VCS by looking for .git or .hg directories. But this is not technically correct because, for example, a git repo might actually contain a .hg directory. Of course that is an edge case, but it points to the fact that guessing could lead to unintended side effects.

I wonder if we should embed the VCS metadata in the Godeps file. For example something like:

bitbucket.org/liamstask/goose           hg:bb8240b815dff5719b11b1ef10b8f4430d3d6a05
code.google.com/p/go.text               hg:e4132d00dbd1
github.com/abh/geoip                    git:da130741c8ed2052f5f455d56e552f2e997e1ce9

mtibben commented 10 years ago

Hey @pote, do you have any thoughts on the right approach for this?

pote commented 10 years ago

Well I'm obviously biased towards a solution, so take my opinion with a grain of salt :)

The way I see it we have 3 options:

1) Add metadata to the Godeps file
2) Make HTTP requests to discover the type of VCS
3) Assume that .git/.hg/.bzr/.svn directories imply a type of VCS and work based on that.

Option 1 adds a small degree of complexity to the Godeps file. One of the design considerations of gpm is that the Godeps file should be something any other competing tool could adopt or leverage, and the simpler the file format is, the bigger the chance of that happening. Interoperability is a good thing, so I probably wouldn't go with this option. If anything I'd try to discover the VCS from the import path URLs, maybe.

Option 2 adds a small degree of complexity to the tool as well as taking a performance toll due to the HTTP requests involved. While it's true that technically that is a more trustworthy way to determine the VCS, there is no way around the performance bottleneck of hitting the network.

In my experience option 3 has worked well for the entirety of reported use cases - to my knowledge no user has ever hit the edge case. In the unlikely event that someone does, my hope is that the code is small and simple enough to yield a quick answer to any weird side effects that present themselves.

In summary: I think options 1 and 2 are basically a tradeoff that affects 99% of the user base in order to accommodate the 1% that is very, very misguided about VCSs and what kind of hidden directories they put in their repos, which is the reason option 3 is taken in gpm.

Ok, I'll stop ranting, but hope you find it useful. :)

mtibben commented 10 years ago

Hey thanks for commenting @pote!

Option 1...If anything I'd try to discover the VCS from the import path URLs, maybe.

Yes, that's how go get works, and I've ripped the code from go get to make this tool 100% compatible. The discovery works well for github.com import paths, because the VCS always resolves to git. But other import paths require an HTTP request to discover the VCS, which leads to a small slowdown due to the HTTP overhead. (This slowdown is very small, but as I'm using this tool as part of a build script, it actually becomes noticeable.)
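To make the distinction concrete, here is a rough sketch of the static part of that discovery, assuming a simple prefix table. `knownHosts` and `staticVCS` are hypothetical names, and the table holds only the one host this thread confirms resolves without a request; it is not the table go get actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// knownHosts lists import path prefixes whose VCS is fixed, so no HTTP
// request is needed. Illustrative only: it contains just github.com.
var knownHosts = map[string]string{
	"github.com/": "git",
}

// staticVCS returns the VCS for importPath if it can be determined from
// the path alone. ok is false when network discovery would be needed.
func staticVCS(importPath string) (vcs string, ok bool) {
	for prefix, v := range knownHosts {
		if strings.HasPrefix(importPath, prefix) {
			return v, true
		}
	}
	return "", false
}

func main() {
	paths := []string{
		"github.com/abh/geoip",
		"bitbucket.org/liamstask/goose",
		"code.google.com/p/go.text",
	}
	for _, p := range paths {
		if vcs, ok := staticVCS(p); ok {
			fmt.Printf("%s: %s (no request needed)\n", p, vcs)
		} else {
			fmt.Printf("%s: needs network discovery\n", p)
		}
	}
}
```

Every path that falls through to the second branch costs a round trip, which is where the slowdown in a build script comes from.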

In my experience option 3 has worked well for the entirety of reported use cases

I understand the pragmatism of this approach, but I want to build this tool with 100% compatibility with go get, which rules it out.

Because we already have the VCS information when doing the bootstrap, to me it still makes sense to add this VCS metadata to the Godeps file... I would just like to do this in the least complex way possible - I love the simplicity of Godeps and am loath to introduce any complexity.

mtibben commented 10 years ago

What do you think of the git:da130741c8ed2052f5f455d56e552f2e997e1ce9 notation? I wonder if there is any standard for notating a VCS revision.
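For illustration, a sketch of how a Godeps line in that notation could be parsed, assuming whitespace-separated fields and an optional "vcs:" prefix on the revision. The `dep` type and `parseDep` function are hypothetical and do not come from gogpm.

```go
package main

import (
	"fmt"
	"strings"
)

// dep is one parsed Godeps entry.
type dep struct {
	ImportPath string
	VCS        string // empty when the revision has no "vcs:" prefix
	Revision   string
}

// parseDep parses a line of the form "<import path> [<vcs>:]<revision>".
func parseDep(line string) (dep, error) {
	fields := strings.Fields(line)
	if len(fields) != 2 {
		return dep{}, fmt.Errorf("expected 2 fields, got %d in %q", len(fields), line)
	}
	d := dep{ImportPath: fields[0], Revision: fields[1]}
	// Split on the first colon; lines without one keep an empty VCS.
	if vcs, rev, found := strings.Cut(fields[1], ":"); found {
		d.VCS, d.Revision = vcs, rev
	}
	return d, nil
}

func main() {
	d, err := parseDep("code.google.com/p/go.text               hg:e4132d00dbd1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("path=%s vcs=%s rev=%s\n", d.ImportPath, d.VCS, d.Revision)
}
```

Because the prefix is optional in this sketch, a line without it still parses, and the tool could fall back to its usual discovery for those entries.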

mtibben commented 10 years ago

Making install concurrent really speeds things up https://github.com/mtibben/gogpm/pull/10