golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: cmd/go: partial vendoring #52604

Open johnwmstevens opened 2 years ago

johnwmstevens commented 2 years ago

The go mod vendor capability is great for insulating a working program from changes made in imported modules.

It would benefit from having the ability to download only those modules that are imported from external entities.

Within a single entity, every project that uses a popular module and the vendor capability will store a copy of the same code. With multiple projects referencing this popular module, multiple copies will be stored in the owning entity's source code repository.

Modifying the vendoring capability to download only modules owned by external entities would be a useful change. Reusing the GOPRIVATE env var might make sense, but if overloading the semantics of this variable is problematic, a new variable could be added, or a switch could be added to the command.
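To make the idea concrete, here is a minimal sketch of how module paths might be split into "owned" and "external" sets using a comma-separated list of glob patterns in the style of GOPRIVATE. The function names (`matchesOwned`, `partition`) and the example module paths are hypothetical; this is an illustration only, not how cmd/go is implemented.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesOwned reports whether modPath falls under any of the comma-separated
// glob patterns in owned. Each pattern is compared against the same number of
// leading path elements of modPath, in the style of GOPRIVATE patterns.
// (Hypothetical helper, for illustration only.)
func matchesOwned(owned, modPath string) bool {
	elems := strings.Split(modPath, "/")
	for _, glob := range strings.Split(owned, ",") {
		glob = strings.TrimSpace(glob)
		if glob == "" {
			continue
		}
		n := strings.Count(glob, "/") + 1 // number of path elements in the pattern
		if len(elems) < n {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, err := path.Match(glob, prefix); err == nil && ok {
			return true
		}
	}
	return false
}

// partition splits modules into those that would be copied into vendor/
// (external) and those that would be left out of it (ownedMods).
func partition(owned string, modules []string) (external, ownedMods []string) {
	for _, m := range modules {
		if matchesOwned(owned, m) {
			ownedMods = append(ownedMods, m)
		} else {
			external = append(external, m)
		}
	}
	return external, ownedMods
}

func main() {
	// Example module paths; all of them are made up for this sketch.
	mods := []string{
		"git.myorg.example/platform/logging",
		"github.com/pkg/errors",
		"golang.org/x/text",
	}
	external, ownedMods := partition("git.myorg.example,*.corp.example", mods)
	fmt.Println("would vendor:", external)
	fmt.Println("would skip:  ", ownedMods)
}
```

Whether the pattern list selects the modules to vendor or the modules to leave out is a policy choice; swapping the two return values gives the opposite behavior.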

seankhliao commented 2 years ago

previously #30240

ianlancetaylor commented 2 years ago

CC @bcmills @matloob

rsc commented 2 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

bcmills commented 2 years ago

I think partial vendoring is a fine concept, but I wouldn't want to key it on GOPRIVATE: some folks will want to vendor only the private dependencies, while others will want to vendor everything but those.

However, there is one added wrinkle: the vendor directory does not include go.mod files. It was hit-or-miss until CL 315410 (#42970), and now it's completely consistent but in the direction of not including them. So we wouldn't be able to resolve the full module graph in the presence of partially-vendored dependencies (because we wouldn't have access to the requirements of those dependencies unless ).

...but that's probably ok, given lazy module loading! We would at least still be able to load the dependencies from modules whose versions are listed explicitly in the go.mod file, which should cover all of the packages that would have been available with a full vendor tree anyway.

johnwmstevens commented 2 years ago

I'm not fixated on any particular solution; GOPRIVATE was just an off-the-top-of-my-head suggestion.

My primary goal is to be able to use the vendor capability in an organizational entity (MyOrg) to draw a line between modules "owned" by other entities (IOW: vendors) and those owned by MyOrg.

The aim is to protect the viability of a critical Go program that depends on code owned by "someone else", where that code may simply up and disappear.

The secondary benefit is being able to avoid multiple copies of MyOrg modules in MyOrg's SCRaM system, which should make rolling out critical bug fixes in heavily reused MyOrg modules safer and easier.

flibustenet commented 2 years ago

Vendoring only private modules would be fine. On some PaaS platforms (Cloud Run, for example) it's difficult to include private modules, so it could be fine to vendor only those.

hherman1 commented 2 years ago

Is the point of this to save disk space? Why is it useful?

flibustenet commented 2 years ago

@hherman1 For my use case, I can commit my very small private module, but I would not like to pollute my code history with all the other dependencies.

johnwmstevens commented 2 years ago

> Is the point of this to save disk space? Why is it useful?

Copying code owned by entity B protects entity A from being unable to compile their Go program because entity B deleted their repository.

Copying one's own code into multiple projects does indeed waste disk space, but it also makes search, tracking, management and maintenance more complicated.

Programs owned by A that reuse a module also owned by A become more difficult to maintain by requiring a manual step to update the copy of the reused module in each project.

Assume, say, fifty programs owned by A that all reuse module X, which is also owned by A. Fifty copies of that module exist in the SCRaM system, and there you have your wasted disk space.

Also, updating each program when a bug in module X is fixed is easy if all fifty projects refer to the source repository of module X. It is more difficult when the new module X code must be manually copied into each of the fifty projects.

Certain kinds of search in that SCRaM system will also return fifty results, not just one, because of the multiple copies.

The ability to distinguish between owned and not owned code when making a "vendor" copy also assists in management of other entity bounded operations, such as license change, tracking and management.

hherman1 commented 2 years ago

If I understand correctly the manual step of updating censored code doesn’t seem so different from the manual step of incrementing the dependency version in go.mod, so it seems like you would still have the problem you’re describing even with partial vendoring.

rsc commented 2 years ago

Talked to @bcmills and @matloob. Probably this should be done with a design like #30240, not tied to GOPRIVATE. And probably it should answer what vendoring means in workspace mode too. But we may not have time to work out a good design right now.

johnwmstevens commented 2 years ago

I'm not sure what you mean by censored code (versioned code?), but in our production environment it is relatively easy to update the module version of self-owned modules across the whole repository.

That said, is there a way to specify a version in go.mod such that, say, 1.2.* will match any version that starts with 1.2?
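For reference, a require line in go.mod names one exact version rather than a pattern, although version queries on the command line, such as `go get example.com/mod@v1.2`, do accept a version prefix. The selection that a pattern like 1.2.* implies can be sketched as below. The helper names (`parse`, `highestWithPrefix`) are hypothetical, pre-release and build suffixes are ignored, and this is not how cmd/go resolves versions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits "vMAJOR.MINOR.PATCH" into its three numbers. Pre-release and
// build metadata suffixes are not handled in this sketch.
func parse(v string) (nums [3]int, ok bool) {
	if !strings.HasPrefix(v, "v") {
		return nums, false
	}
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return nums, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return nums, false
		}
		nums[i] = n
	}
	return nums, true
}

// highestWithPrefix returns the version in versions with the largest patch
// number among those whose major.minor equals prefix (for example "v1.2").
// It returns "" if nothing matches.
func highestWithPrefix(prefix string, versions []string) string {
	want, ok := parse(prefix + ".0")
	if !ok {
		return ""
	}
	best, bestPatch := "", -1
	for _, v := range versions {
		got, ok := parse(v)
		if !ok || got[0] != want[0] || got[1] != want[1] {
			continue
		}
		if got[2] > bestPatch {
			best, bestPatch = v, got[2]
		}
	}
	return best
}

func main() {
	// Made-up list of tagged versions for a module.
	tags := []string{"v1.1.9", "v1.2.0", "v1.2.10", "v1.2.3", "v1.3.0"}
	fmt.Println(highestWithPrefix("v1.2", tags))
}
```

Patch numbers are compared numerically, so a two-digit patch release ranks above a one-digit one.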

In any event, module version update is just one of four issues I raised above.

Perhaps the better solution is to just stop using vendoring and instead support this requirement in another system such as a read through cache that never evicts unless explicitly told to do so.

rsc commented 2 years ago

Sounds like we should put this on hold for bandwidth from the cmd/go team.

rsc commented 2 years ago

Placed on hold. — rsc for the proposal review group