loov / goda

Go Dependency Analysis toolkit
MIT License

Breaking module dependency cycles #15

Open nim-nim opened 5 years ago

nim-nim commented 5 years ago

Hi,

I don't know whether this is an RFE or whether the functionality already exists in goda but isn't documented.

Dependency cycles are the curse of software integration. When A depends on B and B depends directly or indirectly on A, it is no longer possible to define a clean modular integration plan where each component is tested after the ones it depends on. You have to import the whole cycle in bulk, which makes it impossible to split tasks between different teams and scale your organization.

Therefore, as an integrator, I would be interested in a command, that tells me if it is possible to split the code tree of A, in

With A being a tree of Go packages (typically, nowadays, the packages included in a Go module).

Is it possible to do this with goda? I already have the code to walk a source code directory and identify all the module trees within (I can share it if you want; it's GPLv3, but relicensing it is no problem).

egonelbre commented 5 years ago

There isn't a feature for it in goda. Listing out my initial thoughts.

In Go you cannot directly create a package dependency A -> B and B -> A, because that would cause a circular dependency. Unless I've misunderstood what you mean by that scenario.

So, the question is more about indirect dependencies. If there's such a dependency, it usually means that B requires A to work, so breaking the dependency would mean moving some code from B to A. Alternatively, introduce a C that can replace A in the context of B, which in some cases might be the same amount of work as writing A (and all of its dependencies) in the first place.

Automatically doing something here sounds like asking for a lot of trouble. I guess the only thing left would be to visualize the relation, somehow.

One option would be to use goda graph A - B:noroot, which would display how A and B are related (at package granularity). However, it seems that it would need to work at a finer granularity to be useful.

Do you have real-world examples of such problems and how you solved them?

nim-nim commented 5 years ago

@egonelbre

Thanks for taking the time to think about it. Your misunderstanding stems from the fact that you think in terms of isolated packages. Individual packages are utterly uninteresting for integrators. Integrators integrate projects, i.e. package sets, not individual packages. The Go compiler and Go developers tried to ignore this via GOPATH package soups with no clear limits between sets of packages, and the only thing that produced was gigantic GOPATHs that no one knows how to manage, test and audit sanely anymore.

That's basically why Google defined Go modules: define a clear packageset boundary, so Go projects can start manipulating sets of packages, and only have to worry about the dep graph between those sets of packages.

Unfortunately, due to all the GOPATH history, a lot of Go modules do not have their boundaries in the right place, so you can have circular dependencies at the module level that you do not have at the package level.
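To make that concrete, here is a minimal sketch of how a package graph without cycles can still collapse into a module graph with one. The package and module paths (example.com/a, example.com/b) and the moduleEdges helper are made up for illustration; none of this is goda code.

```go
package main

import (
	"fmt"
	"sort"
)

// moduleEdges collapses a package import graph into a module graph.
// imports maps a package to the packages it imports; moduleOf maps a
// package to the module that contains it. Edges that stay inside a
// single module are dropped.
func moduleEdges(imports map[string][]string, moduleOf map[string]string) map[string][]string {
	seen := map[[2]string]bool{}
	out := map[string][]string{}
	for pkg, deps := range imports {
		from := moduleOf[pkg]
		for _, dep := range deps {
			to := moduleOf[dep]
			edge := [2]string{from, to}
			if from == to || seen[edge] {
				continue
			}
			seen[edge] = true
			out[from] = append(out[from], to)
		}
	}
	// Sort the targets so the result does not depend on map iteration order.
	for m := range out {
		sort.Strings(out[m])
	}
	return out
}

func main() {
	// Hypothetical layout: a/x -> b/y -> a/z. No package depends on
	// itself, yet modules a and b depend on each other.
	imports := map[string][]string{
		"example.com/a/x": {"example.com/b/y"},
		"example.com/b/y": {"example.com/a/z"},
		"example.com/a/z": nil,
	}
	moduleOf := map[string]string{
		"example.com/a/x": "example.com/a",
		"example.com/a/z": "example.com/a",
		"example.com/b/y": "example.com/b",
	}
	edges := moduleEdges(imports, moduleOf)
	fmt.Println(edges["example.com/a"], edges["example.com/b"])
}
```

In this toy layout, moving a/z into its own module (or into b) would remove the module-level cycle without touching any import statement.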

If goda could help all the Go projects that are modularizing right now to compute correct cycle-less boundaries for their modules (i.e. should they break up their projects into several modules, and if so, where is the most convenient place to put the module limits via separate go.mod files), it would be a huge help for the Go ecosystem.

A well-known packageset cycle is cloud.google.com/go → golang/x/auth → cloud.google.com/go, for example. Or (in the same project) the braindamaged decision to add opencensus plugins that depend on things that already depend on cloud.google.com/go.

Or all the projects that decide to add tests to check that they still work with their children. That's not a huge problem as long as it is easy to identify all those tests and put them in a separate test-children module that the children are forbidden to depend on.

egonelbre commented 5 years ago

Ah, you mean for detecting module cycles.

One of the things I have planned is visualizing things at the module level. And mixed mode (e.g. this module depends on these packages and vice-versa).

As for the specific solution, it could have strongly connected component detection at the module level and then visualize that as a graph. Something like goda graph packages(scc(modules(yourproject))), where modules and packages are funcs that collapse or expand the graph into modules vs. packages. Although, that would require some rethinking of how to represent the dependency graph.
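For reference, a rough sketch of what strongly connected component detection over a module graph could look like, using Tarjan's algorithm. This is not how goda is implemented; the cycles function and the example.com module names are assumptions for illustration, and the input is a plain map from a module to the modules it depends on.

```go
package main

import (
	"fmt"
	"sort"
)

// cycles returns the strongly connected components of graph that contain
// more than one node, i.e. the groups of modules that depend on each other.
// It uses Tarjan's algorithm. Self-edges are ignored, since a module
// depending on itself is not a cycle between modules.
func cycles(graph map[string][]string) [][]string {
	index := map[string]int{}
	low := map[string]int{}
	onStack := map[string]bool{}
	var stack []string
	var result [][]string
	next := 0

	var visit func(v string)
	visit = func(v string) {
		index[v], low[v] = next, next
		next++
		stack = append(stack, v)
		onStack[v] = true
		for _, w := range graph[v] {
			if _, seen := index[w]; !seen {
				visit(w)
				if low[w] < low[v] {
					low[v] = low[w]
				}
			} else if onStack[w] && index[w] < low[v] {
				low[v] = index[w]
			}
		}
		if low[v] != index[v] {
			return
		}
		// v is the root of a component: pop the component off the stack.
		var comp []string
		for {
			w := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			onStack[w] = false
			comp = append(comp, w)
			if w == v {
				break
			}
		}
		if len(comp) > 1 {
			sort.Strings(comp)
			result = append(result, comp)
		}
	}

	// Visit nodes in sorted order so the output is deterministic.
	nodes := make([]string, 0, len(graph))
	for v := range graph {
		nodes = append(nodes, v)
	}
	sort.Strings(nodes)
	for _, v := range nodes {
		if _, seen := index[v]; !seen {
			visit(v)
		}
	}
	return result
}

func main() {
	// Hypothetical module graph: a -> b -> c -> a, plus c -> d.
	graph := map[string][]string{
		"example.com/a": {"example.com/b"},
		"example.com/b": {"example.com/c"},
		"example.com/c": {"example.com/a", "example.com/d"},
	}
	fmt.Println(cycles(graph)) // prints [[example.com/a example.com/b example.com/c]]
}
```

Each returned group is a set of modules that can only be integrated together; an empty result means the module graph is already cycle-free.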

As for detecting the convenient place to put go.mod, or the "correct cycle-less boundaries", I could do something like a minimal vertex cut; however, I'm not sure whether that's the right approach. This might lead people to mindlessly create modules that would be hell to maintain for both integrators and developers.

nim-nim commented 5 years ago

Detecting cycle-free module boundaries is just the first step, I agree. If the limits are too exotic or complex, that strongly hints that the codebase needs restructuring (and code analysis tools can help, but, ultimately, code is written by humans).

However, this has all become a giant haystack. The first step to put some order in there is just to detect where the boundaries should be with the code as it exists today. If the boundaries are simple, that just means adding a couple of go.mod files in upstream Go projects: no huge burden on devs, and a huge simplification for integrators and other devs, who can now consume cycle-free modules. Most devs do not create module cycles willingly; they just happen, and devs can't be bothered to search for module splitting points manually.

For codebases that will need restructuring, there is no simple solution. But they need restructuring because the cycle problem has been left to rot undetected for too long. goda cannot magically fix those projects, but it can help avoid the creation of new dep hairballs, just by detecting their first stages.

dolmen commented 3 years ago

@nim-nim This go list command can list the modules used in a final binary. It gives a flat list of modules with the real version selected.

go list -deps -f '{{define "M"}}{{.Path}}@{{.Version}}{{end}}{{with .Module}}{{if not .Main}}{{if .Replace}}{{template "M" .Replace}}{{else}}{{template "M" .}}{{end}}{{end}}{{end}}' | sort -u

That might be a good starting point for your test plan.

egonelbre commented 2 years ago

This was mentioned in #tools channel https://codereview.appspot.com/186270043/. It could be helpful for figuring out a solution for this.