aquasecurity / fanal

Static Analysis Library for Containers
Apache License 2.0

feat(golang): add support for go.mod #465

Closed knqyf263 closed 2 years ago

knqyf263 commented 2 years ago

Description

Go 1.17+ adds indirect dependencies to go.mod. This PR parses go.mod as well as go.sum and merges them according to the Go version in go.mod.
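
The merge rule described above could be sketched roughly as follows. This is not the code in this PR: `Lib`, `lessThanGo117`, and `merge` are hypothetical names, and the sketch assumes that go.mod alone is used for Go 1.17+, while for older versions go.sum entries fill in modules missing from go.mod.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Lib is a hypothetical (module path, version) pair.
type Lib struct {
	Name    string
	Version string
}

// lessThanGo117 reports whether a "go" directive value such as "1.16"
// is older than 1.17. Unparsable values are treated as old (an assumption).
func lessThanGo117(goVer string) bool {
	parts := strings.SplitN(goVer, ".", 3)
	if len(parts) < 2 {
		return true
	}
	major, err1 := strconv.Atoi(parts[0])
	minor, err2 := strconv.Atoi(parts[1])
	if err1 != nil || err2 != nil {
		return true
	}
	return major < 1 || (major == 1 && minor < 17)
}

// merge returns the go.mod libraries. Only for Go versions before 1.17
// does it add go.sum libraries whose module path is absent from go.mod.
func merge(goVer string, mod, sum []Lib) []Lib {
	out := append([]Lib(nil), mod...)
	if lessThanGo117(goVer) {
		seen := make(map[string]bool, len(mod))
		for _, l := range mod {
			seen[l.Name] = true
		}
		for _, l := range sum {
			if !seen[l.Name] {
				seen[l.Name] = true
				out = append(out, l)
			}
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	mod := []Lib{{Name: "gopkg.in/yaml.v3", Version: "v3.0.0"}}
	sum := []Lib{
		{Name: "gopkg.in/yaml.v3", Version: "v3.0.0"},
		{Name: "cloud.google.com/go", Version: "v0.99.0"},
	}
	fmt.Println(len(merge("1.16", mod, sum))) // go.mod plus extra go.sum entries
	fmt.Println(len(merge("1.17", mod, sum))) // go.mod only
}
```

With this rule, a module that appears in both files keeps the version from go.mod.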

It reduces unused dependencies as shown below.

Before

$ ./fanal fs --skip-dirs analyzer --skip-dirs artifact --skip-dirs config .
...
gomod (go.sum): 697

After

$ ./fanal fs --skip-dirs analyzer --skip-dirs artifact --skip-dirs config .
...
gomod (go.mod): 164

Issue

PR

knqyf263 commented 2 years ago

@jerbob92 I would really appreciate it if you could take a look.

knqyf263 commented 2 years ago

After playing with it a bit, a new question came to mind.

$ head -n 4 go.mod
module github.com/aquasecurity/fanal

go 1.16

$ grep "cloud.google.com/go" go.mod # no match: it is an indirect dependency
$ grep "cloud.google.com/go" go.sum | grep -v go.mod | grep -v storage
cloud.google.com/go v0.99.0 h1:y/cM2iqGgGi5D5DQZl6D9STN/3dR/Vx5Mp8s752oJTY=
$ go build -o fanal cmd/fanal/main.go
$ go version -m fanal | grep cloud.google.com/go | grep -v storage
        dep     cloud.google.com/go     v0.99.0 h1:y/cM2iqGgGi5D5DQZl6D9STN/3dR/Vx5Mp8s752oJTY=

Add replace without version

$ tail -n 1 go.mod
replace cloud.google.com/go => cloud.google.com/go v0.98.0
$ go mod tidy
$ grep "cloud.google.com/go" go.sum | grep -v go.mod | grep -v storage
cloud.google.com/go v0.98.0 h1:w6LozQJyDDEyhf64Uusu1LCcnLt0I1VMLiJC2kV+eXk=
$ go build -o fanal cmd/fanal/main.go
$ go version -m fanal | grep cloud.google.com/go | grep -v storage
        dep     cloud.google.com/go     v0.99.0
        =>      cloud.google.com/go     v0.98.0 h1:w6LozQJyDDEyhf64Uusu1LCcnLt0I1VMLiJC2kV+eXk=

Add replace with version

$ tail -n 1 go.mod
replace cloud.google.com/go v0.99.0 => cloud.google.com/go v0.98.0
$ go mod tidy
$ grep "cloud.google.com/go" go.sum | grep -v go.mod | grep -v storage
cloud.google.com/go v0.98.0 h1:w6LozQJyDDEyhf64Uusu1LCcnLt0I1VMLiJC2kV+eXk=
$ go build -o fanal cmd/fanal/main.go
$ go version -m fanal | grep cloud.google.com/go | grep -v storage
        dep     cloud.google.com/go     v0.99.0
        =>      cloud.google.com/go     v0.98.0 h1:w6LozQJyDDEyhf64Uusu1LCcnLt0I1VMLiJC2kV+eXk=

It looks like go.sum records the correct replaced version in both cases. I'm wondering in which cases we could not get the correct version from go.sum.

> I'd say it would still be good to use the go.mod as base for the dependency versions, so that it handles the replace directives correctly https://github.com/aquasecurity/go-dep-parser/issues/75#issuecomment-1096357615

@jerbob92 Could you give me an example of when it cannot handle the replaced version?

jerbob92 commented 2 years ago

> @jerbob92 Could you give me an example of when it cannot handle the replaced version?

I think go mod tidy only keeps the actively used version in the go.sum file, so for projects that always run go mod tidy, it would contain the correct version, but that's not something that's guaranteed.

knqyf263 commented 2 years ago

Hmm, but it is the same situation in go.mod, right? If you remove an unnecessary dependency from the source code and don't run go mod tidy, the dependency will remain in go.mod. It would be a problem either way if users don't run go mod tidy. I think we can assume users run go mod tidy. What do you think?

jerbob92 commented 2 years ago

Any Go command uses go.mod as the ground truth for dependency resolution, and go.sum for validating the hashes. If you add or update a package, it appends to go.sum; it doesn't clean it up. So I would say: if you can't do actual dependency resolution, using go.mod for versions is always better than go.sum, even if go.sum is probably correct in most cases.
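
As a rough illustration of reading versions from go.mod with replace directives applied, here is a minimal sketch. It is not the fanal or go-dep-parser implementation: `Module`, `Replace`, `parseReplace`, and `resolve` are hypothetical names, only single-line `replace` directives with a module path on the right-hand side are handled, and a version-specific replace is assumed to take precedence over a version-less one.

```go
package main

import (
	"fmt"
	"strings"
)

// Module is a hypothetical (path, version) pair.
type Module struct{ Path, Version string }

// Replace maps an old module (version optional) to a new one.
type Replace struct{ Old, New Module }

// parseReplace parses a single-line directive such as
// "replace a v1.0.0 => b v1.1.0" or "replace a => b v1.1.0".
func parseReplace(line string) (Replace, bool) {
	f := strings.Fields(line)
	if len(f) < 4 || f[0] != "replace" {
		return Replace{}, false
	}
	arrow := -1
	for i, s := range f {
		if s == "=>" {
			arrow = i
			break
		}
	}
	if arrow < 0 {
		return Replace{}, false
	}
	left, right := f[1:arrow], f[arrow+1:]
	if len(left) < 1 || len(left) > 2 || len(right) < 1 || len(right) > 2 {
		return Replace{}, false
	}
	r := Replace{Old: Module{Path: left[0]}, New: Module{Path: right[0]}}
	if len(left) == 2 {
		r.Old.Version = left[1]
	}
	if len(right) == 2 {
		r.New.Version = right[1]
	}
	return r, true
}

// resolve returns the module that req is replaced with, or req itself.
// An exact (path, version) match wins over a version-less replace.
func resolve(req Module, replaces []Replace) Module {
	var wildcard *Module
	for _, r := range replaces {
		if r.Old.Path != req.Path {
			continue
		}
		if r.Old.Version == req.Version {
			return r.New
		}
		if r.Old.Version == "" && wildcard == nil {
			w := r.New
			wildcard = &w
		}
	}
	if wildcard != nil {
		return *wildcard
	}
	return req
}

func main() {
	req := Module{Path: "cloud.google.com/go", Version: "v0.99.0"}
	lines := []string{
		"replace cloud.google.com/go => cloud.google.com/go v0.98.0",
		"replace cloud.google.com/go v0.99.0 => cloud.google.com/go v0.98.0",
	}
	for _, line := range lines {
		if r, ok := parseReplace(line); ok {
			fmt.Println(resolve(req, []Replace{r}).Version)
		}
	}
}
```

Both directives from the experiment above resolve v0.99.0 to v0.98.0 in this sketch, while a replace pinned to a different left-hand version leaves the requirement untouched.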

knqyf263 commented 2 years ago

> If you add or update a package it appends to go.sum, it doesn't clean it up.

I still feel that go.mod has the same problem in many cases. After you switch gopkg.in/yaml.v2 to gopkg.in/yaml.v3 by running go get gopkg.in/yaml.v3, gopkg.in/yaml.v2 will remain in go.mod. Also, we need go.sum anyway for a complete list of indirect dependencies, even though it includes dependencies that are not actually needed.

I agree that go.mod is slightly better in terms of accuracy. If you downgrade a version in go.mod from v2.4.0 to v2.3.0, go.sum will show v2.4.0 by mistake and go.mod shows v2.3.0 correctly.
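
The downgrade case can be illustrated with a small sketch. It assumes a hypothetical go.sum-only reader that keeps the highest version it sees per module (`highestFromGoSum`, `parseVersion`, and `less` are made-up helpers, and the version comparison ignores pre-release tags). The hashes in the sample are placeholders, not real checksums.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "v2.4.0" into [2 4 0]; pre-release and build
// suffixes are ignored (a simplification).
func parseVersion(v string) [3]int {
	var out [3]int
	core := strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(core, "-+"); i >= 0 {
		core = core[:i]
	}
	for i, p := range strings.SplitN(core, ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// less reports whether version a is lower than version b.
func less(a, b string) bool {
	x, y := parseVersion(a), parseVersion(b)
	for i := range x {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return false
}

// highestFromGoSum keeps the highest version per module, skipping the
// "<version>/go.mod" lines that only hash the go.mod file.
func highestFromGoSum(content string) map[string]string {
	highest := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) != 3 || strings.HasSuffix(f[1], "/go.mod") {
			continue
		}
		if cur, ok := highest[f[0]]; !ok || less(cur, f[1]) {
			highest[f[0]] = f[1]
		}
	}
	return highest
}

func main() {
	// go.sum after a downgrade without "go mod tidy": the old entry remains.
	goSum := `gopkg.in/yaml.v2 v2.3.0 h1:placeholder=
gopkg.in/yaml.v2 v2.3.0/go.mod h1:placeholder=
gopkg.in/yaml.v2 v2.4.0 h1:placeholder=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:placeholder=
`
	goModVersion := "v2.3.0" // version required in go.mod after the downgrade
	fmt.Printf("go.sum=%s go.mod=%s\n", highestFromGoSum(goSum)["gopkg.in/yaml.v2"], goModVersion)
}
```

In this sketch the go.sum-only reader reports v2.4.0 while go.mod holds v2.3.0, which is the mismatch described above.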

The benefit of using only go.sum is a simple implementation whose behavior users can easily understand. If they see any issue, they can check go.sum themselves. I'm not sure how important the difference between go.mod and go.sum is, since the result would be the same after go mod tidy. But let's keep it as is. I just wanted to write down what we considered for the record.

Thanks.