oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0

Golang list of dependencies #7737

Open blaumeiser-at-bosch opened 8 months ago

blaumeiser-at-bosch commented 8 months ago

I have a question about how ORT handles Go. A compiled Go binary contains the runtime, so the OSS dependencies of the project should include the runtime packages as well, since they are relevant for distribution.

When I look at the output of ORT, it basically reflects the dependencies referenced in the go.mod file. To my understanding, this does not include the runtime of Go, but only the additional dependencies. If I call 'go list -m all' I get an even longer list of modules.

Is this an issue or do I misunderstand something here?

sschuberth commented 8 months ago

@fviernau is probably the best person to answer this (once he's back from vacation next week).

fviernau commented 8 months ago

Hi @blaumeiser-at-bosch ,

This is interesting! Could you please provide

  1. The list (names) of the runtime packages you would expect to be included?
  2. Ideally also a minimal project which reproduces the missing packages? (with Go 1.21.0)
  3. Whether go mod why indicates that these runtime packages are needed by the main module?

blaumeiser-at-bosch commented 7 months ago

Hello @fviernau,

I did a small investigation with a Hello World example. There, "go list -m all" shows only the module just created for the program. If I add a dependency, only the module itself and the dependency are shown. In our well-grown project, this command returns 136 dependencies, while only 34 modules are referenced in the "go.mod" file. I used "go mod why" for some of the unlisted dependencies. Each dependency had to be downloaded first, and then the message was: "(main module does not need package honnef.co/go/tools)". I have no clue where this difference comes from, as Go typically also adds transitive dependencies to the "go.mod" file.

Besides that, my underlying point is this: in a language like Java, if I do not distribute the JVM, the standard dependencies are not part of my distribution, so it is fine to take only the input from the package manager for license fulfillment. In a language like Go, where one statically linked executable is created, the runtime is part of the distribution, and so are standard library packages like "fmt". They are not mentioned by the package manager, yet the tooling includes them automatically. So from a distribution perspective, these have to be added to the referenced OSS software, right? I have no idea how to find out automatically which standard packages exist and which of them end up in the binary.

fviernau commented 7 months ago

Thanks @blaumeiser-at-bosch !

> I used "go mod why" for some of the unlisted dependencies. Each dependency had to be downloaded first, and then the message was: "(main module does not need package honnef.co/go/tools)". I have no clue where this difference comes from, as Go typically also adds transitive dependencies to the "go.mod" file.

In general, I believe that the output does reflect whether code is actually used. E.g., if you add a library as a dependency that you do not use, then go mod why would report that it is not needed. At a minimum, go mod tidy would remove it again from the go.mod file. So, in your example, is there any code that uses the mentioned libs?

Anyhow, it's really difficult for me to look into this without being able to reproduce it locally. Would you mind providing a minimal example that is sufficient to work on this? (A requirement for such an example would be that the go.mod file matches what go mod tidy produces.)

blaumeiser-at-bosch commented 7 months ago

Hi @fviernau, good point, the go.sum file contains these components, which are not referenced in the go.mod file. Even a "go mod tidy" did not change the content of the go.sum file. I can share the go.mod and go.sum file of our project, if that would help.

I just read some posts saying that the go.sum file contains additional entries needed for dependency resolution, including some transitive dependencies that are not necessarily required at runtime, but potentially for testing.

But still, my main concern is about the standard library and runtime that go into the generated binary. They are not referenced anywhere, but they are still included in the binary, right?

fviernau commented 7 months ago

> Hi @fviernau, good point, the go.sum file contains these components, which are not referenced in the go.mod file. Even a "go mod tidy" did not change the content of the go.sum file. I can share the go.mod and go.sum file of our project, if that would help.

Yes, that would help.

> But still, my main concern is about the standard library and runtime that go into the generated binary. They are not referenced anywhere, but they are still included in the binary, right?

So, what would help me here (as I'm not a Go developer) is a list of modules you expected to see but that are not in the dependency tree.

blaumeiser-at-bosch commented 7 months ago

Had to rename the files to get them here

go.mod.txt go.sum.txt

As for the exact list of what is missing: I have no clue. I understand too little of the Go build process to know what ends up in a Go binary, but my shallow knowledge suggests that there might be something in the binary that is not reflected by the package management. Is there a Go expert in the community who might have an idea?

sschuberth commented 7 months ago

Ping @haikoschol!

haikoschol commented 7 months ago

@blaumeiser-at-bosch The TL;DR is that any other runtime code the Go toolchain includes in the binaries that it builds is BSD licensed and copyrighted by "The Go Authors": https://go.dev/LICENSE

So essentially just include that in your NOTICE file or whatever other appropriate documentation you're distributing with the binary. (ORT can generate NOTICE files, right? Should this be hardcoded for Go projects then?)

All versions of the Go standard library and toolchain so far have used the same license and copyright. I assume it will stay that way forever, but I couldn't find any statement regarding that.

There is some discussion of this topic in golang/go#19893 and here.

Regarding the question of what ORT includes as dependencies: The standard library is a collection of packages, but not a module. It predates modules and is not considered an external dependency of your code. Which version of the standard library ends up in your binary depends on the version of the toolchain you use to build it.

> When I look at the output of ORT, it basically reflects the dependencies referenced in the go.mod file. To my understanding, this does not include the runtime of Go, but only the additional dependencies. If I call 'go list -m all' I get an even longer list of modules.

Do you mean go list -m all returns more modules than what ORT lists as dependencies? I'm pretty sure those should be the same, so maybe there's a bug in the Analyzer.

blaumeiser-at-bosch commented 7 months ago

First of all, thanks for the information, @haikoschol!

> (ORT can generate NOTICE files, right? Should this be hardcoded for Go projects then?)

That was actually my point: for Go projects, these are dependencies that should be part of the NOTICE file.

> > When I look at the output of ORT, it basically reflects the dependencies referenced in the go.mod file. To my understanding, this does not include the runtime of Go, but only the additional dependencies. If I call 'go list -m all' I get an even longer list of modules.
>
> Do you mean go list -m all returns more modules than what ORT lists as dependencies? I'm pretty sure those should be the same, so maybe there's a bug in the Analyzer.

That is actually my observation. It seems that go list -m all returns the dependencies in the go.sum file, and at least in our project there are more dependencies there than in go.mod. ORT returns only the dependencies found in the go.mod file. I attach the output of go list -m all here. As I wrote above, go mod why reports some of them as not needed by the main module.

cloud.google.com/go v0.110.7
cloud.google.com/go/bigquery v1.8.0
cloud.google.com/go/compute v1.23.0
cloud.google.com/go/compute/metadata v0.2.3
cloud.google.com/go/datastore v1.1.0
cloud.google.com/go/firestore v1.13.0
cloud.google.com/go/longrunning v0.5.1
cloud.google.com/go/pubsub v1.3.1
cloud.google.com/go/storage v1.14.0
dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9
github.com/BurntSushi/toml v0.3.1
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802
github.com/armon/go-metrics v0.4.1
github.com/bahlo/generic-list-go v0.2.0
github.com/buger/jsonparser v1.1.1
github.com/census-instrumentation/opencensus-proto v0.2.1
github.com/chigopher/pathlib v0.17.0
github.com/chzyer/logex v1.1.10
github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e
github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1
github.com/client9/misspell v0.3.4
github.com/cncf/udpa/go v0.0.0-20201120205902-5459f2c99403
github.com/coreos/go-semver v0.3.0
github.com/coreos/go-systemd/v22 v22.3.2
github.com/cpuguy83/go-md2man/v2 v2.0.3
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc
github.com/envoyproxy/go-control-plane v0.9.9-0.20201210154907-fd9021fe5dad
github.com/envoyproxy/protoc-gen-validate v0.1.0
github.com/fatih/color v1.14.1
github.com/frankban/quicktest v1.14.4
github.com/fsnotify/fsnotify v1.7.0
github.com/go-gl/glfw v0.0.0-20190409004039-e6da0acd62b1
github.com/go-gl/glfw/v3.3/glfw v0.0.0-20200222043503-6f7a984d4dc4
github.com/gogo/protobuf v1.3.2
github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
github.com/golang/mock v1.4.4
github.com/golang/protobuf v1.5.3
github.com/google/btree v1.0.0
github.com/google/go-cmp v0.5.9
github.com/google/martian v2.1.0+incompatible
github.com/google/martian/v3 v3.1.0
github.com/google/pprof v0.0.0-20201218002935-b9804c9f04c2
github.com/google/renameio v0.1.0
github.com/google/s2a-go v0.1.7
github.com/google/uuid v1.1.2
github.com/googleapis/enterprise-certificate-proxy v0.3.1
github.com/googleapis/gax-go/v2 v2.12.0
github.com/googleapis/google-cloud-go-testing v0.0.0-20200911160855-bcd43fbb19e8
github.com/hashicorp/consul/api v1.25.1
github.com/hashicorp/go-cleanhttp v0.5.2
github.com/hashicorp/go-hclog v1.5.0
github.com/hashicorp/go-immutable-radix v1.3.1
github.com/hashicorp/go-rootcerts v1.0.2
github.com/hashicorp/golang-lru v0.5.4
github.com/hashicorp/hcl v1.0.0
github.com/hashicorp/serf v0.10.1
github.com/iancoleman/orderedmap v0.3.0
github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639
github.com/inconshreveable/mousetrap v1.1.0
github.com/invopop/jsonschema v0.12.0
github.com/invopop/yaml v0.2.0
github.com/josharian/intern v1.0.0
github.com/json-iterator/go v1.1.12
github.com/jstemmer/go-junit-report v0.9.1
github.com/kisielk/gotool v1.0.0
github.com/klauspost/compress v1.17.0
github.com/kr/fs v0.1.0
github.com/kr/pretty v0.3.1
github.com/kr/pty v1.1.1
github.com/kr/text v0.2.0
github.com/magiconair/properties v1.8.7
github.com/mailru/easyjson v0.7.7
github.com/mattn/go-colorable v0.1.13
github.com/mattn/go-isatty v0.0.17
github.com/minio/highwayhash v1.0.2
github.com/mitchellh/go-homedir v1.1.0
github.com/mitchellh/mapstructure v1.5.0
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
github.com/modern-go/reflect2 v1.0.2
github.com/nats-io/jwt/v2 v2.4.1
github.com/nats-io/nats.go v1.30.2
github.com/nats-io/nkeys v0.4.5
github.com/nats-io/nuid v1.0.1
github.com/pelletier/go-toml/v2 v2.1.0
github.com/pkg/errors v0.9.1
github.com/pkg/sftp v1.13.1
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4
github.com/rogpeppe/go-internal v1.9.0
github.com/russross/blackfriday/v2 v2.1.0
github.com/sagikazarmark/crypt v0.15.0
github.com/sagikazarmark/locafero v0.3.0
github.com/sagikazarmark/slog-shim v0.1.0
github.com/sourcegraph/conc v0.3.0
github.com/spf13/afero v1.10.0
github.com/spf13/cast v1.5.1
github.com/spf13/cobra v1.8.0
github.com/spf13/jwalterweatherman v1.1.0
github.com/spf13/pflag v1.0.5
github.com/spf13/viper v1.17.0
github.com/stretchr/objx v0.5.1
github.com/stretchr/testify v1.8.4
github.com/subosito/gotenv v1.6.0
github.com/wk8/go-ordered-map/v2 v2.1.8
github.com/xeipuuv/gojsonpointer v0.0.0-20190905194746-02993c407bfb
github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415
github.com/xeipuuv/gojsonschema v1.2.0
github.com/yuin/goldmark v1.2.1
go.etcd.io/etcd/api/v3 v3.5.9
go.etcd.io/etcd/client/pkg/v3 v3.5.9
go.etcd.io/etcd/client/v2 v2.305.9
go.etcd.io/etcd/client/v3 v3.5.9
go.opencensus.io v0.24.0
go.uber.org/atomic v1.11.0
go.uber.org/goleak v1.2.0
go.uber.org/multierr v1.11.0
go.uber.org/zap v1.26.0
golang.org/x/crypto v0.13.0
golang.org/x/exp v0.0.0-20231006140011-7918f672742d
golang.org/x/image v0.0.0-20190802002840-cff245a6509b
golang.org/x/lint v0.0.0-20201208152925-83fdc39ff7b5
golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028
golang.org/x/mod v0.13.0
golang.org/x/net v0.15.0
golang.org/x/oauth2 v0.12.0
golang.org/x/sync v0.3.0
golang.org/x/sys v0.14.0
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1
golang.org/x/text v0.14.0
golang.org/x/time v0.3.0
golang.org/x/tools v0.14.0
golang.org/x/xerrors v0.0.0-20220907171357-04be3eba64a2
google.golang.org/api v0.143.0
google.golang.org/appengine v1.6.7
google.golang.org/genproto v0.0.0-20230913181813-007df8e322eb
google.golang.org/genproto/googleapis/api v0.0.0-20230913181813-007df8e322eb
google.golang.org/genproto/googleapis/rpc v0.0.0-20230920204549-e6e6cdab5c13
google.golang.org/grpc v1.58.2
google.golang.org/protobuf v1.31.0
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15
gopkg.in/errgo.v2 v2.1.0
gopkg.in/ini.v1 v1.67.0
gopkg.in/natefinch/lumberjack.v2 v2.2.1
gopkg.in/yaml.v2 v2.2.2
gopkg.in/yaml.v3 v3.0.1
honnef.co/go/tools v0.0.1-2020.1.4
rsc.io/binaryregexp v0.2.0
rsc.io/quote/v3 v3.1.0
rsc.io/sampler v1.3.0
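
The comparison discussed here, between the go list -m all output and what a tool reports, could be sketched roughly as follows. This is not ORT code: the names Module, parseModuleList and notReported are hypothetical, the main module path example.com/app is invented, and the "reported" slice merely stands in for whatever list of module paths an analyzer produced. The sketch assumes the plain-text format shown above, with one module path and an optional version per line.

```go
package main

import (
	"fmt"
	"strings"
)

// Module is one entry from "go list -m all" output.
type Module struct {
	Path    string
	Version string // empty for the main module, which is listed without a version
}

// parseModuleList parses the plain-text output of "go list -m all",
// assuming one "<path> [<version>]" entry per line.
func parseModuleList(output string) []Module {
	var mods []Module
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		m := Module{Path: fields[0]}
		if len(fields) > 1 {
			m.Version = fields[1]
		}
		mods = append(mods, m)
	}
	return mods
}

// notReported returns the listed modules whose paths are absent from reported.
func notReported(listed []Module, reported []string) []Module {
	known := map[string]bool{}
	for _, p := range reported {
		known[p] = true
	}
	var missing []Module
	for _, m := range listed {
		if !known[m.Path] {
			missing = append(missing, m)
		}
	}
	return missing
}

func main() {
	// Two entries taken from the list above, preceded by a hypothetical main module.
	output := "example.com/app\ngithub.com/spf13/cobra v1.8.0\nhonnef.co/go/tools v0.0.1-2020.1.4\n"
	// Hypothetical analyzer result used only for illustration.
	reported := []string{"example.com/app", "github.com/spf13/cobra"}
	for _, m := range notReported(parseModuleList(output), reported) {
		fmt.Println(m.Path, m.Version) // prints "honnef.co/go/tools v0.0.1-2020.1.4"
	}
}
```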

haikoschol commented 7 months ago

> It seems that go list -m all returns the dependencies in the go.sum file, and at least in our project there are more dependencies there than in go.mod.

The go.sum file is a distraction in this context. It is not a lockfile as they exist in other dependency management systems. It is used for integrity checking and is essentially append only. Whenever you update a dependency, the sum file will still contain references to the previous version(s) of that dependency.
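
To make the append-only behaviour a bit more tangible, here is a small sketch that groups the versions recorded in a go.sum file by module path. It assumes the usual three-field line layout (module path, version with an optional /go.mod suffix, hash). The function name versionsBySum and the sample content with its placeholder hashes are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// versionsBySum groups the versions recorded in go.sum content by module path.
// Each go.sum line is assumed to have the form "<module> <version>[/go.mod] <hash>".
func versionsBySum(goSum string) map[string][]string {
	seen := map[string]map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(goSum))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue // skip blank or malformed lines
		}
		module := fields[0]
		version := strings.TrimSuffix(fields[1], "/go.mod")
		if seen[module] == nil {
			seen[module] = map[string]bool{}
		}
		seen[module][version] = true
	}
	result := map[string][]string{}
	for module, versions := range seen {
		for v := range versions {
			result[module] = append(result[module], v)
		}
		// Plain lexical sort for stable output; this is not semantic version ordering.
		sort.Strings(result[module])
	}
	return result
}

func main() {
	// Hypothetical go.sum excerpt; the hashes are placeholders, not real checksums.
	const sample = `example.com/lib v1.0.0 h1:placeholder1=
example.com/lib v1.0.0/go.mod h1:placeholder2=
example.com/lib v1.1.0 h1:placeholder3=
example.com/lib v1.1.0/go.mod h1:placeholder4=
example.com/other v0.2.0/go.mod h1:placeholder5=
`
	for module, versions := range versionsBySum(sample) {
		fmt.Println(module, versions)
	}
}
```

A module that appears with several versions here, or only with a /go.mod entry, is not necessarily part of the build, which is why go.sum is a poor basis for a dependency list.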

haikoschol commented 7 months ago

> ORT returns only the dependencies found in the go.mod file. I attach the output of go list -m all here. As I wrote above, go mod why reports some of them as not needed by the main module.

A common source of confusion in the context of Go dependency management is “packages” vs “modules”.

The purpose of packages is to enable encapsulation and reusability. Code in a package is invoked using its import path. The import path can either refer to a package in the standard library, one on the local filesystem (including in $GOPATH pre-modules) or in a remote repository. In the latter case, the Go toolchain conveniently downloaded the code to $GOPATH/somewhere, even before modules existed. But this could not really be called dependency management.

From the output of go help modules:

> A module is a collection of packages that are released, versioned, and distributed together.

Dependency management in Go operates on modules.

Unfortunately, the various commands in the Go toolchain are quite confusing. go list all lists packages. go list -m all lists modules. That makes sense. However, go mod why takes a list of packages and go mod why -m takes a list of modules. In the latter case it "finds a path to any package in each of the modules" (citing go help mod why). But it outputs those paths only if they lead all the way to the main module. The main module is identified by the go.mod file in your repository.

I’m not completely sure about this, but IIRC, if go mod why says the given package or module is not needed by the main module, that means its code will not end up in the binary. However, this is further complicated by build flags. For example, a module could contain a package that only gets included when compiling on Windows (or setting GOOS=windows). Additionally there can be test packages, which are only compiled into test binaries.
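
As a rough illustration of how build constraints decide whether a file is compiled for a given platform, the following sketch evaluates a //go:build line with the standard library's go/build/constraint package. The helper includedFor is hypothetical, and the sketch deliberately ignores other selection rules of the real toolchain, such as file name suffixes like _windows.go.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// includedFor reports whether a file carrying the given //go:build line
// would be selected when exactly the listed build tags (e.g. GOOS, GOARCH) are set.
func includedFor(line string, tags ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	active := map[string]bool{}
	for _, t := range tags {
		active[t] = true
	}
	// Eval asks the callback for each tag mentioned in the expression.
	return expr.Eval(func(tag string) bool { return active[tag] }), nil
}

func main() {
	line := "//go:build windows && amd64"
	for _, tags := range [][]string{{"linux", "amd64"}, {"windows", "amd64"}} {
		ok, err := includedFor(line, tags...)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%v -> included: %v\n", tags, ok)
	}
}
```

With this constraint, the file is excluded for the linux/amd64 tag set and included for windows/amd64, so whether a dependency's code reaches the binary can differ per target platform.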

I hope that helps clarify the output you get from these various commands and helps you use them effectively in case you need to investigate a license finding in an ORT report.