oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0

GoMod: Dependencies with tags containing the module name and the version are not resolved correctly #6626

Closed malmor closed 1 year ago

malmor commented 1 year ago

Hey, we have an issue with the GoMod package manager during the scan phase.

Could not resolve provenance for package 'Go::github.com/aws/aws-sdk-go-v2/config:1.18.3':
IOException: Could not resolve provenance for 'Go::github.com/aws/aws-sdk-go-v2/config:1.18.3' for source code origins [VCS, ARTIFACT].
Resolution of VCS failed with: IOException: Could not resolve revision for package 'Go::github.com/aws/aws-sdk-go-v2/config:1.18.3' with VcsInfo(type=Git, url=https://github.com/aws/aws-sdk-go-v2.git, revision=v1.18.3, path=config):
Discarding revision 'v1.18.3' because the requested VCS path 'config' does not exist.

Our setup

Our project has a dependency on the module config (and some more) from aws-sdk-go-v2:

go.mod

```
module my.company

go 1.19

require (
	github.com/aws/aws-lambda-go v1.35.0
	github.com/aws/aws-sdk-go v1.44.150
	github.com/aws/aws-sdk-go-v2 v1.17.1
	github.com/aws/aws-sdk-go-v2/config v1.18.3
	github.com/aws/aws-sdk-go-v2/feature/dynamodb/attributevalue v1.10.6
	github.com/aws/aws-sdk-go-v2/service/dynamodb v1.17.7
	github.com/aws/aws-sdk-go-v2/service/s3 v1.29.4
	github.com/google/go-cmp v0.5.8
	golang.org/x/exp v0.0.0-20230105202349-8879d0199aa3
)

require github.com/patrickmn/go-cache v2.1.0+incompatible // indirect

require (
	github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.4.9 // indirect
	github.com/aws/aws-sdk-go-v2/credentials v1.13.3 // indirect
	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.12.19 // indirect
	github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.25 // indirect
	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.19 // indirect
	github.com/aws/aws-sdk-go-v2/internal/ini v1.3.26 // indirect
	github.com/aws/aws-sdk-go-v2/internal/v4a v1.0.16 // indirect
	github.com/aws/aws-sdk-go-v2/service/dynamodbstreams v1.13.26 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.9.10 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.1.20 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/endpoint-discovery v1.7.19 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.19 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.13.19 // indirect
	github.com/aws/aws-sdk-go-v2/service/sso v1.11.25 // indirect
	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.13.8 // indirect
	github.com/aws/aws-sdk-go-v2/service/sts v1.17.5 // indirect
	github.com/aws/smithy-go v1.13.4 // indirect
	github.com/jmespath/go-jmespath v0.4.0 // indirect
	github.com/satori/go.uuid v1.2.1-0.20181028125025-b2ce2384e17b // indirect
)
```

ORT successfully detects the package, but when running the scan phase it does not use the correct Git tag to download the source code from. The analyze phase produces the following analyzer-result (removed unrelated information):

---
analyzer:
  result:
    packages:
    - id: "Go::github.com/aws/aws-sdk-go-v2/config:1.18.3"
      purl: "pkg:golang/github.com%2Faws%2Faws-sdk-go-v2%2Fconfig@1.18.3"
      declared_licenses: []
      declared_licenses_processed: {}
      description: ""
      homepage_url: ""
      binary_artifact:
        url: ""
        hash:
          value: ""
          algorithm: ""
      source_artifact:
        url: ""
        hash:
          value: ""
          algorithm: ""
      vcs:
        type: "Git"
        url: "https://github.com/aws/aws-sdk-go-v2.git"
        revision: "v1.18.3"
        path: "config"
      vcs_processed:
        type: "Git"
        url: "https://github.com/aws/aws-sdk-go-v2.git"
        revision: "v1.18.3"
        path: "config"

One can see that type, url and path are correct - but the revision is not. Because this module is part of a multi-module repository, the revision should be the combination of path and version (see below).

The scan phase finds multiple Git tags that contain the version part - and throws an IOException here:

14:27:54.755 [DefaultDispatcher-worker-14] INFO  org.ossreviewtoolkit.downloader.VersionControlSystem - No Git revision for package 'github.com/aws/aws-sdk-go-v2/config' and version '1.18.3' found: IOException: Multiple matching tags found for version '1.18.3': [service/databasemigrationservice/v1.18.3, service/cognitoidentityprovider/v1.18.3, service/kms/v1.18.3, service/fms/v1.18.3, service/storagegateway/v1.18.3, service/ecs/v1.18.3, service/docdb/v1.18.3, service/robomaker/v1.18.3, service/snowball/v1.18.3, service/sqs/v1.18.3, service/efs/v1.18.3, service/workspaces/v1.18.3, service/athena/v1.18.3, service/wellarchitected/v1.18.3, service/dataexchange/v1.18.3, service/location/v1.18.3, service/sts/v1.18.3, service/dynamodb/v1.18.3, service/elasticsearchservice/v1.18.3, service/ecr/v1.18.3, service/workmail/v1.18.3, service/datasync/v1.18.3, service/transfer/v1.18.3, service/cloudfront/v1.18.3, service/rds/v1.18.3, service/outposts/v1.18.3, service/wafv2/v1.18.3, service/sagemakerruntime/v1.18.3, service/batch/v1.18.3, service/networkfirewall/v1.18.3, service/comprehend/v1.18.3, service/lightsail/v1.18.3, service/iotwireless/v1.18.3, service/pricing/v1.18.3, config/v1.18.3, service/ivs/v1.18.3, service/pinpoint/v1.18.3, service/redshiftdata/v1.18.3, service/eventbridge/v1.18.3, service/directconnect/v1.18.3, service/elasticloadbalancingv2/v1.18.3, service/rekognition/v1.18.3, service/shield/v1.18.3, service/customerprofiles/v1.18.3, service/mediaconnect/v1.18.3, service/auditmanager/v1.18.3, service/devopsguru/v1.18.3, service/neptune/v1.18.3, service/servicediscovery/v1.18.3, service/appstream/v1.18.3, service/cloudwatch/v1.18.3, service/iam/v1.18.3, service/lookoutmetrics/v1.18.3, service/secretsmanager/v1.18.3, service/sns/v1.18.3, service/detective/v1.18.3, service/costexplorer/v1.18.3, service/organizations/v1.18.3]. Please add a curation.

The correct Git tag config/v1.18.3 is there - but ORT does not use it.
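To illustrate the selection that would be expected here, the following is a minimal sketch in Go (not ORT's Kotlin code). The function `pickTag` and its parameters are hypothetical names made up for this example. It assumes that the VCS path and the version are both known, prefers the exact `<path>/v<version>` tag, and only falls back to suffix matching when that yields a single candidate.

```go
package main

import (
	"fmt"
	"strings"
)

// pickTag is a hypothetical helper, not part of ORT. Given all tag names of a
// repository, it returns the tag for the module located at vcsPath with the
// given version (written without the leading "v").
func pickTag(tags []string, vcsPath, version string) (string, error) {
	want := "v" + version
	if vcsPath != "" {
		want = strings.Trim(vcsPath, "/") + "/" + want
	}

	// Prefer the exact "<path>/v<version>" tag used by multi-module repositories.
	for _, tag := range tags {
		if tag == want {
			return tag, nil
		}
	}

	// Otherwise collect tags that merely end with the version and accept the
	// result only if it is unambiguous.
	var candidates []string
	for _, tag := range tags {
		if tag == "v"+version || strings.HasSuffix(tag, "/v"+version) {
			candidates = append(candidates, tag)
		}
	}
	if len(candidates) == 1 {
		return candidates[0], nil
	}
	return "", fmt.Errorf("found %d candidate tags for version %q", len(candidates), version)
}

func main() {
	// A small subset of the tags listed in the log above.
	tags := []string{"service/sts/v1.18.3", "config/v1.18.3", "service/iam/v1.18.3"}
	tag, err := pickTag(tags, "config", "1.18.3")
	fmt.Println(tag, err) // prints: config/v1.18.3 <nil>
}
```

With the path `config` the exact tag wins even though several other tags end in `v1.18.3`. Without a usable path the sketch reports the ambiguity instead of guessing.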

Go modules with name and version

For Go modules that are part of a multi-module repository it is common to use Git tags that contain the module name and the semantic version: https://github.com/golang/go/wiki/Modules#publishing-a-release

In this example the Git tag that should be used for this version is config/v1.18.3: https://github.com/aws/aws-sdk-go-v2/releases/tag/config%2Fv1.18.3
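The naming convention can be sketched as a small Go function. This is only an illustration under stated assumptions: `tagForModule` is a hypothetical name, the repository root is assumed to be known already, and the sketch ignores the case where a major version suffix such as `/v2` in the module path does not correspond to a subdirectory.

```go
package main

import (
	"fmt"
	"strings"
)

// tagForModule is a hypothetical helper. It derives the Git tag expected by
// the convention above from the repository root, the module path and the
// module version (including the leading "v").
func tagForModule(repoRoot, modulePath, version string) (string, error) {
	// A module at the repository root is tagged with the bare version.
	if modulePath == repoRoot {
		return version, nil
	}

	prefix := repoRoot + "/"
	if !strings.HasPrefix(modulePath, prefix) {
		return "", fmt.Errorf("module %q is not inside repository %q", modulePath, repoRoot)
	}

	// A module in a subdirectory is tagged "<subdirectory>/<version>".
	return strings.TrimPrefix(modulePath, prefix) + "/" + version, nil
}

func main() {
	root := "github.com/aws/aws-sdk-go-v2"
	// Module paths and versions taken from the go.mod file in this issue.
	modules := []struct{ path, version string }{
		{root, "v1.17.1"},
		{root + "/config", "v1.18.3"},
		{root + "/service/s3", "v1.29.4"},
	}
	for _, m := range modules {
		tag, err := tagForModule(root, m.path, m.version)
		fmt.Println(m.path, "->", tag, err)
	}
}
```

For the `config` module this yields `config/v1.18.3`, the tag linked above. The other two results are what the convention predicts and were not checked against the repository.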

Possible workaround

One way to fix this issue is by creating curations for the vcs:revision part:

curations:
  packages:
  - id: "Go::github.com/aws/aws-sdk-go-v2/config:1.18.3"
    curations:
      comment: "Fix revision because the package is part of a multi-module repository"
      vcs:
        revision: "config/v1.18.3"

The downside of this approach is that one needs to create such a curation for every single version. This results in a huge amount of manual work because we update dependencies very often.

And using a single curation for all versions (like id: "Go::github.com/aws/aws-sdk-go-v2/config") does not seem to be possible because the version needs to be part of the revision value. (?)

Question

Would it be possible to implement native support for this kind of package in ORT? As far as I can tell, the necessary information should be available after the analyze phase. This would allow us to include these packages in our NOTICE files.

Kind Regards, Malte

sschuberth commented 1 year ago

Our filterVersionNames function should already support finding the "config/v1.18.3" tag as a revision candidate for a project called "config". Maybe we simply forgot to pass along the project name to our guessing logic?

sschuberth commented 1 year ago

Which revision of ORT are you using @malmor?

malmor commented 1 year ago

> Which revision of ORT are you using @malmor?

Sorry, forgot to put that information into the issue description. 🤦‍♂️

We are running revision 37d7e05e98363793e024081bcd31ce4777364d4f - which is three weeks old.

sschuberth commented 1 year ago

> which is three weeks old.

There have been quite a bunch of changes since then, esp. GoMod-related ones. Please first try again with the latest version from today.

malmor commented 1 year ago

Thanks for the hint!

I just ran ORT again using revision d949be19c1c99f8abd07d30bbe5d2ce480744bed (released yesterday evening) - but this results in an analysis error:

Analyzing project path:
    /builds/my/project-example
Found 1 GoMod definition file(s) at:
    go.mod
Found 1 definition file(s) from 1 package manager(s) in total.
14:22:13.302 [DefaultDispatcher-worker-2] ERROR org.ossreviewtoolkit.analyzer.PackageManager - Resolving GoMod dependencies for path 'go.mod' failed with: MissingKotlinParameterException: Instantiation of [simple type, class org.ossreviewtoolkit.analyzer.managers.ModuleInfo] value failed for JSON property GoMod due to missing (therefore NULL) value for creator parameter goMod which is a non-nullable type
 at [Source: (String)"{
\u0009"Path": "my.company/my/project-example",
\u0009"Main": true,
\u0009"Dir": "/builds/my/project-example",
\u0009"GoMod": "/builds/my/project-example/go.mod",
\u0009"GoVersion": "1.19"
}
{
\u0009"Path": "github.com/BurntSushi/toml",
\u0009"Version": "v0.3.1",
\u0009"Time": "2018-08-15T10:47:33Z",
\u0009"Indirect": true,
\u0009"GoMod": "/tmp/ort-GoMod4423785630482469950/pkg/mod/cache/download/github.com/!burnt!sushi/toml/@v/v0.3.1.mod"
}
{
\u0009"Path": "github.com/aws/aws-lambda-go","[truncated 15634 chars]; line: 269, column: 1] (through reference chain: org.ossreviewtoolkit.analyzer.managers.ModuleInfo["GoMod"])
Writing analyzer result to '/builds/my/project-example/output/analyzer-result.yml'.
The analysis took 1m 52.467218401s.
Found 1 project(s) and 0 package(s) in total (not counting excluded ones).
Applied 0 curation(s) from 3 provider(s).
Resolved issues: 0 errors, 0 warnings, 0 hints.
Unresolved issues: 1 error, 0 warnings, 0 hints.

This is the diff between the two revisions. Do you want me to open a separate issue for this problem?

sschuberth commented 1 year ago

> Do you want me to open a separate issue for this problem?

No, this now sounds to me like a duplicate of https://github.com/oss-review-toolkit/ort/issues/6615 which @fviernau is already looking into.

fviernau commented 1 year ago

Looking at the analyzer result in the issue's description, I see that it contains:

revision: "v1.18.3"

I haven't verified it, but I'm certain that the latest HEAD of ORT's master branch no longer outputs this revision, as that's exactly (one part of) what my recent GoMod improvements were about. It should now always output a SHA1, never a tag name, so the problem should be gone. As part of these changes I introduced #6615, but in my view that's something different.

The problem illustrated here should be fixed, so closing this as fixed. @malmor it'd be great if you could verify this with latest ORT.

fviernau commented 1 year ago

@malmor I agree that https://github.com/oss-review-toolkit/ort/issues/6626#issuecomment-1458286182 is actually #6615. Would you be able to share the files go.mod + main.go for reproducing this under #6615?

malmor commented 1 year ago

Hey @fviernau, fantastic to hear - looking forward to testing this with the latest ORT version after #6615 is fixed!

I will join the discussion in #6615 and try to provide our example - just need to refactor it first because it contains some internal dependencies hosted in our own infrastructure.

Thanks!

fviernau commented 1 year ago

> I will join the discussion in #6615 and try to provide our example - just need to refactor it first because it contains some internal dependencies hosted in our own infrastructure.

Great, thanks! Note that the go.mod file you provide for reproducing this should ideally be consistent with go mod tidy, e.g. have unused dependencies stripped (which is why I also asked for main.go).