dependabot / dependabot-core

🤖 Dependabot's core logic for creating update PRs.
https://docs.github.com/en/code-security/dependabot
MIT License
4.74k stars 1.03k forks source link

Support for grouped update PRs that group across directories with same ecosystem #7547

Closed mx-psi closed 5 months ago

mx-psi commented 1 year ago

Is there an existing issue for this?

Feature description

On open-telemetry/opentelemetry-collector-contrib we have 200+ Go modules that depend on each other. We currently have a manual system to group all the updates of these modules into a single PR that bumps all of these dependencies (see e.g. open-telemetry/opentelemetry-collector-contrib/pull/24175), since the current configuration we use creates 80+ PRs every single week (see e.g. the PRs created yesterday, July 10th).

The recent grouped updates feature is a step in the direction that would make Dependabot more usable for our repository, but it only allows grouping updates per Go module. We would instead want to group all updates across all modules in a single PR, both for operational reasons (we don't have enough people to review all PRs) as well as to avoid manual changes (since some Go modules depend on other Go modules in the repository, bumping a dependency in one of them may imply bumping them also in others; making all updates in one go avoids having to do manual updates on Dependabot PRs).

Is this a supported feature of the grouped updates? If not, could it be considered as a feature request?

spencerschrock commented 1 year ago

+1 ossf/scorecard has a similar problem with both our Go and Docker ecosystems.

We have 7 Dockerfiles defined in our dependabot.yml, all of which depend on the same base image (golang:1.19 currently). When there's an update, we get 7 PRs, 1 for each Dockerfile. If these could be grouped that would be a nice QoL improvement. Here's an example of the 7 dependabot PRs that I closed and submitted a manual PR instead.

jmhbnz commented 1 year ago

+1 - The etcd project are currently resorting to manual pr's each week to group updates due to a multi module structure and a need to bump a dependency in all places in one pr. I believe a grouping across all modules in the project would resolve this.

abdulapopoola commented 1 year ago

To provide an update: we've started to investigate how to make this happen; tagging @Nishnha and @honeyankit

abdulapopoola commented 8 months ago

Some update: This should be heading to beta within a month or less. Tagging @jurre , @jakecoffman , and @Nishnha

jzabroski commented 8 months ago

Oh heck yeah!! Thank you for pinning this issue as well. I'd love to beta test this if you tell me how. I have tons of dependencies in my project and would make great use of groups.

jzabroski commented 7 months ago

@abdulapopoola How far off are we now from being able to use this? It has technically been a month :) Can't wait!

abdulapopoola commented 7 months ago

@jzabroski sorry about that; we ran into some issues. But this should be coming next week :)

abdulapopoola commented 7 months ago

Multi-directory support is here!; open to hearing your thoughts as you try out the beta!

Tagging @carlincherry

billinghamj commented 7 months ago

We've just had our first one merge, seems to work great :) Looking forward to our first updates to GCP terraform providers

ivanvc commented 7 months ago

Hi @abdulapopoola, what would be a good channel to provide feedback regarding this beta feature? I'm testing it out in a Go project with multiple modules (go.mod files), but it seems like it's not working as expected.

billinghamj commented 7 months ago

@abdulapopoola I do have a case where two PRs were created some time apart, but it seems like they should have been in one PR.

Both were updating a Terraform provider (mongodb/mongodbatlas, from 1.15.3 to 1.16.0). In two folders both included in the same directories list

It's a private repo but you're welcome to access anything on the repo or dependabot logs etc 👌

ex-nihil commented 7 months ago

I tested out the new directories together with groups.

Using the package-ecosystem: npm it results in the same dependency being checked multiple times. For instance, the jest dependency was requested a total of 76 times (before timeout):

GET https://registry.yarnpkg.com:443/jest/29.7.0

Regrettably, these redundant requests are causing timeouts for Dependabot, which is limited to a 45-minute timeframe.

EDIT: Pretty sure I'm running into #8008. EDIT2: Going back to declaring each directory in isolation completed all updates in ~5 min total.

spencerschrock commented 7 months ago

Tested directories with the 8 docker images in github.com/ossf/scorecard, and the first round of PRs only updated one of the docker images (/Dockerfile) and not the other 6 or 7.

distroless/base

Only updated /Dockerfile (https://github.com/ossf/scorecard/pull/4064), while there were 6 more occurrences in specified directories

golang

Again, only updated /Dockerfile (https://github.com/ossf/scorecard/pull/4063), while there were 7 more occurrences in specified directories

Looking at the log (https://github.com/ossf/scorecard/network/updates/821852346), it seems like it knows it needs to update more, but didnt include it in the PR?

updater | 2024/04/30 20:37:38 INFO <job_821852346> Finished job processing
updater | 2024/04/30 20:37:38 INFO Results:
updater | +-----------------------------------------------------------+
updater | |            Changes to Dependabot Pull Requests            |
updater | +---------+-------------------------------------------------+
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | | created | golang ( from `c4fb952` to `d5302d4` )          |
updater | | created | distroless/base ( from `29da700` to `e238d40` ) |
updater | +---------+-------------------------------------------------+
billinghamj commented 7 months ago

Similar here actually - it definitely doesn't seem to be working for our Terraform directories. We've just had it open a couple of PRs making updates in 1 directory, when they should actually have updated 6

I have seen it work properly once with Go at least

jzabroski commented 7 months ago

OK, after properly configuring my dependabot.yml, I see a unique error in my logs that only seemed to start happening with the introduction of grouped updates. What's weird is that after I reverted my dependabot.yml, the audit logs changed in the Insights -> Dependency Graph -> Dependabot UI to suggest there was never any errors. It's almost like the audit logs are tightly coupled to a version of the dependabot.yml somehow.

See the logs here https://github.com/fluentmigrator/fluentmigrator/network/updates/821783403

It seems like AWSSDK.S3 was fetched as 3.7.0.4. AWSSDK.S3 is a transitive dependency of Snowflake.Data (C# db driver for Snowflake). My project does not directly reference AWSSDK.S3 anywhere.

proxy | 2024/04/30 16:40:42 [396] GET https://api.nuget.org:443/v3-flatcontainer/awssdk.s3/3.7.0.4/awssdk.s3.3.7.0.4.nupkg

and then later in the log, NuGet.Versioning.SemanticVersion.Parse cannot parse this dependency's version, presumably because it's not in semver format but legacy nuget format. What's interesting is none of my dependencies are directly against the AWSKSDK.S3. It is a transitive dependency of Snowflake.Data.

EDIT: I will try tonight a version of dependabot.yml that excludes SNowflake dependencies from dependabot config. Still seems like a bug with this feature.

Details

updater | 2024/04/30 16:40:57 ERROR Block argument of NuGetConfigCredentialHelpers::patch_nuget_config_for_action causes an exception Discovering build files in workspace [/home/dependabot/dependabot-updater/repo/src/FluentMigrator.DotNet.Cli]. updater | No dotnet-tools.json file found. updater | No global.json file found. updater | Discovering projects beneath [src/FluentMigrator.DotNet.Cli]. updater | No packages.config file found. updater | Unhandled exception: System.ArgumentException: '3.7.0.4' is not a valid version string. (Parameter 'value') updater | at NuGet.Versioning.SemanticVersion.Parse(String value) in /opt/nuget/lib/NuGet.Client/src/NuGet.Core/NuGet.Versioning/SemanticVersionFactory.cs:line 22 updater | at NuGetUpdater.Core.Discover.SdkProjectDiscovery.GetTransitiveDependencies(String repoRootPath, String projectPath, ImmutableArray`1 tfms, ImmutableArray`1 directDependencies, Logger logger) in /opt/nuget/lib/NuGetUpdater/NuGetUpdater.Core/Discover/SdkProjectDiscovery.cs:line 113 updater | at NuGetUpdater.Core.Discover.SdkProjectDiscovery.DiscoverAsync(String repoRootPath, String workspacePath, String projectPath, Logger logger) in /opt/nuget/lib/NuGetUpdater/NuGetUpdater.Core/Discover/SdkProjectDiscovery.cs:line 69 updater | at NuGetUpdater.Core.Discover.DiscoveryWorker.RunForProjectPathsAsync(String repoRootPath, String workspacePath, IEnumerable`1 projectPaths) in /opt/nuget/lib/NuGetUpdater/NuGetUpdater.Core/Discover/DiscoveryWorker.cs:line 160 updater | at NuGetUpdater.Core.Discover.DiscoveryWorker.RunForDirectoryAsnyc(String repoRootPath, String workspacePath) in /opt/nuget/lib/NuGetUpdater/NuGetUpdater.Core/Discover/DiscoveryWorker.cs:line 125 updater | at NuGetUpdater.Core.Discover.DiscoveryWorker.RunAsync(String repoRootPath, String workspacePath, String outputPath) in /opt/nuget/lib/NuGetUpdater/NuGetUpdater.Core/Discover/DiscoveryWorker.cs:line 59 updater | at NuGetUpdater.Cli.Commands.DiscoverCommand.<>c.<b__4_0>d.MoveNext() in /opt/nuget/lib/NuGetUpdater/NuGetUpdater.Cli/Commands/DiscoverCommand.cs:line 30 updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Invocation.AnonymousCommandHandler.InvokeAsync(InvocationContext context) updater | at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<b__0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass17_0.<b__0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass12_0.<b__0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass22_0.<b__0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass19_0.<b__0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<b__18_0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass16_0.<b__0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<b__5_0>d.MoveNext() updater | --- End of stack trace from previous location --- updater | at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass8_0.<b__0>d.MoveNext():

magnusjtvisma commented 7 months ago

For nodejs/npm this seems to work well. I'm having some problems with nuget though, I'm seeing this error for a lot of different packages:

updater | 2024/04/30 08:08:36 ERROR Error processing MySqlConnector (Dependabot::DependabotError) updater | 2024/04/30 08:08:36 ERROR FileUpdater failed updater | 2024/04/30 08:08:36 ERROR /home/dependabot/dependabot-updater/lib/dependabot/dependency_change_builder.rb:69:in `run'

I'm also experiencing timeouts on some larger repositories. Perhaps that's related to the issue mentioned by a comment above, or maybe it's just because the repo is huge.

carlincherry commented 7 months ago

Hi @abdulapopoola, what would be a good channel to provide feedback regarding this beta feature? I'm testing it out in a Go project with multiple modules (go.mod files), but it seems like it's not working as expected.

Hi @ivanvc thank you for raising this to the team; this is unexpected behavior and we've created a https://github.com/dependabot/dependabot-core/issues/9664 which our team will prioritize fixing soon.

carlincherry commented 7 months ago

Hi @billinghamj we've created a bug report for this issue which we'll prioritize fixing soon. Thank you!

carlincherry commented 7 months ago

@spencerschrock we added Docker to this bug report as well. Thank you!

carlincherry commented 7 months ago

Hi @jzabroski I'm sorry you experienced this behavior; this is a known issue; our team is actively investigating and we'll update here as we investigate.

carlincherry commented 7 months ago

updater | 2024/04/30 08:08:36 ERROR /home/dependabot/dependabot-updater/lib/dependabot/dependency_change_builder.rb:69:in `run'

Hi @magnusjtvisma can you please give us more details about the error you're seeing? We suspect that this is a nuget issue and would like more details about the error logging to confirm. Please file as much information as you can in an issue as a bug report. Thank you!

jzabroski commented 6 months ago

Hi @jzabroski I'm sorry you experienced this behavior; this is a known issue; our team is actively investigating and we'll update here as we investigate.

The problem is here: https://github.com/dependabot/dependabot-core/blob/da53246feba7084a36add3cef17970ebfe0017fd/nuget/helpers/lib/NuGetUpdater/NuGetUpdater.Core/Discover/SdkProjectDiscovery.cs#L117

The code is assuming semantic versioning all-the-way down. At any rate, given this bug seems to be AWSSDK.S3, and is a Top 100 nuget package with 787,013,240 total downloads, I expect others will definitely have this problem.

I believe you want this version: https://github.com/NuGet/NuGet.Client/blob/0f32917aaba18c2db765fc7ad5bc95ebf12ec58d/src/NuGet.Core/NuGet.Versioning/NuGetVersionFactory.cs#L36

If you see the documentation, NuGetVersion is a superset of SemanticVersion, so it will properly handle these edge cases: (https://github.com/NuGet/NuGet.Client/blob/0f32917aaba18c2db765fc7ad5bc95ebf12ec58d/src/NuGet.Core/NuGet.Versioning/NuGetVersion.cs#L9C1-L13C19

brettfo commented 6 months ago

Fix for the 4-part version is in #9689.

billinghamj commented 6 months ago

I have had a Dependabot PR work nicely with grouping + multiple directories today btw - with both Terraform and Go. Not sure if that's unexpected?

Screenshot 2024-05-08 at 18 04 33
sbueringer commented 6 months ago

@carlincherry Hi, not sure if this behavior is expected. We have the following situation:

dependabot.yaml

  groups:
    all-go-mod-patch-and-minor:
      patterns: [ "*" ]
      update-types: [ "patch", "minor" ]

(full config: https://github.com/kubernetes-sigs/cluster-api/blob/7b503211947bbf5b6220a66e417c49a5fd8705ab/.github/dependabot.yaml#L34)

Initially we had this PR open:

We intentionally didn't merge this PR because we had some other change we wanted to make first

Then we merged this PR (which also updated the dependabot.yaml):

This triggered the run of dependabot (https://github.com/kubernetes-sigs/cluster-api/network/updates/828643547)

This run ignored the existing PR and created new PRs, from the logs:

updater | 2024/05/15 14:46:16 INFO <job_828643547> Detected existing pull request for 'all-go-mod-patch-and-minor'.
updater | 2024/05/15 14:46:16 INFO <job_828643547> Deferring creation of a new pull request. The existing pull request will update in a separate job.
...

updater | 2024/05/15 14:48:36 INFO Results:
updater | +--------------------------------------------------------------------+
updater | |                Changes to Dependabot Pull Requests                 |
updater | +---------+----------------------------------------------------------+
updater | | created | k8s.io/apiextensions-apiserver ( from 0.30.0 to 0.30.1 ) |
updater | | created | k8s.io/cluster-bootstrap ( from 0.30.0 to 0.30.1 )       |
updater | | created | k8s.io/apimachinery ( from 0.30.0 to 0.30.1 )            |
updater | | created | k8s.io/client-go ( from 0.30.0 to 0.30.1 )               |
updater | | created | k8s.io/kubectl ( from 0.30.0 to 0.30.1 )                 |
updater | | created | k8s.io/api ( from 0.30.0 to 0.30.1 )                     |
updater | | created | k8s.io/apiserver ( from 0.30.0 to 0.30.1 )               |
updater | | created | k8s.io/component-base ( from 0.30.0 to 0.30.1 )          |
updater | | created | k8s.io/apiextensions-apiserver ( from 0.30.0 to 0.30.1 ) |
updater | | created | k8s.io/apimachinery ( from 0.30.0 to 0.30.1 )            |
updater | | created | k8s.io/api ( from 0.30.0 to 0.30.1 )                     |
updater | | created | k8s.io/client-go ( from 0.30.0 to 0.30.1 )               |
updater | +---------+----------------------------------------------------------+

Ideally this would just update the existing PR of the group instead of opening a new one. It was also surprising that dependabot then didn't create a second PR for the same group with the remaining dependencies. It created a single PR for each dependency instead.

P.S. I think afterwards once the old group PR went away dependabot closed the single PRs and opened a new group PR. But at this point it was all a bit hard to follow.

jzabroski commented 5 months ago

@carlincherry I think I'm a bit confused on how this feature is supposed to work now that the bugs blocking me seems to be addressed (thanks @brettfo).

As an example, npgql 8.0.1 had a security vulnerability and is referenced in two separate directories:

  1. src/FluentMigrator.Runner.Postgres
  2. test/

I'm only seeing a PR from dependabot as:

[fluentmigrator/fluentmigrator] Bump the third-party-dependencies group across 1 directory with 13 updates (PR #1812)

Is this the expected behavior? If so, it doesn't match how Visual Studio package manager works. In any event, if dependabot will support the forthcoming new MSBuild slx solution file format, then I probably don't need this feature right now and can use slnx to teach dependabot my dependencies tree and to update multiple references to the same version.

jzabroski commented 5 months ago

@carlincherry Actually, think I figured out what is happening:

[fluentmigrator/fluentmigrator] Bump the third-party-dependencies group across 1 directory with 15 updates (PR #1818)

  1. It is incorrectly reporting the number of directories as 1 when it is really 3. This gave me the impression it wasn't working but it actually mostly is working.
  2. For some reason, likely due to TargetFrameworks, it is not upgrading dependencies in "src/FluentMigrator.DotNet.Cli" - this executable only targets modern frameworks, whereas Fluentmigrator.Conslole targets net48. I'll try to look through the dependabot .NET code again to see why and revert back what I find.
Nishnha commented 5 months ago

Hi folks, this feature is now live! 🎉 https://github.blog/changelog/2024-06-25-simplified-dependabot-yml-configuration-with-multi-directory-key-directories-and-wildcard-glob-support/

If you encounter any issues with this please reach out to GitHub support.

marcofranssen commented 1 month ago

I have had a Dependabot PR work nicely with grouping + multiple directories today btw - with both Terraform and Go. Not sure if that's unexpected?

Screenshot 2024-05-08 at 18 04 33

@billinghamj would you mind sharing the terraform configuration? I'm looking into grouping all PRs for similar module updates. E.g. I have a repo with many modules that all depend on same upstream modules. Now I get a PR for each of my own modules, but I would love to have it grouped by upstream depedency so I get one PR that updates them in all my modules.