dependabot / dependabot-core

🤖 Dependabot's core logic for creating update PRs.
https://docs.github.com/en/code-security/dependabot
MIT License
4.65k stars 1k forks source link

Drupal project dependabot runs are very slow #2999

Open driskell opened 3 years ago

driskell commented 3 years ago

Package manager/ecosystem Composer v2

Manifest contents prior to update

{
    "name": "drupal/recommended-project",
    "description": "Project template for Drupal 8 projects with a relocated document root",
    "type": "project",
    "license": "GPL-2.0-or-later",
    "homepage": "https://www.drupal.org/project/drupal",
    "support": {
        "docs": "https://www.drupal.org/docs/user_guide/en/index.html",
        "chat": "https://www.drupal.org/node/314178"
    },
    "repositories": [
        {
            "type": "composer",
            "url": "https://packages.drupal.org/8"
        }
    ],
    "require": {
        "composer/installers": "^1.2",
        "drupal/block_permissions": "1.0.0",
        "drupal/core-composer-scaffold": "^8.8",
        "drupal/core-project-message": "^8.8",
        "drupal/core-recommended": "^8.8"
    },
    "require-dev": {
        "drupal/core-dev": "^8.8"
    },
    "conflict": {
        "drupal/drupal": "*"
    },
    "minimum-stability": "dev",
    "prefer-stable": true,
    "config": {
        "sort-packages": true
    },
    "extra": {
        "drupal-scaffold": {
            "locations": {
                "web-root": "web/"
            }
        },
        "installer-paths": {
            "web/core": [
                "type:drupal-core"
            ],
            "web/libraries/{$name}": [
                "type:drupal-library"
            ],
            "web/modules/contrib/{$name}": [
                "type:drupal-module"
            ],
            "web/profiles/contrib/{$name}": [
                "type:drupal-profile"
            ],
            "web/themes/contrib/{$name}": [
                "type:drupal-theme"
            ],
            "drush/Commands/contrib/{$name}": [
                "type:drupal-drush"
            ],
            "web/modules/custom/{$name}": [
                "type:drupal-custom-module"
            ],
            "web/themes/custom/{$name}": [
                "type:drupal-custom-theme"
            ]
        },
        "drupal-core-project-message": {
            "include-keys": [
                "homepage",
                "support"
            ],
            "post-create-project-cmd-message": [
                "<bg=blue;fg=white>                                                         </>",
                "<bg=blue;fg=white>  Congratulations, you’ve installed the Drupal codebase  </>",
                "<bg=blue;fg=white>  from the drupal/recommended-project template!          </>",
                "<bg=blue;fg=white>                                                         </>",
                "",
                "<bg=yellow;fg=black>Next steps</>:",
                "  * Install the site: https://www.drupal.org/docs/8/install",
                "  * Read the user guide: https://www.drupal.org/docs/user_guide/en/index.html",
                "  * Get support: https://www.drupal.org/support",
                "  * Get involved with the Drupal community:",
                "      https://www.drupal.org/getting-involved",
                "  * Remove the plugin that prints this message:",
                "      composer remove drupal/core-project-message"
            ]
        }
    }
}

Updated dependency drupal/block_permissions 1.1.0 is available

What you expected to see, versus what you actually saw Fast check and update

Images of the diff or a link to the PR, issue or logs Actual result is extremely slow. It takes upwards of 30 seconds to process the update.

Across our main project where we have MANY Drupal modules, it easily takes the dependabot run to approximately 75 minutes. This is beyond the timeout set in GitHub so we can't use GitHub to do this currently due to issue #2416

In #2416 I started some investigations and came to the below conclusions. Some are not fixable by dependabot but I wanted to raise them here so the team is aware and could follow the issues / PR at the composer side if they wanted to.

  1. -If a Drupal module depends on drupal/core in a large range, such as ^8 || ^9, it can cause composer to add all versions of drupal/core to its pool of packages that it uses to resolve updates. It causes hundreds of thousands of conflict rules to need to be generated and processed. Although processing is fast, generating them is not. It adds many seconds to a run.- Update: Fixed in 2.2 now that composer/composer#9261 and composer/composer#9620 are merged.
  2. drupal/core has many replace entries for core modules. Because it's rarely a root requirement and is brought in by drupal/core-recommended, during an update when the module that needs updated depends on drupal/core it unlocks it, and then begins to attempt to download package information for it's replaces. This generates a lot of network activity, mostly 404s because these packages do not exist. Update: Waiting on composer/composer#9619 but it is highly complex.
  3. Packages are actually downloaded and installed once the update is processed by dependabot, which seems largely unnecessary as they don't really impact the update run. Composer plugin packages might as they could hook into events etc. but not general libraries. So unnecessary bandwidth usage too. Update: I raised a PR at #2998 but abandoned it originally as https://github.com/wikimedia/composer-merge-plugin wasn't working, but I did realise this doesn't work anyway. So I have raised a new PR at #4635 for discussion.

I experimented with trying to get caching to work better but the modifications to composer are just not really acceptable to submit to them (in my opinion) so I dug further to work out real solutions. There's already a PR that helps with installs, composer/composer#9261, which hopefully hits Composer 2.1 but in my testing did not have a major impact. So I raised two new PR with composer/composer#9619 and composer/composer#9620 for discussion that significantly speed up the processing of updates by targeting items 1 and 2 above directly.

Overall got my full Drupal run time with all the above to 5 minutes.

markdorison commented 3 years ago

We are experiencing this issue on a couple of our Drupal codebases. I am monitoring all of the referenced issues and pull requests and I will add any additional clues that we uncover.

markdorison commented 2 years ago

We tested this again now that the below PRs have been merged into Composer but our Dependabot run is still timing out.

We attempted to ensure that Dependabot was using the latest version of Composer by requiring composer/composer:^2.2.3 in our project, but I am not 100% sure Dependabot will respect that.

jurre commented 2 years ago

I am not 100% sure Dependabot will respect that.

It won't, dependabot is on 2.1.14 of composer. There is an effort being made to upgrade though, you can follow along here: https://github.com/dependabot/dependabot-core/pull/4586.

jurre commented 2 years ago

I've rolled out a version of Dependabot that uses composer 2.2.4 for updates, have you seen any improvement in the time it takes for updates to complete for your projects @markdorison?

markdorison commented 2 years ago

We have two projects where this is occurring. Sadly, they are both still timing out. Is there any info from the Dependabot "update logs" I could provide that would be helpful?

jurre commented 2 years ago

I don't think so unfortunately, I think the problem is that we're configuring composer to download all the dependencies and we should just not. But the last time I tried doing that it broke a couple of our test scenarios and we haven't been able to prioritize digging into it deeper.

I'll make sure to raise it internally and see if we can get some performance work planned soon.

One thing that actually would help a bunch is a public test project that we can reproduce this against

markdorison commented 2 years ago

@jurre I created a public repo that reproduces the issue for us.

driskell commented 2 years ago

Worth noting that composer/composer#9619 is still outstanding and is a peculiar complex one - my last proposal was to not just delete the 1 line but every line in that block. All tests still pass. But it's a major enigma as it's a risky change and it's unclear if the code is there to do something that a test doesn't exist for... That PR will drop the time to process even further as it removes a significant amount of 404 requests to the Drupal repositories for replaced packages.

Just wanted to drop in here that I'm still monitoring that PR and hoping it will move but things take time. Definitely though it'll be super worth optimising elsewhere in the meantime if there's other places that can be optimised. Then if and when composer/composer#9619 lands it should mean extremely fast updates in comparison to when this ticket first opened! Definitely it's way faster in 2.2 though but yeh hopefully more to come.

driskell commented 2 years ago

@jurre I updated #2998 and took the setInstall(false) approach you recommended and got it working so all expected tests pass. There is some discussion to be had though as I think by not installing packages some dependabot code that turns install failures into exceptions is likely not necessary anymore as installs no longer happen.

Once that bit is done we are just waiting on composer/composer#9619 then and from my tests a Drupal run with lots of modules will complete in minutes.

driskell commented 2 years ago

I'm going to rest a bit from this - testing again with default I realise that the Wikimedia merge plugin issue is present even with dependabot currently without any changes, because of setRunScripts(false) - so perhaps the original PR was suitable as it was...

I will restore my branch to what it was and await a response here before I continue further into a rabbit hole!

driskell commented 2 years ago

I raised it again in #4635 for fresh discussion and cleaned it all up.

jurre commented 2 years ago

@driskell your changes should be live, have you noticed an improvement in performance on your projects? From our metrics I'm not seeing a big change, but there might be some outliers that I'm missing (which I can dig into separately using p50/p99 metrics etc, but good to get some anecdata also)

markdorison commented 2 years ago

Our affected projects (including the public repo) are still failing to complete.

jeffwidman commented 2 years ago

Per @driskell over in https://github.com/dependabot/dependabot-core/issues/2416#issuecomment-1246465434:

Drupal now removed the replaced declaration in 9.4.4 so I expect it to run much quicker as no 404 requests clogging the network per package. But I haven’t tested it.

So can this particular issue which is scoped to Drupal be closed?

I don't want this to become a catch-all issue for "composer is slow"... Unfortunately catch-call issues aren't really helpful because while the symptoms may be similar, the root cause may be entirely different.

@markdorison I really appreciate the hard work you did to create a public repo that repro's the issue in https://github.com/dependabot/dependabot-core/issues/2999#issuecomment-1013139098. I don't have time to look at it now, but hopefully we can get to that at some point... but if you're still seeing slowness even after bumping Drupal to 9.4.4 perhaps we should split your problem out as a separate issue?

markdorison commented 1 year ago

I updated the public example repo to require Drupal 9.4.4 and I am still seeing Dependabot errors.

CleanShot 2022-10-31 at 11 27 28@2x
jeffwidman commented 1 year ago

@markdorison thanks for circling back and spending time to put together the example public example. Since it's public, I was able to pull the job runs and it looks like it timed out a bunch of previous days, but the last two days it completed the job, taking ~34 mins the last time.

My guess is that you aren't seeing the job logs because there's 203,980 lines in this job log! 21MB!!

It hit a bunch of update_not_possible errors, which typically indicates the dependency has an updated version available, but that other requirements in the manifest file (both direct and indirect deps) specify requirements that conflict with the newer version... So it's impossible to bump to the newer version. This can often happen if several deps have to be bumped in lock-step, as Dependabot isn't smart enough for most ecosystems to bump multiple things at once.

Looking at the logs, I see 50 update_not_possible errors, so that's probably why things are slow... Because each time it's finding a new version of a dep, then making all the calls to walk the indirect dep tree only to get to the end and realize something conflicts and prevents the update. Versus an up-to-date dep it skips walking the dep tree once it realizes it's up to date. Sometimes Dependabot can complete all 50 in time, sometimes not. But even if we raised the timeout, that wouldn't solve the underlying problem of non-resolvable updates.

I wonder... if you updated a couple some of those manually such that there's only a handful of outdated deps which don't have to be raised in lock-step, would it then start working as expected, and much faster as well?

Given that you can't see the job logs, I'm checking if the job logs I can access are okay to post publicly (since this is a public repo)...

As a side note, if the logs are too big to display, it'd sure be nice if we returned a "the logs are too long" error rather than "unknown error"...

driskell commented 1 year ago

@jeffwidman

Looking at the logs, I see 50 update_not_possible errors, so that's probably why things are slow... Because each time it's finding a new version of a dep, then making all the calls to walk the indirect dep tree only to get to the end and realize something conflicts and prevents the update. Versus an up-to-date dep it skips walking the dep tree once it realizes it's up to date. Sometimes Dependabot can complete all 50 in time, sometimes not. But even if we raised the timeout, that wouldn't solve the underlying problem of non-resolvable updates.

Just to check - are you referring to Dependabot doing a walk or Composer? As I understood it, Dependabot runs Composer and Composer's Solver will decide if an update is possible or not. So the effort needed to improve this needs to be in Composer. It'll be worth someone taking the public repository and timing a composer update against individual packages to see what the slow down is.

markdorison commented 1 year ago

@driskell: Thanks for weighing in! You asked @jeffwidman, not me, but I will throw my .02 cents in here anyway. Feel free to take it with a grain of salt!

As I understood it, Dependabot runs Composer and Composer's Solver will decide if an update is possible or not. So the effort needed to improve this needs to be in Composer

Could this be a "yes and.."? I am following along on potential Composer performance improvements PRs (such as this one you created) and hoping something like that will make an impact. I want to note, though, that on the repos where we hit this issue, we are successfully using an alternative to Dependabot, which is working as expected. Obviously, I don't know how their approach might differ from Dependabot's.

Last but not least, any opportunity for a more descriptive error (as @jeffwidman noted), might give us a fighting chance to work around this issue in the meantime so I just wanted to give that a vigorous thumbs up! 😃

markdorison commented 1 year ago

It hit a bunch of update_not_possible errors, which typically indicates the dependency has an updated version available, but that other requirements in the manifest file (both direct and indirect deps) specify requirements that conflict with the newer version... So it's impossible to bump to the newer version. This can often happen if several deps have to be bumped in lock-step, as Dependabot isn't smart enough for most ecosystems to bump multiple things at once.

Looking at the logs, I see 50 update_not_possible errors, so that's probably why things are slow... Because each time it's finding a new version of a dep, then making all the calls to walk the indirect dep tree only to get to the end and realize something conflicts and prevents the update.

I've been thinking about this more, as I just discovered a third Drupal repo of ours that is now experiencing the timeout/error issue (not counting the public example).

When using a large project like Drupal, it seems nearly inevitable that there are going to be a large number of second-level dependencies that are update_not_possible. For example, the current release of Drupal has many Symfony dependencies (some of which are not the latest version of Symfony), which in turn have their own dependencies.

jeffwidman commented 1 year ago

@markdorison have you tried running the dry-run script locally within a docker container? That will probably provide you access to the logs that are getting hidden on cloud because they're too big...

markdorison commented 1 year ago

@jeffwidman Thanks for pointing me to the docs to run it locally; I had not tried that.

When running locally on two repos that have been unable to use Dependabot, it completed seemingly without issue:

🌍 Total requests made: '112'
🌍 Total requests made: '107'

When I check the UI for these repositories, one shows that Dependabot encountered an error, and the other (interestingly) does not, but when I click the "last checked" link on either of them to see the logs, both show me a Dependabot error.

CleanShot 2023-02-02 at 13 16 21@2x
jeffwidman commented 1 year ago

I suspect the one that didn't have an error on the run, but shows an error when trying to see the logs is just the error of too many logs to fetch.

For the other one that shows that Dependabot encountered an error, can you please try one more thing? This CLI tool includes the proxy we use in production, so more closely simulates production than the dry-run script: https://github.com/dependabot/cli/#dependabot-update

dry-run is normally enough, but occasionally we need proxy to figure out what's happening... and it's on my todo list to cleanup the docs on debugging, but we'll get there...

gapple commented 1 year ago

I no longer have access to the repository where I was having this issue, but dependabot provided full logs that I was able to scan through at one point.

One thing that was odd, is that when checking a package for available updates it would make multiple requests to packagist and the Drupal packages endpoint for the same paths - 4 if I recall correctly, and all but the first would return a 304 Not Modified response. This would also occur on packages that were on the latest version and had no available updates, even though it seems like it should be possible to determine that from the first response from the package repository and skip fetching additional metadata for other packages.

Perhaps to avoid timing out with no useful result, if dependabot could randomize the order of packages to be checked and limit the number on each run, then it would eventually check all packages even if it can't get to all of them in a single run.

markdorison commented 1 year ago

@jeffwidman After your last message, I remembered that I still have this public demo repo that Dependabot fails on. On that repo, the logs will load for me in the UI.

In any case, I went ahead and ran the CLI tool on that repo. I ran it twice. Both runs took approximately 3h42m. The first run ended with an error summary, but I am not sure where to identify the actual error itself. The second one was completed with no errors.

Run 1

INFO <job_cli> Finished job processing
INFO Results:
Dependabot encountered '1' error(s) during execution, please check the logs for more details.
time="2023-02-03T21:03:26Z" level=info msg="122446/124056 calls cached (98%)"

Run 2

INFO <job_cli> Finished job processing
time="2023-02-06T18:17:44Z" level=info msg="124969/126567 calls cached (98%)"

The full logs are huge (I understand why the UI is choking on them), but I managed to get then uploaded here.

markdorison commented 1 year ago

Optimize PoolBuilder to not load replaced targets if not required was committed to Composer within the last 24 hours. I am hoping this may have some effect on this issue. I am excited to test things again once this makes it into a Dependabot release!

driskell commented 1 year ago

For a simple project of Drupal 9.2.* and a single module, I put timings in https://github.com/composer/composer/pull/9619#issuecomment-1534607121 based on what the OP was. It shows a good reduction - going from about 7-8s to 3-4s. For 9.5.* with the same module it already is faster at around 5s but again reduces to 3s since even though the newer Drupal versions don't have replace anymore, the older versions are encountered and it still made the 404s before the patch.

So things should improve when the next composer version is released and dependabot released with it.

I did do some timing checks of Dependabot on a large project though and I'll be honest this is only going to be a marginal improvement for a large project. For a smaller project, the differences are more stark. Of course, it all comes down to latency on requests. With high request latency the 404s would be causing a big increase in timings.

Small project of Drupal 9.2.* with single module, running Dependabot. With composer 2.5.5:

bin/dry-run.rb composer driskell/dependabot-test --dep drupal/media_entity_twitter
real    0m23.939s
real    0m26.277s
real    0m25.132s

And after updating composer:

bin/dry-run.rb composer driskell/dependabot-test --dep drupal/media_entity_twitter
real    0m16.173s
real    0m14.871s
real    0m15.549s

For a very large project of my own with many modules running 9.5.* the improvement is much less. I only requested an check of a single package here (the same package the small project was using - drupal/media_entity_browser). With composer 2.5.5:

real    0m33.771s
real    0m32.313s
real    0m33.040s

And with newer composer:

real    0m30.794s
real    0m31.357s
real    0m31.009s

Caveat here is these are not super perfect tests since I am literally just running things a few times and taking the last few results... But it's something! I do want to dig into the timings at some point though, as I do know there's a fair few composer plugins in my large project that do not participate in the run and so some packages they virtually add to the composer install end up triggering removals and a bigger lock diff - but that's a separate issue...

kevinquillen commented 1 year ago

I landed here after seeing Dependabot quit running for all of my projects in March. I now get the timeout errors others have reported with Dependabot. Before, Dependabot ran without fail since 2020 using the same template in all projects.

Renovate has no issue and can do this in under two minutes - how can I get Dependabot up and running again?

kevinquillen commented 1 year ago

Is there anything that can be done here at all? Dependabot is dead now for any Drupal project and does not complete.

abdulapopoola commented 7 months ago

@kevinquillen ; so sorry about your experience. Would you be willing to try out the new actions program? That might help with the time outs

PapaGrande commented 6 months ago

so sorry about your experience. Would you be willing to try out the new actions program? That might help with the time outs

@abdulapopoola Can you clarify what to try out with links?

I've run into similar problems where Dependabot is too slow and times out with sites using many explicit composer dependencies.

abdulapopoola commented 6 months ago

@PapaGrande ; we have a beta program to switch the compute host Dependabot runs on. It's helped a few folks overcome their timeout issues. See this thread for example

drubb commented 4 months ago

@kevinquillen Had the same problems in a Drupal project. Turned out you need to add a PHP platform requirement to composer.json, otherwise the jobs will run on PHP 7.4 and Composer tries to find compatible packages for this version. So adding "php": "^8.1" to the requirements fixed it for me. Updates take still lots of time, but at least don't run into a timeout anymore.