etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0
47.89k stars 9.78k forks source link

Improve dependency management #18925

Open ivanvc opened 2 days ago

ivanvc commented 2 days ago

What would you like to be added?

The current dependency management process involves a lot of manual intervention. There have been some attempts at improving it, including updating the update_deps.sh script, trying out different dependabot configurations, and evaluating other tools besides dependabot.

I want to formalize the work and have a central place to discuss alternatives, pros, and cons.

Possible approaches

  1. Do nothing / keep dependency management as it is right now. As stated before, the current process requires a lot of manual intervention, and the toil is high. The time to create the weekly PR and merge it keeps sliding into later in the week. There are times in which by human error, the commits are not properly bumping the dependencies.
  2. Improve dependabot configuration (so far, I've tested the following options):
    1. Use the new directories property: The result of this is multiple pull requests for the same dependency across all of the submodules where it is present. Although we (@jmhbnz and I) thought this was going to be the silver bullet, in the end, it doesn't work as we expected. Enabling this will just add noise to our repository, as we still need to create a PR to bump dependencies manually. See the following example of go.opentelemetry.io/otel/trace in my fork: https://github.com/ivanvc/etcd/pull/410 https://github.com/ivanvc/etcd/pull/406 https://github.com/ivanvc/etcd/pull/390 https://github.com/ivanvc/etcd/pull/386.
    2. Use the directories property and groups. Although this looked promising, as it creates a single pull request with all the weekly dependencies, the issues that I find are: 1. it creates a single commit for the whole group (rather than one per dependency bumped), 2. it doesn't run go mod tidy, and it breaks our CI. See an example in my fork: https://github.com/ivanvc/etcd/pull/265 (in the files changed and any go.sum file, it's clear that it is not tidying them).
  3. Replace dependabot with renovatebot. I noticed that several projects on board for dependabot's directories beta feature moved to renovate. The immediate issue is that we would need a new GitHub application installed for our GitHub organization. This is not the silver bullet but may help decrease the effort. I tested some configurations in my fork:
    1. One pull request per dependency bump: This would be similar to what we have right now, but it bumps the dependency across all submodules (and tides them). This will be a minimal improvement, as we can merge and close these PRs without the manual step to generate a grouped PR. Refer to an example in my fork: https://github.com/ivanvc/etcd/pull/435
    2. Grouped pull request: This doesn't work as expected. It's similar to dependabot's approach (single commit for all bumps), although the community seems interested in having multiple commits (https://github.com/renovatebot/renovate/issues/7288). An example of this configuration in my fork: https://github.com/ivanvc/etcd/pull/442
  4. A wild idea would be to write our own automation, but I think it would be less than ideal as we'd need to maintain another project.

I lean towards 3i (renovate + one pull request per dependency), which would be one small step to improve the process. But I want to open the topic for discussion.

cc. @ArkaSaha30

Why is this needed?

To decrease the toil of keeping dependencies up to date.

jmhbnz commented 2 days ago

Thanks for all your work experimenting and writing up the findings @ivanvc!

I am in favor of moving away from needing to manually raise a pull request every week to bump dependencies to instead be able to merge automated pr's raised by renovate, so option 3.

This will be less toil for the community to manage, and one less bespoke project specific process for new contributors to learn.

ahrtr commented 2 days ago

I lean towards 3i (renovate + one pull request per dependency), which would be one small step to improve the process.

Questions:

ivanvc commented 1 day ago

Questions:

* Once we merge each PR, we also need to wait for other PRs to automatically rebase and green again? So we have to approve & merge the PRs one by one?

That's a good question. I think so, because they will be editing the same files. This may slow down the merging process of the whole week's dependencies. However, we can fine tune how often the bot opens the pull requests (i.e., you can specify how many per hour it should open).

* We still need to manually close PRs which bump pure indirect dependencies?

Yes, and you just made me realize that because of how we do the dependency management, we may not ever be able to do a single pull request (if they ever support multiple commits in a single one), because we would want to remove some dependencies that are purely indirect, but we want to keep the ones that are direct in some submodules but indirect in others.

jmhbnz commented 1 day ago

Yes, and you just made me realize that because of how we do the dependency management, we may not ever be able to do a single pull request (if they ever support multiple commits in a single one), because we would want to remove some dependencies that are purely indirect, but we want to keep the ones that are direct in some submodules but indirect in others.

One idea we could add all purely indirect dependencies to the renovate ignore list. Provided renovate would still bump them when a direct dependency requires it which I would think it would as it runs go mod tidy.

ahrtr commented 1 day ago

This may slow down the merging process of the whole week's dependencies. However, we can fine tune how often the bot opens the pull requests (i.e., you can specify how many per hour it should open).

OK. Since usually we just have several dependency bumping (i.e. around 3) each time, so it might be fine. Regarding the frequency, either Weekly or bi-weekly is OK to me.

One idea we could add all purely indirect dependencies to the renovate ignore list.

It will introduce extra maintenance effort, because the list may change over time.

ivanvc commented 1 day ago

It will introduce extra maintenance effort, because the list may change over time.

It may need some experimentation, but maybe through commands we could cherry pick some dependencies out of the grouped PR, following James' idea. I'll report back.