dask / community

For general discussion and community planning. Discussion issues welcome.
19 stars 3 forks source link

Security Backport Release #244

Closed quasiben closed 1 year ago

quasiben commented 2 years ago

I'd like to find out if there is support for security backports for previous releases of Dask. A CVE/security issue has only come up once (as far as I know) but it has presented some problems for RAPIDS (perhaps others)

  1. During releases, RAPIDS will pin to a given version of dask/distributed version . Mostly this is done for stability and stress testing at the end of a release cycle.
  2. RAPIDS can't pin within a release in a month (YYYY.MM.*) because there no API guarantees between releases.

Because of our strict pinning we are currently in a position where a RAPIDS 21.08 will trigger when organizations run CVE scans across our docker images. RAPIDS releases are expensive and rather than re-test everything with a new non-CVE version of Dask/Distributed, we could instead release a backport fix for dask/distributed.

If there is general support for this (hopefully rare occurrence ) we could do releases with the following structure: YYYY.MM.I(.J) . Note that in [YYYY.MM.I(.J)] J is implied as 0 according to PEP 440

For the particular RAPIDS problem above this would be porting a security fix for 2021.07.1.1

cc @martindurant @jakirkham @jrbourbeau @jsignell

jakirkham commented 2 years ago

cc @mrocklin

jacobtomlinson commented 2 years ago

I'm generally in favour of this.

For information our current scheme is YYYY.M.I. The leading zero in the month was causing issues in some places and PyPI/Conda strip it off anyway.

I'm fine with YYYY.M.I(.J) as long as it is also valid SemVer (which I think it is) because some things like the helm chart require valid SemVer versions.


To raise an alternative we could consider making API guarantees between months. When we moved away from SemVer we went to the other extreme end where our version scheme has no semantic meaning at all. Perhaps this issue signals that some amount of semantic meaning is valuable. We could be more similar to how projects like Ubuntu do versioning where they have YY.MM.P and the P is only ever bug/security fixes. This effectively makes the YY.MM portion a major semantic.

This adds some maintenance cognitive overhead, but perhaps reduces some pain for downstream users who want to pin to things like YYYY.M.*. I expect this would provide value to many users, especially those in enterprise environments.

It would also be interesting to chat to folks who provide commercial support to Dask. They are likely the folks who would do the work to backport bug fixes and it could be a way to give them a mechanism to more easily PR a backport. I am keen to keep things as they are in terms of the core OSS maintainers generally not caring about backporting though.

jsignell commented 2 years ago

We do still technically tag with a leading zero in the month, but you are totally right that it often gets stripped out.

To raise an alternative we could consider making API guarantees between months.

I am trying to think what this would look like in practice. It seems like we would have to either be merging to separate branches depending on whether or not it's a breaking change, or I guess we could delay merging any breaking changes until the month rolls over. Either way the cognitive overhead seems to be:

1) correctly identify whether the PR is a breaking change or not 2) know the correct policy for merging

Having written that out it doesn't seems too onerous, but it is definitely a shift from just merge everything to main anytime and the pain to the person who is responsible for merging the branches is probably nontrivial.

@quasiben's original suggestion does not introduce any additional work to our regular cycle so I am personally more excited about it. The only time there is more work is when there is a security issue. And that already seems like a lot of work (@jcrist).

hayesgb commented 2 years ago

This adds some maintenance cognitive overhead, but perhaps reduces some pain for downstream users who want to pin to things like YYYY.M.*. I expect this would provide value to many users, especially those in enterprise environments.

It would also be interesting to chat to folks who provide commercial support to Dask. They are likely the folks who would do the work to backport bug fixes and it could be a way to give them a mechanism to more easily PR a backport. I am keen to keep things as they are in terms of the core OSS maintainers generally not caring about backporting though.

I like this suggestion. We've been discussing this same question for users in an enterprise setting. Having a defined way to deal with this scenario would be a big benefit, since it's almost certain to come up eventually.

jacobtomlinson commented 2 years ago

Having written that out it doesn't seems too onerous, but it is definitely a shift from just merge everything to main anytime and the pain to the person who is responsible for merging the branches is probably nontrivial.

Yeah seeing it written out definitely gives me doubts. It would be a change that adds friction to every action, but given this is a rare event that is a large cost. As you say @quasiben's suggestion to add a bug/security semantic via another level of versioning feels like the lowest effort even if it's a little unusual.

I wonder if a good process for handling this is to just create single purpose backport branches and tag those when we need them. I think it's likely that the request for a backport will come in externally to the core maintainer group, but a maintainer will need to be involved to make it happen. So documenting what we would expect the workflow to look like might be useful. Something like:

This way anyone from the community can come in a contribute a backport. The merge process from the maintainer end will be a little more complex, but doesn't add any friction to the regular merge workflow.

jsignell commented 2 years ago

The merge process from the maintainer end will be a little more complex, but doesn't add any friction to the regular merge workflow.

I really like the idea of not changing the regular workflow and your proposal of the division of labor seems like the right track to me.

I'll just point out that we can always start with this more minimal approach and if at a later date there is a clear signal that an API guarantee would be valuable, we can pick up this discussion at that point.

jacobtomlinson commented 2 years ago

I'll just point out that we can always start with this more minimal approach and if at a later date there is a clear signal that an API guarantee would be valuable, we can pick up this discussion at that point.

Given that version numbers come up from time to time in the community I wonder if we should aim to have some questions about it in the next survey.

quasiben commented 2 years ago

I like the idea of reserving main as a low friction place to merge changes in. Does a maintainer need to create, then delete the additional branch ? Is it too risky to evaluate the PR and tag directly from that work ? Perhaps you are assuming the security backport PR will have merge conflicts with the targeted release ?

jacobtomlinson commented 2 years ago

The PR will be a branch on a fork, not the main repo. So it needs to be merged somewhere before we are able to tag it.

quasiben commented 2 years ago

Can we try the outline proposed by @jacobtomlinson :

  1. I'll create a backport with a security fix targeting main (as default)
  2. Another maintainer will temporarily create a branch (in this case 2021.07.1.X ?)
  3. We merge the PR into the branch, tag (then robots build the pypi/conda releases)
  4. Delete the branch

Should we do additional messaging here as a 5th step ? A note in the changelog ?

jakirkham commented 2 years ago

^ @mrocklin any thoughts on the proposal above? πŸ™‚

jakirkham commented 2 years ago

Friendly nudge @mrocklin πŸ˜‰

mrocklin commented 2 years ago

No thoughts from me

On Wed, May 18, 2022 at 3:11 AM jakirkham @.***> wrote:

Friendly nudge @mrocklin https://github.com/mrocklin πŸ˜‰

β€” Reply to this email directly, view it on GitHub https://github.com/dask/community/issues/244#issuecomment-1129705530, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACKZTDO7SEURNOHD4LLZNTVKSQZVANCNFSM5VFZS5PQ . You are receiving this because you were mentioned.Message ID: @.***>

jacobtomlinson commented 2 years ago

In that case, perhaps @hayesgb could weigh in on the proposal from a Coiled perspective as you already engaged here previously?