openssl / project

Tracking of project related issues

Review our CLA (CCLA and ICLA) requirements and policies #424

Open iamamoose opened 9 months ago

iamamoose commented 9 months ago

Apache does not require every submitted commit to have an associated ICLA. And where an ICLA is required, Apache places the onus on the contributor to deal with a CCLA if one is needed. This is quite different from what OpenSSL does right now. We've had a number of occasions where the CLA burden has been an issue.

This ticket is to look at what Apache and others using the Apache License require, capture where it's been an issue, and recommend if we should or can make changes that could lower the barrier to participation.

hlandau commented 9 months ago

So here's my personal opinion...

I have some small open source projects of my own. Sometimes I licence those under MIT, sometimes GPLv3+. While these projects are too small to really worry about licencing too much, I have considered the issue at times.

My basic view is that if you're licencing under something like MIT, I'm not too concerned about getting a CLA because it's unlikely you will ever need to relicence, and MIT is so permissive it may not even be an obstacle if you do. On the other hand, when I've considered making a project GPLv3+, I'm much more concerned about the risk I may have to change the licence later and have sometimes considered whether I should require some sort of flexibility from contributors here. One possibility I have considered at times, but never actually implemented, is to commit to my own projects under GPLv3+ but require external contributions to be MIT. I hesitated to actually do that since it feels a bit hypocritical, but the practical aim of the idea was to ensure relicencing was possible in the future if it ever became necessary without making people go through the pain of a CLA, so in practice not that different from a CLA in aims and purposes but less potentially frictional to the contribution process.

That's just something I've thought about at times.

Of course OpenSSL has had an experience of a painful relicencing process due to a lack of a CLA. I have my own perspective... the really major factor which informs my perspective here and my anxiety about the need to potentially relicence when using what I will call "complex" licences (which the GPL definitely is) is the Linux kernel.

I have my own personal opinions on the Linux kernel's licence. I am certain many will find them controversial and disagree with them, but nonetheless, it's necessary to explain my perspective on it: my basic view is that the Linux kernel's GPLv2 licence is routinely violated in a way that is tacitly sanctioned by the Linux project. I believe the reason for this is that the Linux project has essentially realised that there are usages that violate the GPL which are overwhelmingly in the interests of the project to allow. I believe that the legal reality of the GPL is different and that a 'vigorous and total' enforcement of the Linux kernel's licence would do damage to much of the Linux community and the project itself. I believe that the Linux project is basically aware of this and has chosen to fudge the issue by spreading FUD and trying to prevent people from spotting the legal elephant in the room by articulating "interesting" legal theories about the GPL.

I believe the reason the Linux project has done this is because they are essentially trapped between a rock and a hard place: they accepted contributions from innumerable people under GPLv2 (without the "or later" even), and now you have an ominous situation where you have a billion dollar software project which is tied up under a licence nobody has the power to change, even if a serious issue with it in relation to the project and its community is discovered. Faced with this dire situation in which billions of dollars of engineering effort might be tied up under a problematic licence nobody has the power to fix, the project, in my view, has taken the path of essentially obfuscating about what the GPL means and engaging in a kind of GPL revisionism and culture of tacitly understood selective underenforcement of the licence - or some parts of it, anyway.

This, to my mind, is a sign of how very dangerous it is to tie up a software project under a complex licence with no plan for changing it if it proves necessary. And in truth, the FSF sort of agrees — hence why they always included the "or later" mechanism, which is basically in real terms a backdoor in a licence for the FSF themselves, in case they find some dire issue that needs addressing. But Linus noticed that "backdoor" and removed it. I'm not sure I blame him for that, but it probably needed to be replaced with something else to allow relicencing in an emergency.

Ultimately, a "complex" licence like the GPL without the ability to relicence is like shipping a complex piece of software and committing to never changing it after shipping, ever — something few developers would willingly do and for good reason, yet for some reason we countenance the idea for licences.

So my basic view is this: there needs to be some way for a given project to relicence if it has chosen to go with a "complex" licence (like the GPL or MPL or CDDL — two of which are involved in the ZFS fiasco for instance). That usually means a CLA (not copyright assignment, just a CLA allowing relicencing.) On the other hand, I'm a lot less concerned by a lack of ability of a project to relicence if it is using a "simple" licence like MIT. In that situation, I'd lean much more heavily in favour of avoiding the frictional forces that a CLA poses — and they are real. I admit this situation is blurred by the fact that the previous, problematic OpenSSL licence was not actually a complex licence, and more a case of a simple but accidentally incompatible licence. But I would probably say that the legal understanding of the simple licences in the FOSS community has improved since then and such mistakes are now less likely to happen.

So, is the Apache 2 licence a "complex" licence? I would lean towards "no". Then again, it is the second version of the Apache licence — i.e., it did have to be fixed once.

Now as for my views as regards the OpenSSL project specifically, in terms of what we do now: My gut feeling is that, because we had this painful relicencing experience before, we are overly focused on making sure that doesn't happen again, and have imposed a CLA process as a result of this. My feeling is that this might be "organisational scar tissue": a way in which a bad thing in the past might be undermining our agility in a way that is not proportional to the actual risk of having to face another relicencing in future. It is human nature to overemphasise a painful experience which happened to you in the past and focus excessively on the risk of it happening again, to the exclusion of a more rational assessment of the real risks. Our present CLA policy obviously does cost us some contributions. I'd be open to a discussion of how we can reduce that "organisational scar tissue". I don't have a firm position at this time on what that solution might look like, or even whether we actually do want to make a change. But broadly, when a policy is introduced after one bad incident, there is always a risk that it turns out to be an overreaction that becomes "organisational scar tissue" and acts as a frictional force forever after on the organisation's execution of its mission. That's my gut feeling; we would need to discuss these ideas in relation to OpenSSL specifically for me to come to a concrete position. But I'm open and by default supportive of ideas of how we can address the ways in which our current CLA policy does and might preclude some contributions from some parties.

One thing I will say is that I agree with the principle that we should not be granting ad-hoc exceptions to the process because we see a feature we like. We need to decide and adopt a principled position and be consistent about it.

@iamamoose @t-j-h

iamamoose commented 9 months ago

My point of raising this is that we're using the Apache License (now) and our CLAs are from Apache, but as far as when we require CLAs and the process, we're not actually doing what Apache does. So if we're happy with Apache licenses and Apache CLAs and Apache lawyers who've looked at all that stuff, perhaps we should also be happy with Apache's interpretation of when a CLA is needed and when it isn't, and hence, this issue to have people review it.

There are a number of articles written about this, and this one is oft cited: https://apetro.ghost.io/apache-contributors-no-cla/. It links to various comments from board members of the ASF (and most importantly Roy Fielding, who was a major contributor to the original Apache License).

e.g. https://lists.apache.org/thread/0mytpqj7too29bj90yz65rggdv7gd35d and https://issues.apache.org/jira/browse/LEGAL-156?focusedCommentId=13554864

levitte commented 8 months ago

Oh interesting! So with Apache, the CLA is a repo write access requirement rather than a contribution requirement. Roughly speaking.

t-j-h commented 8 months ago

Reading through https://opensource.google/documentation/reference/cla and https://opensource.google/documentation/reference/cla/policy provides you with the Google Open Source Program Office view of the topic which covers a pile of ground that assists anyone attempting to educate themselves at a rather high level about this area.

t8m commented 8 months ago

One option would be to require either a CLA or an explicit statement that the code changes are contributed under an MIT-like license that allows any subsequent relicensing by the OpenSSL project.

But yeah, I 100% agree that if we do any change to the requirements, it should not be a one-off exception for some particular contribution.

Sashan commented 8 months ago

Speaking of CLAs: people who work at companies usually must ask for permission before signing one. It's usually granted, but it's yet another hurdle on the way to submitting a patch. I like the idea of an MIT-like (or BSD-like) license for inbound contributions. It makes clear that the code submission is covered by the MIT license and that OpenSSL accepts the code as such. This should make the situation easier for an individual employed at a big company. In that case the contributor asks the legal department for permission to contribute to OpenSSL, saying that all changes submitted will be covered by the MIT license. This is something a legal department understands and grants permission for. The CLA adds yet another hurdle, in my opinion.

t8m commented 8 months ago

Refer to #431 for another case where the strict CLA policy might be doing us more harm than expected.

mattcaswell commented 8 months ago

@hlandau - you seem to make the assumption that the only reason we have a CLA is to protect us in the case that we want to re-license in the future. I do not think that this is the case. While the whole re-licensing issue was certainly a trigger for us implementing the CLA policy in the first place, it is not the only reason for its existence (it's not even the main reason in my mind).

The CLA gives us:

Getting this protects both us and our users from any future claims that we are distributing code that we don't have rights to. There have been cases where code has ended up in an open source project that shouldn't have been there. I certainly recall a case where someone had picked up code from another project and submitted it back to us without attributing the original author or telling us or them about it. In that case someone later complained about it afterwards and we had to retrospectively apologise and sort it out (fortunately it wasn't actually the original author who complained but a third party, and the original author was ok about it). CLAs don't protect us completely from this (that can still happen) - but they go a long way towards it.

The fact that our CLA also gives us the ability to relicense in the future should we deem it appropriate is a side-benefit, but not actually all that relevant. In reality it would be very difficult for us to do so. When we did the relicensing effort previously we did not get everyone to sign a CLA who had previously submitted code. We only sought their permission to relicense to Apache v2. Getting everyone to sign a CLA would have been too difficult. At the same time we adopted the CLA policy for all future submissions so as not to "dig the hole any deeper".

t8m commented 8 months ago

The explicit grant of copyright and patent licenses could be done by other means: it could be explicitly associated with the source submission rather than made via a separate CLA document. The CCLA in particular, I assume, might sound to corporate lawyers like it gives too much wildcard power to the employees covered by it.

iamamoose commented 8 months ago

By agreeing to make a contribution under the Apache 2.0 license the contributor is making a grant of copyright and patent license. That's why it's worth exploring if this is enough for non-committers.

hlandau commented 8 months ago

@mattcaswell — thanks for the perspective, definitely of interest.

There does seem to be a wide variety in terms of what an open source project considers adequate demonstration of submitting something under the project licence. Some of these feel like superstition to me, others feel quite vague:

So yeah, the ambiguity in a lot of (often smaller) projects on GitHub has never really satisfied me. Something like DCO could work but needs to explicitly mention the licence the code is being offered under, IMO.

Having said that, at least as regards this concern, I'm with @iamamoose here; I'm not convinced a CLA is essential or necessarily proportionate so long as we have some kind of unambiguous statement from a contributor about the licence. I don't consider either of the above an ambiguous statement, but even just a comment in a thread saying "I hereby licence this under Apache 2" certainly feels adequate to me. Whether that includes a broader set of contributors than "people who can (and will bother to) sign our CLA" is a question to be asked.

hlandau commented 8 months ago

It seems like external/perl/Text-Template-1.56 would not be admissible under our current policy. That seems like a problem to me, both in terms of consistency but also since it means we can never update it to a newer version.

iamamoose commented 8 months ago

It could always be mentioned under the PR template for example, i.e. https://github.com/gradle/gradle/blob/master/.github/PULL_REQUEST_TEMPLATE.md

Apache doesn't require this for Apache projects (but it could also likely be more obvious they're under the Apache license)

iamamoose commented 8 months ago

note also, Red Hat legal wrote this about CLAs: https://www.redhat.com/en/blog/keep-calm-and-merge-on-lowering-barriers-to-open-source-contributions-with-apache-v2

iamamoose commented 8 months ago

Writing this more as a proposal, how about something like

And the aim of this change is to lower the barrier to participation, creating a welcoming community where casual contributions are welcomed. (Whilst it also reduces the project overhead of dealing with CLAs, that's not a driving factor in this.)

levitte commented 8 months ago

What about licenses that are compatible with AL v2?

It's not unheard of to have some set of files licensed with another license getting included into projects with their license preserved, when that license is deemed suitably compatible with the project license. Specifically, external/perl/Text-Template-1.56 was included under that sort of condition (research I did indicated that the Artistic License was compatible enough for our purposes, and an email exchange with the author confirmed that this was OK).

levitte commented 8 months ago

Regarding the license that the DCO refers to, they mean the license of the files that are being contributed. This assumes that each file contains the usual license boilerplate.

This entails, of course, that license compatibility becomes an important question for projects that use DCO for provenance tracking. In other words, if a new file is contributed, it may of course be rejected for license incompatibility.
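As a rough illustration of the two things a DCO-style check would look at, here is a minimal Python sketch. It assumes the usual `Signed-off-by: Name <email>` trailer in the commit message; the `LICENSE_MARKER` string and both function names are hypothetical, chosen only for this example, and a real project would match its own header text.

```python
import re

# Trailer format used by the DCO: "Signed-off-by: Name <email>".
SIGNOFF_RE = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)

# Hypothetical marker; a real project would match its own boilerplate.
LICENSE_MARKER = "Licensed under the Apache License 2.0"


def has_signoff(commit_message: str) -> bool:
    """Return True if the commit message carries a sign-off trailer line."""
    return SIGNOFF_RE.search(commit_message) is not None


def has_license_header(file_text: str, header_lines: int = 20) -> bool:
    """Return True if the marker appears within the first few lines of a file."""
    head = "\n".join(file_text.splitlines()[:header_lines])
    return LICENSE_MARKER in head
```

A new file that fails `has_license_header` is exactly the case where the license-compatibility question has to be answered by a human reviewer.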

levitte commented 8 months ago

> It could always be mentioned under the PR template for example, i.e. https://github.com/gradle/gradle/blob/master/.github/PULL_REQUEST_TEMPLATE.md
>
> Apache doesn't require this for Apache projects (but it could also likely be more obvious they're under the Apache license)

More or less all projects have a README file, and quite often a CONTRIBUTING file (which is hopefully referred to from README). Those are the usual go-tos I keep track of.

mattcaswell commented 8 months ago

> It could always be mentioned under the PR template for example, i.e. https://github.com/gradle/gradle/blob/master/.github/PULL_REQUEST_TEMPLATE.md

We could require a standard form of words to be written by the author into every pull request: "This submission is made under the terms of the Apache v2 licence", or, "I acknowledge and agree to OpenSSL's submission requirements given at ...some web address...", or something similar. The standard form of words could be part of the template. Alternatively people could still submit a CLA and not have to do that.
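A check for such a standard form of words could be sketched roughly as follows. This is only an illustration: the `STATEMENT` text is copied from the first suggestion above, and the function names are hypothetical. It does not call any GitHub API; it just takes the pull request description as a string.

```python
import re

# Hypothetical wording taken from the suggestion above; the final text
# would be whatever the project adopts.
STATEMENT = "This submission is made under the terms of the Apache v2 licence"


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def has_licence_statement(pr_body: str) -> bool:
    """Return True if the PR description contains the standard statement."""
    return normalise(STATEMENT) in normalise(pr_body)
```

Note that if the template pre-fills the statement, this check passes without the author typing anything, so whether the words must be written by hand is a policy choice the sketch does not settle.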

levitte commented 8 months ago

> It could always be mentioned under the PR template for example, i.e. https://github.com/gradle/gradle/blob/master/.github/PULL_REQUEST_TEMPLATE.md
>
> We could require a standard form of words to be written by the author into every pull request: "This submission is made under the terms of the Apache v2 licence", or, "I acknowledge and agree to OpenSSL's submission requirements given at ...some web address...", or something similar. ...

This is essentially what the DCO does, except that it refers to the license found in the files being contributed (which should usually be the project license).

hlandau commented 8 months ago

I'll spare you all my insanely pedantic Git hook script I wrote at one point which started automatically adding the SHA256 hash of a licence file to every commit I made: https://github.com/hlandau/acmetool/commit/f435a604ec5ad14ab7907c3ad24c2c676ba585d5

That was, uhm, a bit crazy. :rofl:
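For readers curious what such a hook might look like, here is a minimal Python sketch of the general idea. It is not the script from the linked commit. It assumes it is installed as a Git `commit-msg` hook (which receives the path of the commit message file as its first argument), that the licence lives in a file named `LICENSE`, and that a made-up `Licence-SHA256` trailer is acceptable.

```python
import hashlib
import sys
from pathlib import Path

TRAILER = "Licence-SHA256"  # hypothetical trailer name


def licence_digest(licence_path: Path) -> str:
    """Return the hex SHA-256 digest of the licence file's bytes."""
    return hashlib.sha256(licence_path.read_bytes()).hexdigest()


def add_trailer(message: str, digest: str) -> str:
    """Append the digest as a final paragraph unless it is already present."""
    line = f"{TRAILER}: {digest}"
    if line in message.splitlines():
        return message
    return message.rstrip("\n") + "\n\n" + line + "\n"


def main(argv: list[str]) -> int:
    # Git passes the path of the commit message file as the first argument.
    msg_path = Path(argv[1])
    digest = licence_digest(Path("LICENSE"))  # assumed file name
    msg_path.write_text(add_trailer(msg_path.read_text(), digest))
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

The sketch always starts a new paragraph, so it would separate the hash from any existing trailers such as `Signed-off-by`; a more careful version would merge it into the existing trailer block.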

@iamamoose's proposal sounds like a good start.

IMO we should collect a list of CLA issues where our policy has prevented us from doing things and battle test any new proposal based on how well it would improve things, using that list as a benchmark.

beldmit commented 8 months ago

I remember the Eckiila case. Also, my contribution that inherited code from the IDNA RFC was rejected despite the claim "This code can be used in any way". My implementation was buggy and caused a HIGH CVE.

Also, I remember some cases when people dropped their PRs because they didn't want to sign a CLA.