Closed ryao closed 5 years ago
https://github.com/zfsonlinux/zfs/pull/8305
ryao commented 2 days ago: I wrote a patch last year that would have allowed us to swap out the AVL tree code for Linux’s RB-Tree code for debugging purposes, although I never posted it because it turned out to be unnecessary for what I was debugging. The patch made switching between the AVL tree code and the Linux kernel’s RB-Tree code a configure script toggle.
Is this code available for review?
@stephenmclan I should still have it. I’ll find it and put it into a git branch after lunch.
Does Linux accept contributions that require patches to be dual licensed?
@scineram When I spoke to Linus in 2014, he was not against merging a module that is under the CDDL, but the main issue blocking that was the lack of a sign-off. Unless he has changed his mind, there should be no problem with getting dual licensed code into mainline. The mainline kernel already has code under a dual BSD/GPL license. The firmware blobs in the mainline tree are not under the GPL either.
The idea of going with a dual license is an attempt to make peace with some of the more militant individuals at mainline. Open source has always been able to find ways to cooperate. Even if those guys are the ones antagonizing us, their hostility would cause problems at mainline, so we ought to try doing something about it through compromise (going for a dual license). My experiences with Linus in the past have been nothing but positive. The last thing I want to do is to ask him to help us establish a path to mainline inclusion without doing something to clear up the issues that would cause him headaches if he does.
@stephenmclan It needs to be rebased, but you can find it here:
https://github.com/ryao/spl/tree/avl
It worked on the older version. There was an accompanying patch to the ZFS repository (as it predates the merge), but I cannot find it at the moment. That patch was fairly trivial though.
I will rebase this on HEAD if people see value in it, although I don't see much need unless we are going to go forward with this. The ability to switch to mainline's rb tree implementation turned out to not be very useful for debugging purposes.
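A rough illustration of the kind of build-time toggle described above, not the patch from the linked branch: callers use one small interface, and a preprocessor macro selects which implementation sits behind it. The macro `USE_ALT_BACKEND`, the `tree_*` names and both stand-in backends (sorted and unsorted arrays rather than real AVL or red-black trees) are assumptions made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TREE_CAP 64

/* Both stand-in backends share this storage; only the algorithms differ. */
typedef struct {
    int keys[TREE_CAP];
    size_t count;
} tree_t;

#ifdef USE_ALT_BACKEND /* hypothetical switch, e.g. set by a configure option */

/* Stand-in backend B: unsorted array with linear search. */
static bool impl_find(const tree_t *t, int key)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->keys[i] == key)
            return true;
    return false;
}

static bool impl_add(tree_t *t, int key)
{
    if (t->count == TREE_CAP || impl_find(t, key))
        return false;
    t->keys[t->count++] = key;
    return true;
}

#else

/* Stand-in backend A: sorted array with binary search. */
static size_t impl_lower_bound(const tree_t *t, int key)
{
    size_t lo = 0, hi = t->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

static bool impl_find(const tree_t *t, int key)
{
    size_t i = impl_lower_bound(t, key);
    return i < t->count && t->keys[i] == key;
}

static bool impl_add(tree_t *t, int key)
{
    size_t i = impl_lower_bound(t, key);
    if (t->count == TREE_CAP || (i < t->count && t->keys[i] == key))
        return false;
    /* Shift the tail up by one slot to keep the array sorted. */
    memmove(&t->keys[i + 1], &t->keys[i], (t->count - i) * sizeof(int));
    t->keys[i] = key;
    t->count++;
    return true;
}

#endif

/* The only functions callers use; swapping backends is a rebuild, not a code change. */
static inline void tree_init(tree_t *t) { t->count = 0; }
static inline bool tree_add(tree_t *t, int key) { return impl_add(t, key); }
static inline bool tree_find(const tree_t *t, int key) { return impl_find(t, key); }
```

Building with `-DUSE_ALT_BACKEND` swaps the implementation while every caller of `tree_add` and `tree_find` stays unchanged, which is what makes such a toggle usable for debugging.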
@stephenmclan I found the corresponding patch to ZFS to use the SPL patch:
Even if those guys are the ones antagonizing us, their hostility would cause problems at mainline, so we ought to try doing something about it through compromise (going for a dual license).
Does this statement take into account this LKML thread https://lore.kernel.org/lkml/20190110182413.GA6932@kroah.com/?
@ryao thank you for the code
@stephenmclan I was talking more about another person. I consider Greg and me to be on good terms. He is basically Linux's number 1 fan and it is hard for him to like anything that is not in mainline, but at the very least, he has never been hostile to me. We have had productive discussions on multiple occasions in the past. I would expect Greg to be agreeable to my proposal, but I decided to email Linus about it first. If Linus decides against it, there is no point in talking to others.
Also, this is contrary to my usual policy of not posting about WIP things until they are ready. However, I feel that a discussion like this should be open to all community members, not just the copyright holders. Also, the decision to do a partial rewrite and change to a dual license is not one that I can see being made quickly. I am on-board with this because it was my idea, but I understand that others might need time to deliberate.
That being said, I think we have plenty to gain from making an agreement with mainline Linux. In particular, the resistance to OpenZFS adoption by some members of the Linux community is that OpenZFS is not in mainline. At least, that is the objection of one particularly influential Linux vendor. If we get into mainline, that goes away and OpenZFS will be in a better position to move forward. Furthermore, mainline would benefit from having a much better filesystem in tree. It would improve Linux's ability to compete with proprietary storage. Overall, I think the result would be a win for everyone.
My only questions are: what happens when someone wants to improve ZFS but only under the GPL? Are they told no patches will be accepted unless they are dual licensed? Also, what patents are used by ZFS that Oracle could use against users even with a rewrite… remember those cover an idea/method, not code, and the CDDL protects against that.
@beren12 We say no. If we switch to a dual-license for kernel code, then any code contributions to kernel code that are not dual licensed would be rejected. As for patents, this is why we need to talk to lawyers. I believe that we should be fine patent wise on Linux because Oracle is part of the Open Invention Network.
https://www.openinventionnetwork.com/joining-oin/oin-license-agreement/ https://www.openinventionnetwork.com/community-of-licensees/
My suggestion that we maintain some Oracle-owned CDDL kernel code is to protect other OpenZFS community members such as FreeBSD. They are sadly not protected by the Open Invention Network, which despite its name, only cares about preventing software patents from being used against Linux at this time.
Also, it seems like Oracle exercised a “limitation election”. My guess is that their patents after a certain date are not included in OIN, although it is not clear to me if patents after Oracle killed OpenSolaris are included under the CDDL either. We probably need an attorney to explain what that means and look through all of the patent stuff. We ought to see what Linus says before getting attorneys involved though.
I am not a lawyer, but I've been involved in this discussion for many years.
I think there are several serious difficulties in what you propose:
I think a better long-term strategy would be to convince Linus, Greg, and other key kernel people that the current ZoL code is a legitimate open source fork of the "bad old" Oracle code, and that we should work together (or at least not be antagonistic) even if ZoL can't be included in mainline, much as we would all prefer that. This is a relatively minor effort (i.e. not making core functionality EXPORT_SYMBOL_GPL()). ZoL is not a closed-source binary module, does not use shim code to facilitate one (e.g. NVidia, GPFS), and does not use binary firmware blobs of the kind that already exist and seem acceptable (e.g. network/video drivers, CPUs, etc.), so there is no good reason why they need to treat ZoL in the same manner or worse.
@adilger here are my thoughts on your points:
In any case, the status quo is problematic. We need to try to get along with the antagonistic elements at mainline better. If you are willing to talk to Greg, I am willing to talk to Linus. Talking to Linus about out of tree code is much easier than talking to Greg about it. Besides, I have already emailed Linus. ;)
For what it is worth, I have already sent an email to an attorney asking for his thoughts on this. I am waiting for a reply.
@kpande I am aware. I wanted to keep that statement simple because we are not at the point of contacting anyone yet. I still need to hear from Linus.
Another thing to consider. It would not hurt for those of us with contacts at Oracle to ask them to arrange to provide sign-off and relicensing under a dual license. If we replace their code, we will waste development resources that could go to improving it. Enough time has passed since they acquired Sun that it is possible that their priorities have changed. If they are cooperative, making peace with mainline would become very easy. We would just need the consent of everyone else, plus some relatively minor changes to the code compared to ripping out the Oracle code. I suspect that a rewrite is easier than getting them to cooperate though.
Having worked at Oracle in the past, my expectation is that Larry Ellison will not give anything away without a great deal of money involved, even if it would benefit Oracle in the end (e.g. they could use ZFS for OEL).
I think the best approach would be to convince the key folks on the Linux side (Greg, Linus, Christopher) that ZoL and the whole OpenZFS community is a fork of the last open-source ZFS code before Oracle closed off Solaris again, and has nothing to do with Oracle (and vice versa). Their misguided idea of "sticking it to ZoL" because Sun put ZFS under CDDL is in no way hurting Oracle, just regular folks who are using Linux and ZFS.
I don't think my talking with Greg would hold much weight. I don't know if you noticed, but he booted Lustre out of the staging tree, so I don't think he holds much regard for the work I've been doing for many years.
adilger
including a GPL-licensed version of ZFS into Linux may cut off FreeBSD and Illumos from any future development/fixes that are submitted directly to the kernel if they are only available under GPL.
This problem is a false problem. You only have to look at how DRM drivers deal with this very same problem. https://www.kernel.org/doc/linux/MAINTAINERS The Linux mainline maintainer of a driver sets the license of acceptable patches. So if ZoL goes into mainline with a maintainer from the ZoL project, all patches to it can be required to be under whatever dual license is needed.
So if you want to modify the Intel/AMD drivers in the Linux kernel and don't like the MIT license that is used for FreeBSD and other platform support, your patch is rejected from inclusion in mainline Linux.
Linus refers to the maintainer problem with Linux as herding cats. Each maintainer is allowed their own policies.
I think the best approach would be to convince the key folks on the Linux side (Greg, Linus, Christopher)
Getting into mainline Linux is only half the battle. Staying there requires distributions to be happy to enable it and assist in maintenance. So Greg and Linus can say yes, and if distribution maintainers say no, ZoL will not be going in, or might go into staging and then be removed later on.
ZoL is not a closed-source binary module, does not use shim code to facilitate one (e.g. NVidia, GPFS), and does not use binary firmware blobs of the kind that already exist and seem acceptable (e.g. network/video drivers, CPUs, etc.), so there is no good reason why they need to treat ZoL in the same manner or worse.
Binary firmware blobs are a totally different problem: they do not execute in kernel space, so they are not nightmares for the scheduler. Binary firmware is not stored in the Linux kernel mainline tree; it is now split out into https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git. Closed-source firmware is tolerated, which is different from acceptable. Remember that the Linux-libre group would love to see no closed firmware blobs, and that these firmwares sit behind a wrapper/loader these days.
ZoL, like it or not, being under a CDDL-only license is in no different a position than Nvidia/GPFS, because its license is incompatible. This means you need to apply for GPL-flagged symbols to be changed. Newly added features will either be __-prefixed symbols, which are for internal Linux kernel usage only, or GPL symbols; either way, without putting in a request to LKML you should not be using them.
The fact that you don't have a wrapper/shim is part of the cause of some of the extra duplication.
In reality there is no real difference in issues and handling between something like ZoL that is open source and something like the Nvidia binary blob with a wrapper; both are cases of an incompatible license, so they require performing the same set of procedures.
Really, stop thinking that you can just get Linus, Greg or other lead developers to agree. What is needed is for you to learn the processes you need to follow. Either work out how to become Linux kernel license compatible in order to be merged into mainline, or bite the bullet and accept that you need to make a shim/wrapper and/or negotiate for particular symbols to be changed from GPL exports to normal exports based on need, not want.
Lose the idea that being open source is the important point. The most important point is legal license compatibility by one means or another; without this you will always run into trouble.
I was going to lock this thread to contributors only but it appears some people don't have the permissions, so I'm going to unlock it again. I would like to ask that if you are not actively contributing to the project, to please use IRC or the mailing lists, as armchair commentary only clogs up github. Thank you.
While I agree that the discussion will be most productive if restricted to the community members that are actively (or semi-actively as I admit my activity lately has been low) involved in the development of OpenZFS, I think that if there is a viewpoint that has not been voiced, then it is a good thing for it to be said for consideration. However, I ask that people who are not directly involved with development have discussions in places like IRC about whether there is more to add before posting here to minimize clutter.
Anyway, I am still waiting on a reply from Linus. It might be that he is not interested in having that discussion right now. In that case, it would be up to him when we could begin discussion and this will stall until then. There is no way that we are going to do any of this unless Linus is willing to give us a path to mainline to justify both the legal review and technical work needed to achieve that.
If it turns out that the plan fails legal review (as Andreas likely believes it will), then at the very least, we will have tried, and hopefully, the establishment of a willingness to work together would achieve the outcome that Andreas suggested to be realistic. In any case, I think we should try to engage in dialogue. Having certain individuals reply with hostility every time ZFS is mentioned on the LKML under the status quo is ridiculous. Given the CoC changes at mainline, this ought to be a good time to try resolving that.
Strategy question: Can someone build a business case for going into the mainline? Personally, I see no advantage, but lots of disadvantages.
Tactical question: There is an effort underway in many open source projects to try and prevent companies, like Amazon, from monetizing products built on their projects. These will fail, but the collateral damage is an invasion of the license-Nazis. Therefore, to inoculate this project, it may be necessary to reduce exposure to Linux kernel dependencies. This is also inefficient and arguably poor code management, but it prevents problems that occur when others change their interfaces.
@ryao Personally, I would love to see OpenZFS move to a dual/multiple license model for new contributions, and try to relicense old non-Oracle contributions. Obviously, GPL-2 compatibility is critical, but the license doesn't have to be specifically CDDL/GPL-2. CDDL/BSD might be better for FreeBSD, for example, and would still be GPL-2 compatible (presuming it's a BSD license w/o advertising clause, of course). Aside from the work to find everyone, there isn't a big downside, and it leaves things open in the event that either a re-implementation or Oracle relicense happens. The sooner this happens, the better, as that reduces the amount of relicensing.
richardelling "Can someone build a business case for going into the mainline? Personally, I see no advantage, but lots of disadvantages."
https://01.org/lkp/documentation/0-day-test-service You pick up things like this by being in mainline. In mainline Linux you get broader automated testing of your code by multiple different parties looking for coding defects; a lot of those parties have zero current involvement with the ZoL project.
You will also pick up early notification when you are affected by an API breakage, and simpler fixing.
http://coccinelle.lip6.fr/ Because mainline Linux uses Coccinelle, API changes are not a major pain for mainline driver developers with GPLv2-compatible licenses. When an API changes from one name to another with the same functionality, even if the arguments are different, an SmPL (Semantic Patch Language) patch can do the transformation.
Yes, this simpler fixing will apply if you can properly address the license issue, even if you never make it to mainline.
"Therefore, to inoculate this project, it may be necessary to reduce exposure to Linux kernel dependencies." This is an idea that is not going to fly long term unless you are willing to go to user space. It is really easy to overlook that the mainline Linux kernel uses Coccinelle, and this is what allows them to repeatedly rewrite large sections of the internal Linux kernel API without causing the mainline kernel developers much workload.
It is also simple to attempt to ignore what happened with __kernel_fpu_begin becoming kernel_fpu_begin. This was no problem for modules with a GPL-compatible license but a nightmare for those with an incompatible one. Here is the catch: those two are not absolutely identical. The new kernel_fpu_begin has locking where __kernel_fpu_begin does not.
"This is also inefficient and arguably poor code management, but it prevents problems that occur when others change their interfaces." If you don't know about interface changes in the Linux kernel, you could have kept using the old __kernel_fpu_begin and __kernel_fpu_end, only to have the CPU scheduler run some other task in the middle of your operation, ruining the state you were depending on and causing your driver to come crashing down at random on your poor users. So in that case you need to be thankful that it failed to build.
The reality here is that you cannot operate in kernel space without the kernel schedulers knowing what the heck you are up to, and without knowing how their behaviours have changed. The Linux kernel schedulers are GPLv2 code.
Nvidia had this same logic; remember the Linux kernel changing from 8kb pages to 4kb pages. There is an almost unlimited number of ways the mainline Linux kernel can reach out in kernel space and break your driver without changing the kernel API, just through scheduler alterations. The idea that you can isolate yourself is the big mistake. Yes, this Nvidia case was in 1998, two decades ago. Linux mainline people have responded the same way for over two decades: if you are not GPLv2 compatible, their care about notifying you of breakage drops. Being in mainline makes sure you get notice early on in the automated build process.
The Linux kernel requiring kernel drivers to be GPLv2 compatible to use all kernel features is not a new thing; it has been that way for over two decades. This is not license-Nazis. This is legal reality.
ZoL staying with the CDDL really has just been an ongoing ignoring of the legal reality, and this does lead to hostility on the mailing lists.
ryao "Having certain individuals reply with hostility every time ZFS is mentioned on the LKML under the status quo is ridiculous. " Coming back to the LKML with the same license defect and no signs of addressing it is also a ridiculous status quo on the ZoL side. It takes two to tango.
Please note I am not saying that ZoL has to be GPLv2, just some license that is GPLv2 compatible on the Linux side. I don't know whether having a license better matched to FreeBSD would also get you FreeBSD automated testing and notification of changes; it could be the case there as well.
There is a high risk of the CDDL holding the development of ZFS drivers back. Without question, not being in mainline Linux is causing ZoL not to get as much automated code testing, resulting in late notification of upcoming Linux kernel changes. I am not a FreeBSD person, so I cannot say for sure, but this problem could be happening in FreeBSD as well.
Working around license incompatibility causes a lot of wasted resources and mistakes.
@richardelling A precise business case is hard to make. Here is what going into mainline gives us:
I think that the network effect would give us synergy, but it is hard to say exactly what that would be. No one imagined ZFS would be ported to so many platforms or so widely used when development just started and similarly, I do not think I can quantify the long term benefits of this. I suspect that it should be enough to pay for the cost of trying to compromise, provided that mainline Linux is willing to collaborate.
Coming back to the LKML with the same license defect and no signs of addressing it is also a ridiculous status quo on the ZoL side. It takes two to tango.
@oiaohm The only actual issue is the intolerance by a few people of other OSS code. They treat Nvidia better than they treat us and Nvidia’s blob is not even OSS. I do not subscribe to the idea that a kernel module that is not a derived work of Linux (having come from OpenSolaris) triggers the GPL’s derived works clause. I also do not subscribe to the idea that you become a derived work by implementing a bunch of compatibility shims against documented public driver APIs, even if those APIs are unstable. Multiple lawyers have confirmed this to me and I am going to listen to the ones whom I have already consulted on that.
A rewrite of a mature codebase just to satisfy a few people’s preferences in OSS licenses is an insane thing to do. It is not clear if it is even possible without a ton of legal review. Also, a rewrite even if done over a long period is still rather disruptive to development. I am only willing to suggest such a thing if people at mainline were willing to put past attitudes behind us and start collaborating. If my offer to try evaluating this route is rejected, then honestly, there would be nothing that we can do to please people at mainline and I will be sure to cite that whenever someone claims these sorts of things are somehow our fault and not mainline’s.
That being said, Linus does not appear to have replied to me. If I cannot get a dialogue started with mainline on this, I am going to assume that I did everything I could and drop the idea. I am not going to advocate for a self inflicted wound to enable us to dual license with what is a few people’s preferred license if mainline is not going to commit to do anything in exchange.
The reality here is that you cannot operate in kernel space without the kernel schedulers knowing what the heck you are up to, and without knowing how their behaviours have changed. The Linux kernel schedulers are GPLv2 code.
@oiaohm As long as we have a mechanism to disable preemption (and possibly interrupts), how the scheduler works is irrelevant. We can reimplement this functionality ourselves without the symbols. The symbols in question are really just wrappers for ISA instructions for saving and restoring registers.
The Linux kernel requiring kernel drivers to be GPLv2 compatible to use all kernel features is not a new thing; it has been that way for over two decades. This is not license-Nazis. This is legal reality.
@oiaohm It is not reality according to attorneys with whom I have spoken. It is the case that some people want it to be that way, but thankfully, I am told that copyright law does not work the way that they claim. I am willing to talk to mainline to see if we can compromise and I am willing to discuss here the actual issues involving possible compromises (e.g. technical challenges, patent grants). I am not willing to reject what my own attorney has told me regarding how copyright law affects OSS. This desire to compromise is solely the result of my wish to see the larger OSS community collaborate constructively, and has nothing to do with some “legal reality”.
@rlaager I suspect that the Illumos community would find the GPL (or even the LGPL) more acceptable than a BSD license. When Illumos was started, they stated that they do not want Oracle to be able to use their code without Oracle giving their changes back.
Tactical question: There is an effort underway in many open source projects to try and prevent companies, like Amazon, from monetizing products built on their projects. These will fail, but the collateral damage is an invasion of the license-Nazis. Therefore, to inoculate this project, it may be necessary to reduce exposure to Linux kernel dependencies. This is also inefficient and arguably poor code management, but it prevents problems that occur when others change their interfaces.
@richardelling I am not sure what the question is here. We are certainly approaching a reduction in API usage to the absolute minimum possible over time given all of this breakage.
I guess the renewed hostility at upstream is a result of this. In the case of companies like Amazon, the software typically runs on their infrastructure such that the GPL is effectively a BSD license. You would need the AGPL to do something about that, although I have no problems with companies such as Amazon adopting ZFS internally. Quite frankly, the purpose of making code open source is so that it is easier for others to adopt it. Trying to prevent that seems counterproductive.
Also, trying to keep certain companies from using OSS would be a violation of the open source definition’s clauses against discrimination:
I suspect that the Illumos community would find the GPL (or even the LGPL) more acceptable than a BSD license.
I think this entire thread is, frankly, not a good idea -- but I'd like to note that there is no evidence to support this particular claim. We're generally quite happy with the CDDL, and as a rule aren't looking to increase our GPL footprint at all.
@jclulow The issue was whether Illumos would prefer CDDL/GPL or CDDL/BSD in a dual license scenario, not CDDL vs GPL.
Perhaps I was not clear.
well, fwiw, I've only contributed one patch to ZoL, but re-license it as you please.
@jclulow I was not talking about GPL vs CDDL vs BSD license. I was talking about CDDL/GPL vs BSD/CDDL. I just wrote BSD vs GPL in the context of them being the second option in the dual license, with the CDDL being the first, but I can see how that was unclear. To be more clear, I am proposing that we go to a dual license of the existing CDDL and a second license if the following conditions are reached:
There are existing projects such as Netbeans, Firefox and OSSv4 that give users a choice of open source licenses. The idea here is to investigate the possibility of becoming one if it would make peace with mainline Linux. If it were both possible and happened, then the Illumos community would be able to maintain the status quo where it only has to worry about the CDDL. The only real difference is that contributions back to the codebase would need to be dual licensed (such that others may choose the alternative license).
From an end user perspective, a dual license ought to be treated as being identical to the one of the two that they prefer. However, I can see that some people don't view it that way. Perhaps the CDDL/LGPL would be better than the CDDL/GPL. I was under the impression that the Illumos community wanted to force those distributing binaries to provide sources because of what happened when the OpenSolaris project was discontinued. Am I wrong to think that?
That being said, none of this (including the legal review) is going to happen unless mainline Linux is willing to talk and explore possible ways of enabling constructive collaboration. I just put it out there prematurely because I felt that such an effort ought to be public such that if anyone were strongly opposed, it could be dropped before too much work went into it (provided mainline Linux were willing to collaborate). Making new friends is great, but the existing community comes first.
@ryao what is not clear to me is on what grounds a shim for GPL-only functions is opposed by the FSF, especially when such a shim was proposed as a reasonable approach in the past: https://lwn.net/Articles/154602/
Can we reconsider this position with mainline before going the full-rewrite route for such functions?
Thanks.
@shodanshok We are talking about a partial rewrite if mainline agrees to cooperate and it passes legal review among other things. Those functions would be useful to reduce code bloat, but the reason for my attempts to negotiate a compromise is to try to end hostilities so that we can work together, not to get access to GPL exported functions. It has been a pattern of behavior at the LKML for years and quite frankly, I am tired of it. What the FSF thinks about using GPL only symbols is irrelevant to that.
Having access to those symbols does not hurt, but it does not help much. You can build ZFS into your kernel (part of vmlinux, not as a module) to see what having access to those symbols is like. There is no measurable improvement. Right now, the only real advantage is that operations on .zfs directories are slightly more efficient.
There is the regression with Linux 5.0’s SIMD acceleration, but that just takes away a wrapper around a couple of CPU instructions. We do not need to touch GPL exported symbols at all to use those machine instructions on any Linux kernel. It will be fixed as soon as one of us spends the 3 hours needed to write a patch making our own wrapper. If we are clever, we could omit unnecessary registers from save/restore, which would be faster than what mainline’s wrapper does.
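As a rough illustration of the "wrapper around a couple of CPU instructions" idea, and not the patch that was eventually written, the sketch below wraps the x86 `fxsave`/`fxrstor` instructions in two small C functions. The names `fpu_area_t`, `fpu_state_save` and `fpu_state_restore` are invented here. It compiles in user space with GCC or Clang; inside a kernel module the calls would additionally have to run with preemption disabled, as noted earlier in the thread, and wider register sets such as AVX would need `xsave`-style instructions instead.

```c
#include <stdint.h>

/* FXSAVE/FXRSTOR operate on a 512-byte area that must be 16-byte aligned. */
typedef struct {
    uint8_t bytes[512];
} __attribute__((aligned(16))) fpu_area_t;

/* Save x87/SSE state into *area. Returns 0 on success, -1 if unsupported. */
static inline int fpu_state_save(fpu_area_t *area)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Assumption: in a kernel module, preemption is disabled before this. */
    __asm__ __volatile__("fxsave %0" : "=m"(*area) : : "memory");
    return 0;
#else
    (void)area;
    return -1;
#endif
}

/* Restore x87/SSE state from *area. Returns 0 on success, -1 if unsupported. */
static inline int fpu_state_restore(const fpu_area_t *area)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("fxrstor %0" : : "m"(*area) : "memory");
    /* Assumption: in a kernel module, preemption is re-enabled after this. */
    return 0;
#else
    (void)area;
    return -1;
#endif
}
```

A caller would bracket its SIMD code with `fpu_state_save(&area)` and `fpu_state_restore(&area)`. Saving only the registers the SIMD code actually clobbers, as suggested above, would need a more selective mechanism than these two instructions.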
I'm sorry to interfere with the discussion (as a long time ZoL user, I would totally love to have it mainlined, by the way, if it is actually legally feasible to do so), but have you considered evaluating the MPL v2.0? I'm not a lawyer, but one of its best features is that it's fully compatible with GPL v2 (see section 3.3), and, given that the CDDL was fundamentally just a modified version of the MPL v1.4, it shouldn't be that different for the BSDs and Illumos compared to the current situation, or am I wrong?
That being said, none of this (including the legal review) is going to happen unless mainline Linux is willing to talk and explore possible ways of enabling constructive collaboration.
Pretty bold statements in this thread. Keeping ZFS out of mainline is not a bad thing, at the very least for the reasons you've stated, @ryao. Not sure how ceding further control to an increasingly hostile and centralized governance is beneficial for anyone.
the CDDL was fundamentally just a modified version of the MPL v1.4
Wasn't it MPL v1.1?
@jamie-arcc sorry, it was a typo, I meant MPL 1.1. There is no such thing as "MPL 1.4".
Since mainline was not willing to talk, there is no point in trying to pursue this. I am closing it.
Hi there. I hope some day ZFS will be included in mainline. I am tired of waiting for a new ZFS release before I can update my kernel. ZFS is the best filesystem in the world, and this bureaucracy must be resolved. Let's force it somehow. If Linus includes firmware blobs, why is the CDDL an issue? ZFS must be in the kernel tree!
The kernel has nothing to do with ZFS not being upstream. The problem is that the licences of both are incompatible and unless you get thousands of potential copyright holders to agree to re-licence ZFS under a different, GPLv2 compatible licence, it's not possible without opening the kernel to litigation. This means it will never happen, unfortunately.
unless you get thousands of potential copyright holders to agree to re-licence ZFS under a different, GPLv2 compatible licence
... or get thousands of potential copyright holders to agree to relicense Linux under a different, CDDL-compatible license 😉
Projects can and do use multiple licenses. There really isn't an exclusion here. The problem is political, hence has no solution.
Projects can and do use multiple licenses. There really isn't an exclusion here. The problem is political, hence has no solution.
Yes, of course that's possible, but it's not only due to politics unfortunately. The option of dual licencing would have been great but sadly that ship sailed a long time ago. Unless you get every single contributor to sign a copyright waiver, in order to change the licence of any project you have to identify code ownership and then track down every single author since day one, a herculean task for very large projects. Some people may have passed away, and businesses may have closed down, making it impossible to find who owns an IP after years or decades. So yes, politics play a role but there's a big practical side to it too.
Recent events WRT Linux 5.0 have made me reconsider user requests to pursue mainline inclusion. Linus Torvalds told me in person in 2014 that he requires a sign-off from Oracle to merge the code. That is not happening, but it occurs to me that it should be possible to replace all Oracle copyrighted kernel code with new code over a long period of time (several years). This would bypass the need for Oracle’s sign-off. It would also give us the ability to switch the kernel module to a dual CDDL/GPL license, which would allow us to make peace with certain individuals who dislike non-GPL kernel code.
ZFS is fairly modular, so replacement could be done piece by piece. This would risk losing the patent grants afforded to us by the CDDL (although OIN might cover us on Linux). We would likely want to maintain two copies of certain key code in the places where a patent grant is required. This would be a maintenance burden, but the balancing act would likely help us on Linux while maintaining the suitability of the main repository as the upstream for the other OpenZFS implementations. Of course, there is no point in putting ourselves through that unless we are allowed to become an official part of mainline, so that is subject to getting some sort of agreement with mainline.
As for what that would look like, I envision the current form of development continuing with expanded regression test infrastructure. The current repository would allow building either a CDDL version with the bits we still use or a CDDL/GPL version. We currently are able to merge our code into a Linux source tree and we would basically polish that to create the mainline version. The dual licensed version of the module would be what goes into mainline while the person who volunteers to be the mainline maintainer would do regular resyncs from our stable branch. All pull requests to the mainline driver would go through this repository and any changes made by mainline to the kernel API would be replaced with our compatibility shims in the pull requests to keep the code in sync.
Some people at mainline might not like the presence of compatibility shims in the code, but it is not unprecedented. KVM had shims for quite some time to allow newer versions to be built on older kernels. XFS has its own compatibility shims from its origins as a port of a filesystem driver from IRIX. If we look, we probably can find other examples.
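To make it concrete what a compatibility shim of this kind looks like, here is a minimal sketch. Everything in it is hypothetical: `kapi_lookup()` stands in for a kernel function whose signature changed between releases, and `HAVE_LOOKUP_WITH_FLAGS` stands in for a macro that a configure-time check would define. It is not taken from the ZFS, KVM, or XFS sources.

```c
#include <string.h>

/*
 * Hypothetical stand-in for a kernel API whose signature changed
 * between releases. In a real build this would come from the kernel
 * headers, and HAVE_LOOKUP_WITH_FLAGS would be defined (or not) by a
 * configure-time check against the target kernel.
 */
#ifdef HAVE_LOOKUP_WITH_FLAGS
static int
kapi_lookup(const char *name, unsigned int flags)
{
	(void) flags;
	return ((int)strlen(name));
}
#else
static int
kapi_lookup(const char *name)
{
	return ((int)strlen(name));
}
#endif

/*
 * The shim. Portable code only ever calls compat_lookup(), so the
 * difference between kernel versions is confined to this one place.
 */
static inline int
compat_lookup(const char *name)
{
#ifdef HAVE_LOOKUP_WITH_FLAGS
	return (kapi_lookup(name, 0));
#else
	return (kapi_lookup(name));
#endif
}
```

With a wrapper like this, a change made by mainline to the call site could be rewritten in terms of `compat_lookup()` when it is pulled back into the repository, so both trees keep building from the same source.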
We would need to stabilize the /dev/zfs API for that, and if we can get it under the terms of a dual CDDL/GPL license, I suppose my old work from ClusterHQ could be completed to do that. We would need to expand our architecture support to include all of the mainline architectures, even if we only officially regression test a few. We would also need support for importing the root pool at boot, but this is doable (with an initramfs) by implementing import logic in the kernel module. As for replacing various components, there are already replacements for some:
I do not see us replacing the header files, but outside of macros that are too trivial for copyright protection I do not think this is a problem as they do not count as code. Honestly, I would prefer it if we could receive exemptions from mainline to continue using C99 plus the Sun code style, but if not, we could evaluate semantic patching for resyncs with mainline. However, that would make it a pain for downstream kernel maintainers to cherry pick patches and for them to send patches back.
The userland code would remain under the CDDL and we would continue using Sun/Oracle code there. We would likely want to port the kernel components to userspace for ztest to ensure that we can properly test them. That version of ztest would be encumbered such that we cannot redistribute it in binary form, but it would just be for regression testing on the buildbot, so it ought to be fine.
There would still be an enormous amount of code to rewrite and doing that in a mature codebase without introducing regressions would be best done slowly to avoid negatively affecting development. I imagine rewriting ZIO and SPA would be especially challenging. Ignoring those difficulties (that likely could be resolved over the long term), we would need the following for this to be feasible:
Anyway, that is what I am thinking. Comments are welcome. I have already emailed Linus with an outline of the idea. If he says no, then we can honestly tell users that the decision to exclude ZFS from mainline was made by mainline rather than ourselves or Oracle. That would have the silver lining of allowing us to kill the misconception that Oracle can unilaterally relicense the code, which completely ignores the reality that the rest of us must also agree.