ubports / ubuntu-touch

Ubuntu Touch's issue inbox is now migrated to GitLab.
https://gitlab.com/ubports/ubuntu-touch

[RFE] Automated Kernel CVE Patching #1566

Open SkewedZeppelin opened 4 years ago

SkewedZeppelin commented 4 years ago

Description of the feature

I created and maintain a CVE patcher program and a repository of Linux CVE patches. It is primarily for use with my project DivestOS. However, with few changes it can be used with any similar project.

This is more to start a discussion about possible implementation. Questions/Feedback welcome.

Links:
Program (GPLv3): https://github.com/Divested-Mobile/cve_checker
Patches (GPLv2): https://github.com/Divested-Mobile/kernel_patches
Patch List (GPLv2): https://raw.githubusercontent.com/Divested-Mobile/kernel_patches/master/Kernel_CVE_Patch_List.txt
Known Patch Incompatibilities (GPLv3): https://github.com/Divested-Mobile/DivestOS-Build/blob/master/Scripts/Common/Fix_CVE_Patchers.sh

Illustrations

https://gist.github.com/SkewedZeppelin/ad292d9e7dd3cd873805d2587670717a
https://gitlab.com/divested-mobile/divestos-build/-/tree/master/Scripts/LineageOS-14.1/CVE_Patchers
https://gitlab.com/divested-mobile/divestos-build/-/tree/master/Scripts/LineageOS-15.1/CVE_Patchers
https://gitlab.com/divested-mobile/divestos-build/-/tree/master/Scripts/LineageOS-16.0/CVE_Patchers
https://gitlab.com/divested-mobile/divestos-build/-/tree/master/Scripts/LineageOS-17.1/CVE_Patchers
https://gitlab.com/divested-mobile/divestos-build/-/tree/master/Scripts/LineageOS-18.1/CVE_Patchers
https://gitlab.com/divested-mobile/divestos-build/-/tree/master/Scripts/LineageOS-19.1/CVE_Patchers
https://divestos.org/index.php?page=patch_levels#devices

Other Discussions:

Flohack74 commented 4 years ago

I don't think this will work on Android kernels; you will get a ton of conflicts each time. Google chose to diverge from mainline kernels a lot, and on top of that vendors applied incomplete patches or applied them wrongly, so even with simple patches one already struggles with manual patching. I cannot imagine any automatic tool being able to sort all of this out. Also, automatic patching could introduce issues for a device that go undetected for some time. Our QA team would need to test each device manually for the sake of stability every time the patcher kicked in. I can see there are scripts for each device; that's a lot of maintenance. I can see the motivation for this idea, but honestly this will be too much for our small project.

SkewedZeppelin commented 4 years ago

It very much does work because I've been using it in my DivestOS project for the past three years.

The program has explicit support for AOSP workspaces. The patch database contains a vast amount of AOSP and Qualcomm specific patches in addition to mainline Linux ones.

Flohack74 commented 4 years ago

Ok, please demonstrate against these 2 kernel repos: https://github.com/ubports/android_kernel_oneplus_msm8974 and https://github.com/Flohack74/android_kernel_huawei_angler :)

SkewedZeppelin commented 4 years ago

There are 4 UBPorts repos here including oneplus_msm8974. https://gist.github.com/SkewedZeppelin/ad292d9e7dd3cd873805d2587670717a Those are ubp-5.1 branches. Any other branches you want?

Here is a patcher for Flohack74/android_kernel_huawei_angler/halium-7.1 https://gist.github.com/SkewedZeppelin/16de49aa8725b69141abce82a32f173c

fredldotme commented 4 years ago

Would this also work for kernel repos where, instead of merging individual commits, I applied the patch from kernel.org onto the kernel repo?

SkewedZeppelin commented 4 years ago

@fredldotme can you clarify/elaborate what you mean?

If you mean incrementals, it has support for that but it makes no attempt to resolve conflicts.

fredldotme commented 3 years ago

If there is a way to tell the script(s) to ignore/revert conflicting patches then I might consider it for the 4.4 based suzu (Xperia X) kernel as a test-drive. AFAICT it just generates a script which then tries to apply the necessary CVE patches. Shouldn't be too hard to add an || git reset --hard HEAD to it if fails.

SkewedZeppelin commented 3 years ago

Shouldn't be too hard to add an || git reset --hard HEAD to it if fails.

The program outputs a script which only contains patches that were checked to apply successfully.

https://gist.github.com/SkewedZeppelin/e3b465bc9ed02f7127992e910580a9d3
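
A minimal sketch of the "only emit patches that were checked to apply" idea described above. This is not the actual cve_checker implementation: the function names and the injectable `applies` callback are hypothetical, and the only real command assumed is `git apply --check`, which dry-runs a patch without modifying the tree.

```python
import subprocess
from typing import Callable, Iterable, List


def git_apply_check(kernel_dir: str, patch_path: str) -> bool:
    """Return True if `git apply --check` reports that the patch applies cleanly."""
    result = subprocess.run(
        ["git", "apply", "--check", patch_path],
        cwd=kernel_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def build_patcher_script(
    kernel_dir: str,
    patches: Iterable[str],
    applies: Callable[[str, str], bool] = git_apply_check,
) -> List[str]:
    """Emit shell lines only for patches that pass the dry-run check."""
    lines = ["#!/bin/bash", f'cd "{kernel_dir}"']
    for patch in patches:
        if applies(kernel_dir, patch):
            lines.append(f'git apply "{patch}"')
    return lines
```

Note that this sketch checks every patch against the unmodified tree. Patches that depend on one another would need the accepted ones applied before the next check, so a generated script can still fail partway through.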

UniversalSuperBox commented 3 years ago

Hey @SkewedZeppelin, I think this is really amazing work from what I've seen so far.

Could you walk me through how you use this software from start to finish? For example, is Kernel_CVE_Patch_List.txt collected manually then fed into the program with a link to a repository to create the CVE patcher script? How do you then run the CVE Patcher script? Is all of this automated in a way we can see?

I've been looking through the DOS build scripts and it appears that the CVE patchers in https://github.com/Divested-Mobile/DivestOS-Build/tree/master/Scripts/LineageOS-17.1/CVE_Patchers (for example) are generated by cve_checker, then committed to git. Is that the case? What am I missing here?

I'm pretty sure that this is an absolute must to implement if we can get it working.

SkewedZeppelin commented 3 years ago

@UniversalSuperBox

I've just done a big refactor to make the tool easier to use outside of the DivestOS build scripts. I've also added some basic help/commands to the README. It should be enough to get you started. If you have any questions, feel free to ask.

SkewedZeppelin commented 2 years ago

It has been over a year since this issue was opened. I have updated this gist and the numbers are horrifying. https://gist.github.com/SkewedZeppelin/ad292d9e7dd3cd873805d2587670717a

If your focus is on mainline devices like the Pinephone, then so be it. But if people are daily driving these older devices, then something needs to be done.

UniversalSuperBox commented 2 years ago

Hi @SkewedZeppelin, thanks for your feedback. I agree that something should be done to improve the security of devices that Ubuntu Touch ships on. I'm very happy that you were able to show us the output via a PR to one of our projects. I also know that I said a year ago that "this is a must to implement if we can get it working." However, after reading the changes that your script has made in https://github.com/HelloVolla/android_kernel_volla_mt6763/pull/10, I see that this approach is very difficult to recommend as the default way we maintain our kernels. This is for a few reasons.

You could toss in a complete remote code execution vulnerability through a combination of multiple patches in your repository, and no one would know. No one would see when reviewing each patch individually. In a +5000-2000 review, it'd be missed entirely. Not that I don't trust you, but I also don't want to have to. Also, combining all of the patches into a single commit makes it impossible to see why the change was made. It's also impossible to revert only a single patch if it causes a problem. The script should be bringing in commits so they have the same content (metadata and diff) as kernel.org.

Additionally, most of the changes that the tool has made are outside of the code built by the defconfig for the phone. An ARM64 device does not need CVE patches from x86 or s390. These changes do not need to be merged into the kernel tree at all, and are only noise for the review process. The script should learn how to tell if a CVE likely applies to a device and decide whether to patch it based on that.
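
One rough way such a relevance filter could work is to look at which files a patch touches and drop patches that only modify another architecture's `arch/` subtree. The sketch below illustrates that idea only; it is not something cve_checker is known to do. The function names and the default set of kept architectures are assumptions, and a check based on the device's defconfig would be more precise than this path test.

```python
import re
from typing import Iterable, Set

# Matches the "diff --git a/<path> b/<path>" header line of a git-format patch.
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def touched_paths(patch_text: str) -> Set[str]:
    """Collect every file path named in the patch's diff headers."""
    paths: Set[str] = set()
    for old, new in DIFF_HEADER.findall(patch_text):
        paths.update((old, new))
    return paths


def is_relevant(patch_text: str, target_archs: Iterable[str] = ("arm64", "arm")) -> bool:
    """Return False only if every touched file is under a foreign arch/<name>/ tree."""
    keep = set(target_archs)
    paths = touched_paths(patch_text)
    if not paths:
        return True  # Cannot tell, so keep the patch for manual review.
    for path in paths:
        parts = path.split("/")
        # Shared files such as arch/Kconfig have only two components and are kept.
        foreign = len(parts) > 2 and parts[0] == "arch" and parts[1] not in keep
        if not foreign:
            return True
    return False
```

The filter errs on the side of keeping patches: anything touching shared code (for example `net/` or `fs/`) stays in, and only patches confined to trees like `arch/x86/` or `arch/s390/` are dropped.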

There's another level of review, and that's the review that must be done to make sure the patches applied don't affect other patches in the kernel tree. Not just by conflicting, but by causing unexpected behavior. The script cannot learn how to determine this, it must be a manual review. If manual review fails on the whole set, it must be bisected or, more likely, reapplied patch-by-patch. It's also important to see if a CVE is exploitable and ensure that it is not exploitable after the patch is applied. This leads back to the manual check-review-test-merge-recheck process which is how long-term supported Linux distributions maintain their software with security fixes.
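
The bisection step mentioned above can be sketched as a binary search over the ordered patch series. This is an illustration only: `prefix_is_good` is a hypothetical callback standing in for "apply these patches, build, flash, and test the device", which is the expensive manual part. The search also assumes that once a prefix of the series is bad, every longer prefix is bad too, which will not hold if a later patch happens to fix an earlier regression.

```python
from typing import Callable, Optional, Sequence


def first_bad_patch(
    patches: Sequence[str],
    prefix_is_good: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Return the index of the first patch whose inclusion breaks the tree.

    Assumes the empty series is good and that badness is monotonic:
    once a prefix fails, all longer prefixes fail as well.
    """
    if prefix_is_good(patches):
        return None  # The whole series passes; nothing to find.
    lo, hi = 0, len(patches)  # Prefix of length lo is good, length hi is bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if prefix_is_good(patches[:mid]):
            lo = mid
        else:
            hi = mid
    return hi - 1
```

With a series of N patches this needs roughly log2(N) test rounds instead of N, but each round is still a full build-and-test cycle on real hardware.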

For these reasons, I would not merge the output generated by this script today. Nor would I run the script myself. Correcting the patch metadata and ignoring useless patches is possible and might not be too hard of a problem. Doing the review is likely impossible for an automated system, and that's where most of the work in security patching comes from anyway. What I'm most afraid of is that we'll have a false sense of security, leading people to believe that we are safe from the given CVEs even if we are not. A false sense of security is worse than a true, known hole in my mind.

What's worse, we will have spent a lot of time to create that false sense of security rather than actually fixing the underlying problems. There are a couple of ways to fix the underlying problems, too. One is even tractable.

The first way forward is to "upstream" the downstream kernel (scare quotes used for a reason). This process involves taking kernel.org's latest release for the series and re-applying Mediatek and Volla's patches until it works correctly again. Then it is possible to continue rebasing the patchset on top of the latest LTS releases. For 4.4, those will continue for a while. This certainly isn't easy, but it's a mostly known quantity of work.

The second is to mainline the device. Obviously this is not a tenable solution for most devices.

It appears that you have noted all of these problems in your documentation for cve_checker. You rightly ask us, then, if the benefits outweigh the risks. Given the output I see today, I'm not sure they do. There is a very real risk that the automated tool creates more or worse security issues than it resolves, and it does not get rid of the real work in maintaining security fixes for downstream kernels.

SkewedZeppelin commented 2 years ago

I'm very happy that you were able to show us the output via a PR to one of our projects.

PR solely for demonstration purposes. I'd rather see the patches git am'ed and selectively applied by hand to filter unrelated patches.

You could toss in a complete remote code execution vulnerability through a combination of multiple patches in your repository, and no one would know

The process to generate the repository on your own is documented in the program README.

most of the changes that the tool has made are outside of the code built by the defconfig for the phone

See first comment.

It's also important to see if a CVE is exploitable and ensure that it is not exploitable after the patch is applied

How do you propose to do this? That is an enormous undertaking requiring staffing and infrastructure.

I would not merge the output generated by this script today

I wouldn't either, see first comment.

we will have spent a lot of time to create that false sense of security rather than actually fixing the underlying problems.

More of a false sense of security than running completely insecure kernels?

This process involves taking kernel.org's latest release for the series and re-applying Mediatek and Volla's patches until it works correctly again. Then it is possible to continue rebasing the patchset on top of the latest LTS releases.

This would be an excellent start, but what about the other devices you list on your promoted devices page?

For 4.4, those will continue for a while.

4.4 is EOL come January, a while in my book is not 3 months. Unless you plan on rebasing onto the CIP 4.4 branch or have another source in mind.

The second is to mainline the device

This is simply not an option today. Yes there are some mainlined devices, but none of them are daily drivable.

Given the output I see today, I'm not sure they do.

Maybe that should be up to your users instead. For all the downsides that my program has, I don't see anyone else bringing anything to the table that accomplishes what it does. Nor do I see anyone else wanting to implement improvements to it.

--

At the very least I ask you to rebase your trees onto the newer LineageOS branches where available. They've been conservatively patched by hand.

Lastly, here are two repos that have taken the approach in the first comment: https://github.com/lin18-microG/android_kernel_oneplus_msm8996/commits/lin-18.1-mse3, https://github.com/lin18-microG/android_kernel_sony_msm8974/commits/mse_v1

UniversalSuperBox commented 2 years ago

How do you propose to do this? That is an enormous undertaking requiring staffing and infrastructure.

Precisely. What you are proposing is a band-aid over a buckshot wound inflicted with a 10-gauge shotgun fired from the hip at a range of 5 feet. What is really needed to fix the insecure device problem is not a band-aid. The primary work required to patch the kernel is not finding the patches, it's the testing to confirm the CVE is fixed and there weren't any regressions in the huge body of software running on top.

I think that makes me more than a little defeatist. What you've created is a great repository of patches that could be required by a given kernel repository. That's invaluable in and of itself. But you haven't removed the requirement of applying all the patches, testing the result, and bisecting the lot if it breaks.

Your argument seems to be that we should be doing that work, as a responsibility to the people who use Ubuntu Touch. In that case I agree, but I don't know who will actually do the work. We can work on making Ubuntu Touch exist and grow at all, or we can work on security patching what is becoming increasingly older until no one cares any more and it dies out. Doing both would be ideal (except the dying out part). That is not the reality we exist in.

What we need is not a system that automatically adds more work to our plates, it's someone to help with the work. If you want to be the person who git ams a whole lot of patches and tests them on the device in question, I would be so, so happy to have you and help you find everything you need. We can even talk about how we can get people excited about security patching for their older devices, so neither of us has to do it. But I think that any solution that involves running an automated script to take a kernel in a known state and put it into an unknown state, then ship that, is a non-starter.

At the very least I ask you to rebase your trees onto the newer LineageOS branches where available. They've been conservatively patched by hand.

Also a great idea. My next six months are pretty well scheduled, though, how about yours? Or anyone else reading this ._.

SkewedZeppelin commented 2 years ago

eh, I just realized you can't even git merge the Volla repo with AOSP kernel-common, as it has squashed commit history.

Fuseteam commented 2 years ago

i am reading this, but it feels above my current skillset. i'll leave this comment (for now) to bump this issue up for the more skilled with available time to see it. there should be a structural way we can tackle this issue

Flohack74 commented 2 years ago

I think this will never work, tbh :)

SkewedZeppelin commented 2 years ago

never work

Have I not repeatedly proved it can work? It boots on 70+ devices.

And I've put a lot of work into the database the past year to minimize the number of diffs so that it can be git am'ed, see here these:

I kindly don't understand why anyone maintaining/using ancient kernels wants to ignore this program. I see no one else providing anywhere close to a solution for them. It only directly hurts your users.

edit: Point me to a kernel and branch and I'll make another PR, not squashed. Are these ones fine from the flagships on website?

Fuseteam commented 2 years ago

Have I not repeatedly proved it can work? It boots on 70+ devices.

If booting was the only worry, I'm sure we would have implemented this already, it is much much more than that.

one example: if a kernel bug breaks call audio, that has to be caught before someone fails to call an ambulance in an emergency. we cannot have some user critical function breaking under any circumstance ;)

perhaps an example is in order: the postmarketOS developers are doing an outstanding job at mainlining android devices, they have over 200 of them already booting on mainline linux. However, as wonderful a job as they have done, as of right now, 4 years in the making, not one is ready as a daily driver device

I kindly don't understand why anyone maintaining/using ancient kernels wants to ignore this program. I see no one else providing anywhere close to a solution for them. It only directly hurts your users.

I think that's the real issue, we currently aren't really maintaining atm because there is so much work involved and so few volunteers to do that work. there is even some talk now and then about whether support for certain devices should be dropped, even some of the devices you've listed, ironically enough.

But let's put this in perspective: there are a handful of developers (literally, you can count them on one hand) who are maintaining the entire graphics stack, the entire build infrastructure, all 400 repositories across github and gitlab, managing pull requests, triaging issues, debugging bugs in the operating system, upgrading from 16.04 (yes ubuntu touch is still based on 16.04) to 20.04, migrating all the repositories to one platform (gitlab), moderating all community groups, informing the community of progress on various fronts and much more.

so i agree with florian

I think this will never work, tbh :)

without a proper structural approach this simply won't work in the long term

speaking of which @flohack74, I've been thinking the past few days that perhaps the majority of devices could be done by the maintainers as well......maybe

SkewedZeppelin commented 2 years ago

If booting was the only worry,

It is tested working on 17 devices, and reported working by users on 38 devices. That means all expected functions. I have even personally tested 911 calls on multiple of them.

we cannot have some user critical function breaking

Flip this around! What happens when one of your users ends up with malware exploiting one of these issues and ends up with an empty bank account or their documents and photos held hostage by ransomware?

them already booting on mainline linux

But many of these actually cannot make calls.


Three of your flagship devices aren't even patched against wide-spread, well-known, covered in the media issues like DirtyCow along with no encryption support (#178 ) yet still write on the home page:

We are building a secure & private operating system for your smartphone

Kindly if you want to keep shipping these awful kernels please close this and do a writeup informing your users on the dire situation.

Fuseteam commented 2 years ago

It is tested working on 17 devices, and reported working by users on 38 devices. That means all expected functions. I have even personally tested 911 calls on multiple of them.

if you can reliably test and can get it tested, i would encourage you to document that if or when such PRs are submitted. If multiple users (ideally users that actually depend on the phone working as a phone) can attest it doesn't break anything, there's nothing preventing us from merging such PRs. the main point is, and always is, that the few devs working on/with everything cannot and should not take on more work; we've already had someone (read multiple people) burn out from all the stress and workload (including dalton, who you can see hasn't reacted)

Flip this around! What happens when one of your users ends up with malware exploiting one of these issues and ends up with an empty bank account or their documents and photos held hostage by ransomware?

this is fair and the reason i revived this issue, my main point is ideally we have fully functional phones that do not regress due to automated patching

But many of these actually cannot make calls.

that's my point, it was a reaction to you saying

It boots on 70+ devices.

Booting is just one milestone, in fact it is the very first milestone you have to reach when porting; the real work starts after you've got it booting

Three of your flagship devices aren't even patched against wide-spread, well-known, covered in the media issues like DirtyCow along with no encryption support (https://github.com/ubports/ubuntu-touch/issues/178 )

honestly that flagship page should be updated. word of mouth says the current flagships are the pixel 3a, the vollaphone and uh what else 🤔 i don't think i can name 5 hmmmm i would point to devices.ubuntu-touch.io but out of the top 5 the mi a2 has no maintainer afaict, and the BQs may not even be supported for long (tho that isn't known yet, we still don't know if they will make the cut in the long term) and that list might need to be cut down to about 45 (as those are installable with the ubports installer, the easiest and recommended entry point we have for new users)

Kindly if you want to keep shipping these awful kernels please close this and do a writeup informing your users on the dire situation.

the situation is much more dire than just outdated channels if we're honest, we're currently working on updating from a 6 year old base to a 2 year old base, this is including on devices with newer kernels, that work can be seen here: https://gitlab.com/groups/ubports/development/core/-/boards

Long story short, if you are willing/able to submit pull requests against specific devices and enough users can attest on that pull request that the device is still fully functional, by all means please do. I'm sure those pull requests will be merged provided enough evidence that there are no regressions.

If you have a telegram account i would also like to invite you to join us in our porting group @ubports_porting, to help us convince and motivate the various maintainers and porters of various devices to also look into fixing and securing their kernel (there are 30+ porters....very rough estimate.......and far fewer maintainers)

tldr; we need manpower, we ain't got enough and we're painfully aware of that but we ain't giving up.