PerformanC / ReZygisk

Transparent implementation of Zygisk.
GNU General Public License v3.0
516 stars 54 forks source link

[FR]: Re-write zygiskd to C99 #8

Closed ThePedroo closed 3 weeks ago

ThePedroo commented 5 months ago

Description

Currently zygiskd is in Rust, this FR is to track the conversion of it to C99.

Reason

Rust performs (far) worse than C in terms of efficiency, performance. Not only that, but adds unnecessary complexity trying to make the rustc happy, which ends up with a codebase that is harder to mantain.

C99 is a standard that will never change, and ensures it lasts forever, in the other hand, Rust is constantly changing, and requires us to follow up with its design.

Confirmations

Code of Conduct

chiteroman commented 4 months ago

C99? Why not C17 or C23?

ThePedroo commented 4 months ago

Well, since I'll be the only one working on it, I preffered to pick the standard I am most confortable with, which is C99 as I have been using it in all previous C projects

ThePedroo commented 4 months ago

Just an update on this, I've been actively working on it recently, and it's already almost done!

I'll be pushing into a branch soon! Soon make it fully working to get that reduction of 93% zygiskd weight.

aviallon commented 3 months ago

Just for the record, why does Rust perform far worse than C here? I honestly do not understand how there could be such a difference.

For the C standard though, I actually do think you should use C11 at least, as threading is undefined behaviour in C99 (it had no memory model).

ThePedroo commented 3 months ago

Zygiskd is a daemon that is daemon which is constantly used in devices that use Zygisk, everytime an app opens, it is notified and gives information about the app (if it's root, if must "umount"/is in denylist, etc), so it must be both lightweight (battery-wise) and swift, which Rust doesn't get near C unless we talk about making tons of micro optimizations and cost the safety of the software (use unsafe)

About threading, it's not that it's undefined behavior, it just doesn't exist in its standard library.

We could worry about that if we were talking about a software that runs on Windows (not so much, PerformanC CThreads project exists, allowing the use of threading with almost no overhead), which is not the case here, allowing us to use pthread flawlessly.

33KK commented 3 months ago

I really don't see how this is the case, Rust is NOT slower than C, usually other way around, due to allowing for better optimizations and using static linking. Even then, the difference in performance is more than negligible, checking if the app has root is not a heavy task and won't impact UX in any way if it's 1ms faster or slower. The "robust" argument is IMO just comical, the only thing C is robust at is creating massive amounts of memory management related vulnerabilities, race conditions and the like. Only real difference would be that binaries are larger, though it's under 1MB (edit: actually seems to be <350KB) in size at the moment, which is entirely a non-issue IMO. 350KB is negligible on a Java-based OS, both in terms of storage and RAM.

As for the complexity argument, the C codebase is already larger also:

u0_a509@localhost ~ [124]> wc -l ReZygiskC/zygiskd/**/*.c ReZygiskC/zygiskd/**/*.h
  125 ReZygiskC/zygiskd/src/companion.c
   58 ReZygiskC/zygiskd/src/dl.c
   95 ReZygiskC/zygiskd/src/main.c
  144 ReZygiskC/zygiskd/src/root_impl/apatch.c
   60 ReZygiskC/zygiskd/src/root_impl/common.c
   54 ReZygiskC/zygiskd/src/root_impl/kernelsu.c
  317 ReZygiskC/zygiskd/src/utils.c
  722 ReZygiskC/zygiskd/src/zygiskd.c
    6 ReZygiskC/zygiskd/src/companion.h
   57 ReZygiskC/zygiskd/src/constants.h
    6 ReZygiskC/zygiskd/src/dl.h
   14 ReZygiskC/zygiskd/src/root_impl/apatch.h
   23 ReZygiskC/zygiskd/src/root_impl/common.h
   14 ReZygiskC/zygiskd/src/root_impl/kernelsu.h
   43 ReZygiskC/zygiskd/src/utils.h
    6 ReZygiskC/zygiskd/src/zygiskd.h
 1744 total
u0_a509@localhost ~> wc -l ReZygisk/zygiskd/**/*.rs
   73 ReZygisk/zygiskd/src/companion.rs
   51 ReZygisk/zygiskd/src/constants.rs
   85 ReZygisk/zygiskd/src/dl.rs
   45 ReZygisk/zygiskd/src/main.rs
  141 ReZygisk/zygiskd/src/root_impl/apatch.rs
   77 ReZygisk/zygiskd/src/root_impl/kernelsu.rs
  124 ReZygisk/zygiskd/src/root_impl/magisk.rs
   78 ReZygisk/zygiskd/src/root_impl/mod.rs
  264 ReZygisk/zygiskd/src/utils.rs
  354 ReZygisk/zygiskd/src/zygiskd.rs
 1292 total

Rust code is much easier to understand, because the language and stdlib are much cleaner (no strcmp, strtok and all the other libc mess, safe concurrency, proper error handling, compile time checks), there's a real ecosystem of crates usually with great documentation rather than everyone reinventing the wheel and introducing new bugs together with it. I get it if you just prefer C, but your arguments in favor of using it just are unreasonable IMO. I'd personally prefer for as much as possible of Zygisk to be written in a memory safe language even if unsafe is used where necessary, as that still is an improvement over C in terms of security.

ThePedroo commented 3 months ago

The arguments generally used to say "Rust is faster than C" are never true. Rust has came since its early releases with numerous approaches of giving more information about variables, pointers, functions, which would allow the compiler to better optimize it.

However, when we think of C, 99% of those are also available on it, though not used. The FACT that C has been around for far more decades than Rust makes it a language that has well-matured optimizations and compilers (gcc, clang being the most famous)

Yep, I have seen a thread where people made a Rust program (really basic) faster than a C one, however:

While a program written in C, which no one focused in optimizing, was a hard task to beat in performance.

Yeah, the argument of memory safety can be true (the extensiveness of that truth is limited, see rust cve repository, with blazing fast CVEs written in 100% safe Rust), especially in this case which is hard/not possible to use Valgrind (for anything else, that argument is futile to be made in favor of not using C. We have current compiler checks for more basic issues, and static analyzers (gcc's -fanalyzer and clang analyzer too)), however you aren't really understanding how those can affect the program, and is just judging by "generic" consequences.

Memory leaks are a big issue when they happen when dealing with unsafe values, they allow the attacker to run untrusted code in the environment. However we are talking about zygiskd here, it's not a software that is displayed to untrusted party. Sure, libzygisk.so, injected in Zygote, which is then forked, communicates with it, but by the time the app really runs, the existence of zygiskd is unknown. (or else Zygisk would be easily detectable)

Sure, "not only memory leaks are memory bugs, buffer overflows exist too, among others" (No one said this, but let's imagine someone did). One of the main CVEs in C software are related to dynamic allocation of memory, e.g. use after free. I know pretty well of the dangers of such, and due to that, in ALL softwares in C I've written so far, I've reduced the amount of mallocs possible, to ensure we can reduce a great amount of security issues.

Also, robustness in C is not a lie or anything, it is real. No wonder why people are still able to use C99 or even C89 nowadays. Besides, its syntax and standards are robust which allows a C code to live a really long time.

Moving to the topic of higher complexity: Well, you aren't being fair here. The Rust zygiskd uses numerous libraries, which allows its codebase to be smaller (~+5 libraries and the amount of COL being so small compared to the entire size is kinda of ???), which in Zygiskd C there is no libraries.

"It is bigger anyway!" Sure, it is, but sorry to announce: The complexity of the syntax of C is far easier which allows better understanding of the codebase. It also will match when the rest of the ReZygisk project moves to C, which then won't have this difference of C++ and Rust or C and C++.

Also, I have to point out the benefits of using C here are beyond user-wise. Our CIs are ~40% faster in zygiskd C99 branch. That is futile for a lot of people, sure, but when we imagine our CI minutes are limited, and ReZygisk isn't the only project of the organization (and hopefully will get a lot of contributions), this WILL be an issue in the future.

I hope you understand the reasons behind such action. I'm not a newbie programmer, I've been here for a really long time, I know what I am doing. And a really friendly reminder: We're all in the same side, fighting for the same thing, aren't we? What matters is to achieve what we want: Awesome FOSS solutions to allow free use of our device without having to rely on proprietary programs (or modules). ❤️

(Sorry if there's any grammar error, typing this through my phone is a nightmare)

33KK commented 3 months ago

blazing fast CVEs

Most are non-critical, almost none are code execution. The use after free CVE list for 2024 took a while to scroll through tho. Good amount of buffer overflows and such too.

They needed an entire community to micro-optimize the code

That feels very cherry picked and irrelevant in the first place, zygiskd isn't doing much and whatever nanoseconds of difference Rust vs C may make in whichever direction really don't matter here. Rust is definitely not "far" worse than C in terms of either efficiency or performance, whatever benchmarks you're basing this opinion on are most likely misleading.

Also, I have to point out the benefits of using C here are beyond user-wise. Our CIs are ~40% faster in zygiskd C99 branch.

That is a fair point I guess, but does it matter that much? There are like 10 runs per week. GH Actions don't have any limits either.

sorry to announce: The complexity of the syntax of C is far easier

Sure, the language has less features, but that means nothing in terms of actual difficulty of understanding the code. Less language complexity does not necessarily translate into simpler code, often the opposite. libc APIs are more complex, unergonomic, easy to misuse than Rust std. Rust actually handles strings sanely, has proper error handling, pattern matching, and a lot of other things which add complexity to the language but make the code simpler.

Zygiskd C there is no libraries

Well of course, because using libraries in C is a pain, and half the libraries that are used, duplicate functionality for that same reason too. It's really not a benefit of C, if anything it's a negative. You can just not use crates for things that are easy enough to reimplement, like that a patch root_impl csv parser that carries serde and csv crates could just be a simple split and a pattern match. Also tokio+futures is in Cargo.toml but unused. Could probably get rid of a few more dependencies, but there's no good reason to do so.

We have current compiler checks for more basic issues, and static analyzers

Can only catch so much in a language never designed for static analysis.

Awesome FOSS solutions to allow free use of our device without having to rely on proprietary programs (or modules).

I totally would rather use an open source C implementation rather than a proprietary Rust one, but I'd also prefer for privileged code to not use languages that make it extremely easy to make major mistakes. I guess this isn't really that important when the injected code is C++, ideally that should be the most hardened part, but it's the biggest part of the code base, and Rust is way less similar to C than C++ is, plus there are some lsposed dependencies which also would need to be ported, so that seems like quite an effort and probably not something anyone is willing to do. Though the dependencies being C++ is still an issue when porting to C, so I guess it wouldn't even be that much easier.

ThePedroo commented 3 months ago

Most are non-critical, almost none are code execution. The use after free CVE list for 2024 took a while to scroll through tho. Good amount of buffer overflows and such too.

"Use after free, Buffer overflow, Segmentation fault". Well, I don't know about you, but for me all 3 are critical, especially the first 2.

zygiskd isn't doing much ??? Don`t forget that zygiskd runs in your device as long as it is turned on. Every app you open, and its forks, makes libzygisk.so send messages to it. It is more than you think if you come to see the logs.

Rust actually handles strings sanely That is because we are talking about static x dynamic strings. Also, for C, as long as the NULL terminator is there (which unless you explicitly don't, or is using an API that isn't necessarily for strings (char), won't set, but they're obvious), it is safe.

There are like 10 runs per week. GH Actions don't have any limits either. 10 runs per week, and that is because of me being busy with school work and other things that require local testing before commiting (and is extremely time-taking, for like, weeks, months)

Well of course, because using libraries in C is a pain Since when? FrequenC, uses numerous libraries, some by me, and others by third-parties, flawlessly. The crates made a scenario very similar to NPM/Node.js, which projects uses a billion libraries, which will often cause even more instability (not all libraries are good quality).

Can only catch so much in a language never designed for static analysis. You'd be surprised with the capabilities of -fanalyzer, for example. C was never meant for static analyzing, but what made C become so famous is its versatility.

ideally that should be the most hardened part, but it's the biggest part of the code base Not really, that's where I said judging with generic attacks here will end up with wrong statements.

The injected library doesn't have root powers, but it has a really powerful ability: Know its own handle and be able to access zygiskd. By that it allows to hook into things. If libzygisk.so is ever compromised (which is EXTREMELY unlikely), the only consequence will be a detection point. For this to really become a vulnerable spot, it will require zygiskd to also have a vulnerability, and this is even harder since its attack surface is REALLY small (as it only receives integers, strings and fds, which with few proper checks to avoid buffer overflows, it is safe), so it is safe to assume it will always remain safe.

Can assure you good effort will be made to ensure the safety of zygiskd, and I hope that other people can come ensure that ReZygisk remains safe.

33KK commented 3 months ago

zygiskd isn't doing much ??? Don`t forget that zygiskd runs in your device as long as it is turned on. Every app you open, and its forks, makes libzygisk.so send messages to it. It is more than you think if you come to see the logs.

I will not believe that there's any noticeable performance or efficiency difference without some actual evidence, it just makes no sense. We aren't talking about microcontrollers running at 100MHz with 32kb RAM and ROM here, it's just not going to be noticeable in any scenario on a phone, that is assuming the C version even is more efficient or whatever, in reality it's probably more or less identical.

"Use after free, Buffer overflow, Segmentation fault". Well, I don't know about you, but for me all 3 are critical, especially the first 2.

? There's barely any of those on cve.mitre.org for Rust, those that exist are pretty old. I meant that there are a lot of those for C/C++.

reveny commented 3 months ago

zygiskd isn't doing much ??? Don`t forget that zygiskd runs in your device as long as it is turned on. Every app you open, and its forks, makes libzygisk.so send messages to it. It is more than you think if you come to see the logs.

I will not believe that there's any noticeable performance or efficiency difference without some actual evidence, it just makes no sense. We aren't talking about microcontrollers running at 100MHz with 32kb RAM and ROM here, it's just not going to be noticeable in any scenario on a phone, that is assuming the C version even is more efficient or whatever, in reality it's probably more or less identical.

"Use after free, Buffer overflow, Segmentation fault". Well, I don't know about you, but for me all 3 are critical, especially the first 2.

? There's barely any of those on cve.mitre.org for Rust, those that exist are pretty old. I meant that there are a lot of those for C/C++.

TLDR; This repo is using c99 because everyone hates rust and it makes it very difficult to maintain, at least my opinion.

kevin01523 commented 2 months ago

https://www.youtube.com/watch?v=WiPp9YEBV0Q&t=1529s https://www.youtube.com/watch?v=-w4H2-LLVdY