open-obfuscator / o-mvll

O-MVLL is an LLVM-based obfuscator for native code (Android & iOS)
https://obfuscator.re/omvll
Apache License 2.0

Clang frontend rework #22

Closed weliveindetail closed 12 months ago

weliveindetail commented 12 months ago

Invoke host clang to compile C++ to IR instead of shipping a frontend in the plugin

There are multiple motivations for this change: (1) Less code and complexity in the plugin: binary size goes from 70M down to 30M. (2) It avoids IR version differences between the host clang and a shipped frontend. (3) The host clang configuration is easier to match, i.e. we can just pass the same arguments.

Points (2) and (3) guarantee compatibility between the injected IR and the host IR module. This is critical because LLVM IR gives no such guarantees between different toolchain versions or forks. We can live with that when we target Android, because we know and can reproduce the exact target compiler version from Google's Android toolchain repository.

iOS is a different story, because our host compiler (the official AppleClang) is known to carry a surprising amount of internal modifications on top of its public upstream. While we don't know the actual patches, we can measure that the difference is huge by comparing the ABI of the two binaries. This suggests that our previously shipped clang frontend from apple/llvm-project may generate different IR than the host compiler. Failures would be arbitrary and there is no good way to detect them. It could be errors from middle-end passes, backend crashes, silent miscompilation or it may also just work. We wouldn't know.
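As a rough illustration of the approach described above, the following Python sketch builds the kind of command line that asks the host clang to emit textual LLVM IR. It is not the plugin's actual code: the helper name `host_clang_ir_command` and the file names are hypothetical, and the flag layout simply mirrors the `clang -S -emit-llvm` invocations visible in the test log later in this thread.

```python
import shlex
from typing import List, Optional


def host_clang_ir_command(clang: str, source: str, output: str,
                          triple: Optional[str] = None,
                          extra_args: Optional[List[str]] = None) -> List[str]:
    """Build an argv list asking the host clang to emit textual LLVM IR.

    Hypothetical helper for illustration only; the real plugin may
    assemble its arguments differently.
    """
    cmd = [clang, "-S", "-emit-llvm"]
    if triple:
        # Pass the same target as the host compilation so that the
        # injected IR matches the host IR module.
        cmd += ["-target", triple]
    cmd += list(extra_args or [])
    cmd += ["-o", output, source]
    return cmd


# Example paths are placeholders, not files from the repository.
cmd = host_clang_ir_command(
    "/test-deps/bin/clang", "snippet.cpp", "snippet.ll",
    triple="aarch64-unknown-linux-android", extra_args=["-std=c++17"])
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`. Because the same compiler binary produces both the injected IR and the host module, no version check between the two is needed.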

weliveindetail commented 12 months ago

There is some polishing to do and I have to fix a test failure on Linux. However, what do you think about the approach in general?

romainthomas commented 12 months ago

I love this design idea! (I didn't look at all the changes though)

antoniofrighetto commented 12 months ago

Minor: the changes in commit Polishing after Clang frontend rework related to StringEncoding can be squashed into the previous commit, IMHO we don't need to have another commit for fixes of newly-added changes in the PR.

weliveindetail commented 12 months ago

Minor: the changes in commit Polishing after Clang frontend rework related to StringEncoding can be squashed into the previous commit, IMHO we don't need to have another commit for fixes of newly-added changes in the PR.

Yes and no. I'd like to keep the patch with the actual change as small as possible. If we ever end up here when bisecting for a regression, the patch we have to understand is smaller. And that's good.

That said, I didn't spend the time to sort my polishing into formatting fixes and code reduction. The commit with the actual change made the code simpler in a few spots, and I used the second commit to remove some code. It's probably not worth the effort to sort this out. Otherwise I would agree with your request.

weliveindetail commented 12 months ago

There is some polishing to do and I have to fix a test failure on Linux. However, what do you think about the approach in general?

In fact, this uncovered an issue in my proposed environment for regression tests, which is under review in https://github.com/open-obfuscator/o-mvll/pull/15. The test passes for me locally, and the difference from the build bot's output above is this:

--- test-dumps-local.log    2023-07-12 11:45:23
+++ test-dumps-github.log   2023-07-12 11:45:23
@@ -1,6 +1,6 @@
 Running tests with: /test-deps/bin/clang
 Testing plugin file: /o-mvll/src/build_ndk_r25/libOMVLL.so
-Available features are: {'system-linux', 'shell', 'host-platform-linux', 'host-arch-x86', 'aarch64-registered-target', 'bpf-registered-target', 'x86-registered-target', 'arm-registered-target'}
+Available features are: {'system-linux', 'shell', 'host-platform-linux', 'host-arch-x86', 'target=None', 'aarch64-registered-target', 'x86-registered-target'}
 OMVLL_PYTHONPATH: /Python-3.10.7/Lib
 -- Testing: 9 tests, 2 workers --
 PASS: O-MVLL Tests :: core/plugin/find-omvll-config-cwd.c (1 of 9)
@@ -11,7 +11,21 @@
 PASS: O-MVLL Tests :: passes/arithmetic/xor-x86_64-darwin.c (6 of 9)
 PASS: O-MVLL Tests :: passes/flattening/basic-native.c (7 of 9)
 PASS: O-MVLL Tests :: passes/flattening/basic-aarch64.c (8 of 9)
-PASS: O-MVLL Tests :: passes/strings-encoding/basic-aarch64.cpp (9 of 9)
+...
++ : 'RUN: at line 22'
++ env OMVLL_CONFIG=/o-mvll/src/test/passes/strings-encoding/config_replace.py /test-deps/bin/clang++ -fpass-plugin=/o-mvll/src/build_ndk_r25/libOMVLL.so -target aarch64-linux-android -fno-legacy-pass-manager -O1 -c /o-mvll/src/test/passes/strings-encoding/basic-aarch64.cpp -o /dev/null
+/test-deps/bin/clang -S -emit-llvm -std=c++17 -o /tmp/lit-tmp-gtl5vm7b/omvll-x86_64-unknown-linux-gnu-0e7f2a.cpp.ll /tmp/lit-tmp-gtl5vm7b/omvll-x86_64-unknown-linux-gnu-0e7f2a.cpp
+/test-deps/bin/clang -S -emit-llvm -target aarch64-unknown-linux-android -std=c++17 -o /tmp/lit-tmp-gtl5vm7b/omvll-aarch64-unknown-linux-android-4cc414.cpp.ll /tmp/lit-tmp-gtl5vm7b/omvll-aarch64-unknown-linux-android-4cc414.cpp
+No available targets are compatible with triple "x86_64-unknown-linux-gnu"

-Testing Time: 0.37s
-  Passed: 9
\ No newline at end of file
+--
+
+********************
+********************
+Failed Tests (1):
+  O-MVLL Tests :: passes/strings-encoding/basic-aarch64.cpp
+
+
+Testing Time: 0.92s
+  Passed: 8
+  Failed: 1

I will make a note in the respective PR. It's unrelated to this review.
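To make the discrepancy in the "Available features" lines easier to see, here is a small Python sketch that compares the two feature sets copied verbatim from the diff above. The variable names are only illustrative.

```python
# Feature sets copied from the two "Available features are:" log lines.
local = {'system-linux', 'shell', 'host-platform-linux', 'host-arch-x86',
         'aarch64-registered-target', 'bpf-registered-target',
         'x86-registered-target', 'arm-registered-target'}
github = {'system-linux', 'shell', 'host-platform-linux', 'host-arch-x86',
          'target=None', 'aarch64-registered-target',
          'x86-registered-target'}

# Features that appear in only one of the two environments.
only_local = sorted(local - github)
only_github = sorted(github - local)

print("only local: ", only_local)
print("only github:", only_github)
```

The local run reports `arm-registered-target` and `bpf-registered-target`, which the build bot lacks, while the build bot additionally reports `target=None`.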