onekey-sec / unblob

Extract files from any kind of container formats
https://unblob.org

Fix #724. build: Add darwin aarch64 to supported architectures #743

Closed ljrk0 closed 7 months ago

ljrk0 commented 7 months ago

It seems like #741 only added M1 to the CI build but Nix still claimed the architecture as unsupported (cf. #724). This adds macOS on ARM to the Nix supported architectures (confirmed locally).

qkaiser commented 7 months ago

Good catch! That's why it was so easy to add support for M1 in #741: the build did not run 😒

There's an issue when running the integration test suite on Apple M1. It's related to simg2img handling Android sparse images. The following exception can be seen in the logs:

unblob-tests> 2024-02-04 13:35.41 [debug ] Running extract command command=simg2img /private/tmp/nix-build-unblob-tests-24.1.22.drv-0/source/tests/integration/filesystem/android/sparse/input/fruits.sparse /private/tmp/nix-build-unblob-tests-24.1.22.drv-0/pytest-of-_nixbld1/pytest-0/test_all_handlers_filesystem_a0/fruits.sparse_extract/raw.image pid=40602
unblob-tests> 2024-02-04 13:35.41 [error ] Extract command failed command=simg2img /private/tmp/nix-build-unblob-tests-24.1.22.drv-0/source/tests/integration/filesystem/android/sparse/input/fruits.sparse /private/tmp/nix-build-unblob-tests-24.1.22.drv-0/pytest-of-_nixbld1/pytest-0/test_all_handlers_filesystem_a0/fruits.sparse_extract/raw.image exit_code=0xff pid=40602 severity=<Severity.WARNING: 'WARNING'> stderr=Cannot write output file

I'll have a look but if you have any information that can help us get it fixed let us know :)

qkaiser commented 7 months ago

@ljrk0 are all tests passing on your host? Just want to know whether the problem comes from simg2img or from something wrong in the runner.

ljrk0 commented 7 months ago

Hey, @qkaiser ah, well, that explains it I guess :D

I just reproduced locally with the same problem:

$ unblob -vvv __input__/fruits.sparse
2024-02-04 14:14.18 [debug    ] Logging configured             extract_root=. pid=72624 vebosity_level=3
2024-02-04 14:14.18 [info     ] Start processing file          file=__input__/fruits.sparse pid=72624
2024-02-04 14:14.18 [debug    ] Setting up signal handlers     original_signal_handlers={<Signals.SIGINT: 2>: <built-in function default_int_handler>, <Signals.SIGTERM: 15>: <Handlers.SIG_DFL: 0>} pid=72624
2024-02-04 14:14.18 [debug    ] Detected file-magic            magic=Android sparse image, version: 1.0, Total of 128 4096-byte output blocks in 7 input chunks.\012- data path=__input__/fruits.sparse pid=72625
2024-02-04 14:14.18 [debug    ] Processing file                path=__input__/fruits.sparse pid=72625 size=0xa07c
2024-02-04 14:14.18 [debug    ] Calculating chunk for pattern match handler=sparse pid=72625 real_offset=0x0 start_offset=0x0
2024-02-04 14:14.18 [debug    ] Header parsed                  header=
00000000  3a ff 26 ed 01 00 00 00  1c 00 0c 00 00 10 00 00   :.&.............
00000010  80 00 00 00 07 00 00 00  00 00 00 00               ............

struct sparse_header:
- magic: 0xed26ff3a
- major_version: 0x1
- minor_version: 0x0
- file_hdr_sz: 0x1c
- chunk_hdr_sz: 0xc
- blk_sz: 0x1000
- total_blks: 0x80
- total_chunks: 0x7
- image_checksum: 0x0 pid=72625
2024-02-04 14:14.18 [debug    ] Found valid chunk              chunk=0x0-0xa07c handler=sparse pid=72625
2024-02-04 14:14.18 [debug    ] Ended searching for chunks     all_chunks=[0x0-0xa07c] pid=72625
2024-02-04 14:14.18 [debug    ] Removed inner chunks           outer_chunk_count=1 pid=72625 removed_inner_chunk_count=0
2024-02-04 14:14.18 [debug    ] Running extract command        command=simg2img /Users/janis.koenig/Documents/Development/unblob/tests/integration/filesystem/android/sparse/__input__/fruits.sparse /Users/janis.koenig/Documents/Development/unblob/tests/integration/filesystem/android/sparse/__input__/fruits.sparse_extract/raw.image pid=72625
2024-02-04 14:14.18 [error    ] Extract command failed         command=simg2img /Users/janis.koenig/Documents/Development/unblob/tests/integration/filesystem/android/sparse/__input__/fruits.sparse /Users/janis.koenig/Documents/Development/unblob/tests/integration/filesystem/android/sparse/__input__/fruits.sparse_extract/raw.image exit_code=0xff pid=72625 severity=<Severity.WARNING: 'WARNING'> stderr=Cannot write output file
 stdout=
2024-02-04 14:14.18 [debug    ] Processing directory           path=__input__/fruits.sparse_extract pid=72626
2024-02-04 14:14.18 [debug    ] Detected file-magic            magic=data path=__input__/fruits.sparse_extract/raw.image pid=72627
2024-02-04 14:14.18 [debug    ] Processing file                path=__input__/fruits.sparse_extract/raw.image pid=72627 size=0x80000
2024-02-04 14:14.18 [debug    ] Ended searching for chunks     all_chunks=[] pid=72627

ljrk0 commented 7 months ago

I was able to debug the issue: mmap requires page-aligned offsets, and macOS on AArch64 uses 16 KiB pages instead of 4 KiB, whereas sparse implicitly assumes the latter. I'd argue one can skip this test for now while I work with upstream :)
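
To make the arithmetic concrete, here is a minimal Python sketch of the masking involved. It is not simg2img's actual code: `align_down` and the sample offset `0x5000` are hypothetical, chosen only to show how a hard-coded 4096 mask can produce an offset that is not a multiple of a 16 KiB page.

```python
def align_down(offset: int, page_size: int) -> int:
    """Round offset down to a multiple of page_size (must be a power of two)."""
    return offset & ~(page_size - 1)


PAGE_4K = 4096    # page size assumed by the hard-coded mask
PAGE_16K = 16384  # page size on macOS/AArch64, per the comment above

offset = 0x5000  # hypothetical offset that happens to be 4 KiB-aligned

hardcoded = align_down(offset, PAGE_4K)  # result of masking with 4096 - 1
correct = align_down(offset, PAGE_16K)   # result of masking with the real page size

# An offset passed to mmap has to be a multiple of the system page size.
print(hex(hardcoded), hardcoded % PAGE_16K == 0)  # 0x5000 False
print(hex(correct), correct % PAGE_16K == 0)      # 0x4000 True
```

On a system with 4 KiB pages both results would be acceptable, which would explain why the problem only shows up on hosts with larger pages.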

ljrk0 commented 7 months ago

... if I had scrolled through the open PRs on simg2img instead of only the issues, I'd have found the fix there already: https://github.com/anestisb/android-simg2img/pull/38

I can confirm that obtaining the page size via getpagesize() or sysconf() fixes the issue.
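
As a rough illustration of the same idea outside of C, the Python sketch below queries the page size at runtime and uses it to align an mmap offset. It assumes a Unix-like host where `os.sysconf("SC_PAGESIZE")` is available; `map_at` and the temporary file contents are hypothetical and unrelated to simg2img's implementation.

```python
import mmap
import os
import tempfile

# Ask the OS for the page size instead of assuming 4096
# (Python counterpart of sysconf(_SC_PAGESIZE) in C).
page_size = os.sysconf("SC_PAGESIZE")


def map_at(fd: int, offset: int, length: int):
    """Map `length` bytes starting at `offset`, aligning the mapping to a page."""
    aligned = offset & ~(page_size - 1)  # round down to a page boundary
    delta = offset - aligned             # where the wanted bytes start in the mapping
    mapping = mmap.mmap(fd, delta + length, access=mmap.ACCESS_READ, offset=aligned)
    return mapping, delta


with tempfile.TemporaryFile() as f:
    f.write(bytes(3 * page_size))  # three pages of zeros
    f.seek(page_size + 100)        # deliberately not page-aligned
    f.write(b"fruits")
    f.flush()

    mapping, delta = map_at(f.fileno(), page_size + 100, 6)
    data = mapping[delta:delta + 6]
    mapping.close()

print(data)
```

Because the alignment is derived from the queried page size, the same code works whether the host uses 4 KiB or 16 KiB pages.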

qkaiser commented 7 months ago

@ljrk0 thanks for looking into this, I was just about to launch an M1 machine on Scaleway. I would rather fix simg2img locally until it's fixed upstream than skip the test. You can do it this way:

diff --git a/overlay.nix b/overlay.nix
index 52e7f81..9e1f71c 100644
--- a/overlay.nix
+++ b/overlay.nix
@@ -12,6 +12,15 @@ final: prev:
       nativeCheckInputs = (super.nativeCheckInputs or [ ]) ++ [ final.which ];
     });

+  simg2img = prev.simg2img.overrideAttrs (super: {
+    postPatch = ''
+      substituteInPlace output_file.cpp \
+        --replace-fail \
+        'aligned_offset = offset & ~(4096 - 1);' \
+        'aligned_offset = offset & ~(sysconf(_SC_PAGESIZE) - 1);'
+    '';
+  });
+
   # Own package updated independently of nixpkgs
   jefferson = final.callPackage ./nix/jefferson { };

ljrk0 commented 7 months ago

Awesome -- I still have to learn a lot about Nix :)

I just wanted to test this locally, but Nix now complains that I have it installed already:

error: An existing package already provides the following file:

         /nix/store/2kx2fjh2d9bgnp80f6iz3j240kgl8bcg-python3.11-unblob-24.1.22/bin/unblob

       This is the conflicting file from the new package:

         /nix/store/mxxxc8ssv7l71xgpnaf7gs8xzdrqk9ws-python3.11-unblob-24.1.22/bin/unblob

       To remove the existing package:

         nix profile remove git+file:///Users/janis.koenig/Documents/Development/unblob#packages.aarch64-darwin.default

Unfortunately the suggested command fails:

$ nix profile remove git+file:///Users/janis.koenig/Documents/Development/unblob#packages.aarch64-darwin.default
warning: 'git+file:///Users/janis.koenig/Documents/Development/unblob#packages.aarch64-darwin.default' does not match any packages
warning: Use 'nix profile list' to see the current profile.

And the profile list being:

$ nix profile list
Index:              0
Flake attribute:    packages.aarch64-darwin.default
Original flake URL: git+file:///Users/janis.koenig/Documents/Development/unblob
Locked flake URL:   git+file:///Users/janis.koenig/Documents/Development/unblob
Store paths:        /nix/store/2kx2fjh2d9bgnp80f6iz3j240kgl8bcg-python3.11-unblob-24.1.22

Index:              1
Store paths:        /nix/store/ij6fap3skb3fjdfc6ry1whddj85ll9v6-home-manager-path

I'm a bit at a loss here, but in the meantime I'll update the PR and we can see what the CI says!

ljrk0 commented 7 months ago

At least this test seems to run as expected now:

$ unblob -vvv __input__/fruits.sparse
2024-02-04 15:51.00 [debug    ] Logging configured             extract_root=. pid=81179 vebosity_level=3
2024-02-04 15:51.00 [info     ] Start processing file          file=__input__/fruits.sparse pid=81179
2024-02-04 15:51.00 [debug    ] Setting up signal handlers     original_signal_handlers={<Signals.SIGINT: 2>: <built-in function default_int_handler>, <Signals.SIGTERM: 15>: <Handlers.SIG_DFL: 0>} pid=81179
2024-02-04 15:51.00 [debug    ] Detected file-magic            magic=Android sparse image, version: 1.0, Total of 128 4096-byte output blocks in 7 input chunks.\012- data path=__input__/fruits.sparse pid=81180
2024-02-04 15:51.00 [debug    ] Processing file                path=__input__/fruits.sparse pid=81180 size=0xa07c
2024-02-04 15:51.00 [debug    ] Calculating chunk for pattern match handler=sparse pid=81180 real_offset=0x0 start_offset=0x0
2024-02-04 15:51.00 [debug    ] Header parsed                  header=
00000000  3a ff 26 ed 01 00 00 00  1c 00 0c 00 00 10 00 00   :.&.............
00000010  80 00 00 00 07 00 00 00  00 00 00 00               ............

struct sparse_header:
- magic: 0xed26ff3a
- major_version: 0x1
- minor_version: 0x0
- file_hdr_sz: 0x1c
- chunk_hdr_sz: 0xc
- blk_sz: 0x1000
- total_blks: 0x80
- total_chunks: 0x7
- image_checksum: 0x0 pid=81180
2024-02-04 15:51.00 [debug    ] Found valid chunk              chunk=0x0-0xa07c handler=sparse pid=81180
2024-02-04 15:51.00 [debug    ] Ended searching for chunks     all_chunks=[0x0-0xa07c] pid=81180
2024-02-04 15:51.00 [debug    ] Removed inner chunks           outer_chunk_count=1 pid=81180 removed_inner_chunk_count=0
2024-02-04 15:51.00 [debug    ] Running extract command        command=simg2img /Users/janis.koenig/Documents/Development/unblob/tests/integration/filesystem/android/sparse/__input__/fruits.sparse /Users/janis.koenig/Documents/Development/unblob/tests/integration/filesystem/android/sparse/__input__/fruits.sparse_extract/raw.image pid=81180
2024-02-04 15:51.00 [debug    ] Processing directory           path=__input__/fruits.sparse_extract pid=81181
2024-02-04 15:51.00 [debug    ] Detected file-magic            magic=data path=__input__/fruits.sparse_extract/raw.image pid=81182
2024-02-04 15:51.00 [debug    ] Processing file                path=__input__/fruits.sparse_extract/raw.image pid=81182 size=0x80000
2024-02-04 15:51.00 [debug    ] Calculating chunk for pattern match handler=lzma pid=81182 real_offset=0x9a77 start_offset=0x9a77
2024-02-04 15:51.00 [debug    ] File format is invalid         handler=lzma pid=81182
Traceback (most recent call last):
  File "/nix/store/mxxxc8ssv7l71xgpnaf7gs8xzdrqk9ws-python3.11-unblob-24.1.22/lib/python3.11/site-packages/unblob/finder.py", line 35, in _calculate_chunk
    return handler.calculate_chunk(file, real_offset)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/mxxxc8ssv7l71xgpnaf7gs8xzdrqk9ws-python3.11-unblob-24.1.22/lib/python3.11/site-packages/unblob/handlers/compression/lzma.py", line 70, in calculate_chunk
    raise InvalidInputFormat
unblob.file_utils.InvalidInputFormat
2024-02-04 15:51.00 [debug    ] Ended searching for chunks     all_chunks=[] pid=81182
$ shasum __input__/fruits.sparse_extract/raw.image 
324598091c0b4844adba53d8d259ee42e978985c  __input__/fruits.sparse_extract/raw.image
$ shasum __output__/fruits.sparse_extract/raw.image 
324598091c0b4844adba53d8d259ee42e978985c  __output__/fruits.sparse_extract/raw.image

Let's see whether the CI swallows it too :)