animetosho / rapidyenc

SIMD accelerated yEnc en/decode C library

Mac build issues #1

Closed mnightingale closed 1 year ago

mnightingale commented 1 year ago

I'm having a couple of issues on macOS Sonoma 14.0 on an M2 that I'm hoping you can help with.

First, what seems to be a minor issue with:

https://github.com/animetosho/rapidyenc/blob/78d71c448b86729c21fc116c8d1b51920f8230b7/CMakeLists.txt#L113

In file included from /Users/mnightingale/personal/workspace/rapidyenc/rapidyenc/src/platform.cc:11:
In file included from /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/sys/sysctl.h:83:
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/sys/ucred.h:101:2: error: unknown type name 'u_int'
        u_int   cr_version;             /* structure layout version */
...
(snip same for a few other files)

Removing the line seems to resolve it. At least it builds, but I don't know if there are any implications.


Second issue with Xcode 15

Since updating to Xcode 15 (Apple clang version 15.0.0 (clang-1500.0.40.1)), I get a crash when calling rapidyenc_decode_incremental from Go.

SIGILL: illegal instruction
PC=0x103cc5000 m=0 sigcode=2
signal arrived during cgo execution
instruction bytes: 0x53 0xd9 0x3 0x4f 0x34 0xe7 0x1 0x4f 0x95 0xe4 0x0 0x4f 0x16 0xe4 0x2 0x4f

I'm not sure, but I don't think those instruction bytes are ARM code?
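
One way to sanity-check the bytes is to regroup them into 32-bit little-endian words, which is how AArch64 stores its fixed-width instructions. A minimal Python sketch (the variable names are illustrative and not part of rapidyenc):

```python
import struct

# Instruction bytes copied from the SIGILL report above.
raw = bytes([0x53, 0xd9, 0x03, 0x4f, 0x34, 0xe7, 0x01, 0x4f,
             0x95, 0xe4, 0x00, 0x4f, 0x16, 0xe4, 0x02, 0x4f])

# AArch64 instructions are 32-bit words stored little-endian,
# so unpack the 16 bytes as four unsigned 32-bit integers.
words = list(struct.unpack("<4I", raw))

for w in words:
    print(f"0x{w:08x}")
```

The first word comes out as 0x4f03d953, the same value the disassembly later in this thread shows as undefined at 0x5000.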

Taking a look at the dylib in ghidra:

Error | Bad Instruction | Unable to resolve constructor at 00005000 (flow from 00004ffc) | 00005000 |   | ?? 53h    S

There is a chunk from 0x5000 to 0x56e0 which hasn't decompiled.

Looking at the disassembled code preceding it, which is the function do_decode_simd<>(uchar **param_1,uchar **param_2,ulong param_3,YencDecoderState *param_4)

(image: screenshot of the disassembly)

If I comment out the neon64 decode, it works correctly, with no bad instructions.

#if(IS_ARM64)
#   set(DECODER_NEON_FILE decoder_neon64.cc)
#else()
    set(DECODER_NEON_FILE decoder_neon.cc)
#endif()

At this point I'm kind of lost. I thought maybe it's a missing compiler flag, or maybe something has changed with Apple's new compiler.

Thanks, Mike

animetosho commented 1 year ago

Thanks for reporting.

Implemented a fix for the first issue.

The second issue is weird. Could you get the call stack when it raises the SIGILL?
Just shooting in the dark, could you try replacing _vld1q_u8_x4 on this line with vld4q_u8, and seeing if it still SIGILLs in the same place?

mnightingale commented 1 year ago

I've had another look at this. Building the shared library with -DCMAKE_CXX_FLAGS_RELEASE=-O1 worked, so I determined it must be a compiler/optimisation error:

-O1: works
-O2: some clang error with the linker
-O3 (I think this is the default): compiles, but crashes with bad instructions

Since I started having this problem, 'Command Line Tools for Xcode 15.1 beta' was released, and it's working again :)

Thanks for your help.

animetosho commented 1 year ago

Thanks for the investigation.
Is the original Xcode 15 a beta? It's interesting to note that it's using Clang 15.0.0 - I've understood Clang's x.0.0 designation to signify in-dev code (first stable release is typically x.0.1). I'm generally willing to make workarounds for "stable" release compiler bugs.

mnightingale commented 1 year ago

Xcode 15 is a stable release, and the problem remained on 15.0.1. Unfortunately I'd updated macOS, so 15 is the minimum I can use. Luckily I had a backup of the working shared library, which I'd been using for the last few weeks.

For my use in Go, I'm in the process of switching to a static library, which doesn't appear to have the problem when compiled with 15.0.1.

Shared library builds of 9c6f0b9, built with Xcode 15.0.1 and Xcode 15.1 Beta: librapidyenc.zip

Built with:

rm -rf rapidyenc/build
cmake -S rapidyenc -B rapidyenc/build
cmake --build rapidyenc/build --target rapidyenc_shared -j8

Maybe you'll find the difference interesting but I wouldn't worry about a workaround for it.

animetosho commented 1 year ago

That comparison was interesting. I initially thought it'd be useless, but it seems to be the same compiler in both, so it generates almost identical code:

-    5000:  4f03d953    .inst   0x4f03d953 ; undefined
+    5000:  4f00e553    movi    v19.16b, #0xa
-    5400:  3dc00008    ldr q8, [x0]
+    5400:  3dc3d808    ldr q8, [x0, #3936]

Given that the address is always a multiple of 1KB, and only some bits were changed, I'm thinking this is more of a linker bug.
So yeah, not much I can do about it unfortunately.

Thanks for the help!
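
The "multiple of 1KB" and "only some bits were changed" observations can be checked by XORing the two versions of each word from the diff above. This is a rough Python sketch, and the helper name `changed_bits` is made up for illustration:

```python
# (address, word on the '-' side, word on the '+' side),
# copied from the disassembly diff above.
pairs = [
    (0x5000, 0x4f03d953, 0x4f00e553),
    (0x5400, 0x3dc00008, 0x3dc3d808),
]

def changed_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(32) if (diff >> i) & 1]

for addr, minus, plus in pairs:
    print(f"{addr:#x}: multiple of 1 KiB = {addr % 1024 == 0}, "
          f"xor = {minus ^ plus:#010x}, bits = {changed_bits(minus, plus)}")
```

In both words the differing bits fall between bit 10 and bit 17, and both addresses are multiples of 1024. That matches the observation above, though it does not by itself show where the fault lies.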