android / ndk

The Android Native Development Kit
debug rcdailey's zlib unwind issues #457

Closed DanAlbert closed 7 years ago

DanAlbert commented 7 years ago

@rcdailey

Forking from https://github.com/android-ndk/ndk/issues/230#issuecomment-315478458

Not entirely clear what's going on just yet. It certainly looks like the unwinders are crossing the streams, but that shouldn't be happening with anything built with a modern NDK.

Could you check and make sure that all the _Unwind_* symbols in your libraries are hidden?

rcdailey commented 7 years ago

What indicates if those symbols are "hidden"? I can grep all my libs for that keyword... but not sure what to look for besides that.

DanAlbert commented 7 years ago

readelf -sW yourlib.so | grep _Unwind will tell you. readelf is included in the NDK under toolchains/arm-linux-androideabi-4.9/prebuilt/$OS/bin/arm-linux-androideabi-readelf (it's actually a cross-arch executable, there's just one per binutils install) in case you don't have it or are on Windows or something.

Here are some examples of what various symbols should look like:

extern "C" {
__attribute__((visibility("hidden"))) void myfunc_hidden() {}
static void myfunc_static() {}
extern void myfunc_extern();
void myfunc_public() { myfunc_extern(); }
}

$ readelf -sW foo/libs/armeabi-v7a/libfoo.so | grep myfunc_
     3: 000006ab     4 FUNC    GLOBAL DEFAULT   11 myfunc_public
     4: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND myfunc_extern

The column that says DEFAULT might also say HIDDEN, or the column that says GLOBAL might also say LOCAL. HIDDEN, LOCAL, or just not at all present are all forms of hidden.
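As a rough illustration of that rule, the check can be sketched in C++. This is a minimal sketch, assuming the usual `readelf -sW` column order (Num, Value, Size, Type, Bind, Vis, Ndx, Name); `SymbolRow`, `parse_symbol_row`, and `is_hidden` are hypothetical names and not part of any NDK tool.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper types (not part of the NDK).
struct SymbolRow {
    std::string bind;  // e.g. GLOBAL, LOCAL, WEAK
    std::string vis;   // e.g. DEFAULT, HIDDEN
    std::string ndx;   // section index, or UND for undefined references
    std::string name;
};

// Parses one row of `readelf -sW` output. Returns false for anything that
// does not look like a symbol row (headers, blank lines, and so on).
bool parse_symbol_row(const std::string& line, SymbolRow& row) {
    std::istringstream in(line);
    std::string num, value, size, type;
    if (!(in >> num >> value >> size >> type >> row.bind >> row.vis >> row.ndx >> row.name)) {
        return false;
    }
    // Symbol rows start with an index such as "3:"; the header row does not.
    return std::isdigit(static_cast<unsigned char>(num[0])) != 0;
}

// LOCAL binding or HIDDEN visibility both count as hidden. A symbol that is
// absent from the table never reaches this function at all.
bool is_hidden(const SymbolRow& row) {
    return row.bind == "LOCAL" || row.vis == "HIDDEN";
}
```

Fed the `myfunc_public` row from the output above, `is_hidden` returns false, which matches the GLOBAL DEFAULT reading of that line.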

alexcohn commented 7 years ago

@rcdailey Please check: your APK should not contain libzlib.so packed in any form. NDK build will use the system version of the library. The binary that is packed with NDK is intended to be used as a link stub only, I doubt if it has ever been tested.

Now, if the system version is bad (which happens on private ROMs), there is an easy workaround. You can add the official sources to your project and use your version of zlib. The size of the library is quite small, and it uses no private APIs that could make the system version better.

Actually, there are no reasons why zlib should be packaged in NDK, except historical.

rcdailey commented 7 years ago

@alexcohn I am in fact packaging libz.so into my APK. I copy it from the NDK. It just makes sense to me to package the exact binary I linked against. I'll try without it though...

Do I still need to load the libz.so library in Java even if I don't package it?

EDIT: I also package libc++_shared.so with my APK (I copy that from the NDK as well). I assume that this is still required? I don't recall seeing STL libraries on the device.

alexcohn commented 7 years ago

Well, this pretty much explains the problem that you encountered. Just like you don't package libc.so or libm.so with your APK, libz.so is part of the public NDK libraries list.

I agree that the distinction between libz.so and libc++_shared.so is not clear for users of NDK. In a parallel thread I even suggested providing a mechanism for sharing libc++_shared.so across applications. It's under 1 MB, but the added benefit would be that any fixes (including security fixes) become available to everybody.

But as of today, you should package libc++_shared.so and should not package libz.so.

stephenhines commented 7 years ago

@alexcohn: There really is no way to "share" libc++_shared.so though. It is intentionally meant to be unstable, so that the implementation of it can be improved over time (and also ABI bugs fixed, etc.). The real trouble here is that C++ was never intended to be used for dynamic shared libraries. In the C++ model, everything is meant to be compiled at the same time, with the same components, rather than having a path to do partial upgrades. It is great that partial updates can work for the vast majority of developer scenarios, but upgrades of the STL really will never be possible based on the way that the C++ standard works today.

rcdailey commented 7 years ago

Thanks @alexcohn. Unfortunately I won't be able to retest my scenario until Monday when I'm in the office. However I'll let you know the results then!

alexcohn commented 7 years ago

@stephenhines: Did I ever say sharing libc++ is easy? But never say never…

rcdailey commented 7 years ago

@DanAlbert: I ran the readelf command and here is the output:

$ "E:\android\ndk_72\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-readelf.exe" -sW libzApp.so | grep _Unwind
     4: 00000000     0 FUNC    GLOBAL DEFAULT  UND _Unwind_Resume
2596963: 00000000     0 FUNC    GLOBAL DEFAULT  UND _Unwind_Resume

There are 2 and neither are hidden based on your description. If this is problematic, what is the next step? Should I run readelf on my static libs too or something, to narrow it down to a specific third party library?

rcdailey commented 7 years ago

Also I removed libz.so from my APK and I'm loading it from /system/lib now on device, but I still get the segfault:

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Android/ziosk/ziosk:4.2.2/JDQ39E/dev.bsp.BSP-6.3.3.14.1706091617:eng/test-keys'
Revision: '6'
pid: 4945, tid: 4958, name: ttm.zPayService  >>> com.ttm.zPayService <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000016
    r0 00000002  r1 5ee488d4  r2 00000000  r3 00000000
    r4 00000002  r5 5ee4892c  r6 5bd7104c  r7 5ee48a00
    r8 5bd71050  r9 5bd71054  sl 5bd71060  fp 00000000
    ip 5ec31a94  sp 5ee488c0  lr 401ea89c  pc 401ea604  cpsr 80000050
    d0  0000000000000000  d1  0000000000000000
    d2  0000000000000000  d3  0000000000000000
    d4  6620535043206568  d5  656d79617020726f
    d6  000000127420746e  d7  40322189374bc6a8
    d8  0000000000000000  d9  0000000000000000
    d10 0000000000000000  d11 0000000000000000
    d12 0000000000000000  d13 0000000000000000
    d14 0000000000000000  d15 0000000000000000
    d16 697a2f617461642f  d17 746e6f632f6b736f
    d18 6977206e69676562  d19 6873616c73206874
    d20 746f6e20646e6120  d21 74697720646e6520
    d22 49202e656e6f2068  d23 646e657070612073
    d24 0067006700670067  d25 0067006700670067
    d26 0067006700670067  d27 0067006700670067
    d28 0100010001000100  d29 0100010001000100
    d30 0000000100000001  d31 0000000100000001
    scr 60000010

backtrace:
    #00  pc 00010604  /system/lib/libz.so (__gnu_Unwind_Resume+8)
    #01  pc 00010898  /system/lib/libz.so (_Unwind_Resume+20)

stack:
         5ee48880  5ee48c34  
         5ee48884  00000000  
         5ee48888  400588c8  
         5ee4888c  5ee488e4  
         5ee48890  5ee488c0  
         5ee48894  5e96fa2f  /data/app-lib/com.ttm.zapp-1/libzPayService.so (boost::shared_ptr<boost::filesystem::filesystem_error::m_imp>::shared_ptr<boost::filesystem::filesystem_error::m_imp>(boost::filesystem::filesystem_error::m_imp*)+54)
         5ee48898  5ee48c34  
         5ee4889c  5ee48c34  
         5ee488a0  5ee48c34  
         5ee488a4  59de38d8  
         5ee488a8  5ee48c34  
         5ee488ac  5ee48c34  
         5ee488b0  5ee48c34  
         5ee488b4  5ee48970  
         5ee488b8  df0027ad  
         5ee488bc  00000000  
    #00  5ee488c0  00000000  
         5ee488c4  40058850  
         5ee488c8  5ee4892c  
         5ee488cc  401ea89c  /system/lib/libz.so (_Unwind_Resume+24)
    #01  5ee488d0  00000000  
         5ee488d4  00000000  
         5ee488d8  00000002  
         5ee488dc  5ee48970  
         5ee488e0  00000000  
         5ee488e4  5e96b80b  /data/app-lib/com.ttm.zapp-1/libzPayService.so ((anonymous namespace)::error(int, boost::filesystem::path const&, boost::system::error_code*, char const*)+222)
         5ee488e8  40058850  
         5ee488ec  5ee4892c  
         5ee488f0  5bd7104c  
         5ee488f4  5ee48a00  
         5ee488f8  5bd71050  
         5ee488fc  5bd71054  
         5ee48900  5bd71060  
         5ee48904  00000000  
         5ee48908  5ec31a94  /data/app-lib/com.ttm.zapp-1/libzPayService.so
         5ee4890c  5ee48918  

It's consistently coming from boost::filesystem but not sure why... never had this problem when I was using GNU STL + Clang.

rcdailey commented 7 years ago

So I did a grep of all my prebuilt *.a and *.so files for armeabi-v7a. The last time these were built it was with GNU STL + GCC, I am currently using LLVM STL + Clang. I did not rebuild libraries that did not yield linker errors, since I assumed that would catch any ABI issues. But maybe there's more to it?

Some libs that are showing _Unwind_Resume in their binary data resulting from the grep:

If they show up here, would it cause problems? Do I need to rebuild them even if there are no linker issues?

alexcohn commented 7 years ago

Maybe start with the filesystem_error you are receiving from boost? What is the root cause of it?

DanAlbert commented 7 years ago

I am in fact packaging libz.so into my APK. I copy it from the NDK. It just makes sense to me to package the exact binary I linked against.

Just like you don't package libc.so or libm.so with your APK, libz.so is part of the public NDK libraries list.

Exactly. The libz.so in the NDK is just a stub. Every function in it is void foo() {}. We maintain ABI compatibility for NDK libraries in the system, which is why it's safe to do this.

EDIT: I also package libc++_shared.so with my APK (I copy that from the NDK as well). I assume that this is still required? I don't recall seeing STL libraries on the device.

Yeah, this is correct. The C++ STLs are not ABI stable, so we ship real libraries in the NDK for you to package in your app.

I agree that the distinction between libz.so and libc++_shared.so is not clear for users of NDK.

You can determine which category a library falls into based on its location in the NDK. Anything in platforms/ or sysroot/ is a stub. If it's outside that directory (STLs are in sources/cxx-stl/...), it's a real library and should be shipped with your app.

I think I'm going to be rewriting the C++ Libraries doc soon to account for the fact that we're advocating for people to switch to libc++ starting with r16. I'll make sure I mention this.

So I did a grep of all my prebuilt .a and .so files for armeabi-v7a. The last time these were built it was with GNU STL + GCC, I am currently using LLVM STL + Clang.

This is definitely part of the problem. You can't reliably mix STLs in the same app (there are some ways to do it that aren't exactly correct if you're very careful, but it's best to just avoid it). Should also be noted that you'll need to use the shared version of the STL (you mention above that you're using c++_shared, so you're fine, but just a note that you shouldn't switch to c++_static). This is the case whenever you have more than one shared library (the actual conditions are a bit more complex, but that's a good rule of thumb).

Any chance those libraries were also built with an old version of the NDK? There were definitely some problems with the way we linked the unwinder until r12 or r13.

If rebuilding the world still has those unwind symbols left undefined or public, then you're probably being hit by https://github.com/android-ndk/ndk/issues/379.

I did not rebuild libraries that did not yield linker errors, since I assumed that would catch any ABI issues. But maybe there's more to it?

The linker can't catch everything, sadly.

struct mystruct {
    std::string s;
};

void foo(const mystruct&);

If something like the above is used in a library, everything will still link fine even if the two std::string have different mangled names. If the library was built with gnustl and the caller was built with libc++, the std::string will have different layout at each end of the call, and this will lead to some very strange bugs.
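To make the failure mode concrete, here is a minimal sketch. The namespaces `old_stl` and `new_stl` and their toy `string` types are invented stand-ins, not the real gnustl or libc++ layouts, and `library_view` / `caller_view` exist only so both definitions fit in one file. In a real build both sides spell the type plain `mystruct`, so the symbol for `foo(const mystruct&)` is identical and the linker has nothing to object to.

```cpp
#include <cstddef>

// Invented stand-ins for two incompatible string layouts.
// These are NOT the real gnustl or libc++ definitions.
namespace old_stl {
struct string { char* data; };  // a single pointer
}
namespace new_stl {
struct string { char* data; std::size_t size; std::size_t capacity; };
}

// What the prebuilt library saw when it was compiled.
namespace library_view {
struct mystruct { old_stl::string s; };
}

// What the caller sees when compiled against the other STL.
namespace caller_view {
struct mystruct { new_stl::string s; };
}

// The two sides disagree about the size (and member layout) of the object
// passed by reference, even though the function signature looks the same.
constexpr bool layouts_match() {
    return sizeof(library_view::mystruct) == sizeof(caller_view::mystruct);
}
```

Here `layouts_match()` is false, which is the situation the callee would misread at run time.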

rcdailey commented 7 years ago

Thanks for all the great feedback so far. I think I'm going to head down the road of just making sure all our third party libs build together with my normal targets. However, as I'm working towards that goal, I run into new issues each time...

I have ImageMagick building with r15b using the same toolchain settings as my normal targets, and loading that shared lib now says it can't find "floor":

D ZActivity: Activity onCreate
D TTMApplication: Loading library: c
D dalvikvm: No JNI_OnLoad found in /system/lib/libc.so 0x41714cf8, skipping init
D TTMApplication: Loading library: c++_shared
D dalvikvm: Trying to load lib /data/app-lib/com.ttm.zapp-1/libc++_shared.so 0x41714cf8
D dalvikvm: Added shared lib /data/app-lib/com.ttm.zapp-1/libc++_shared.so 0x41714cf8
D dalvikvm: No JNI_OnLoad found in /data/app-lib/com.ttm.zapp-1/libc++_shared.so 0x41714cf8, skipping init
D TTMApplication: Loading library: z
D dalvikvm: No JNI_OnLoad found in /system/lib/libz.so 0x41714cf8, skipping init
D TTMApplication: Loading library: MagickCore
D dalvikvm: Trying to load lib /data/app-lib/com.ttm.zapp-1/libMagickCore.so 0x41714cf8
E dalvikvm: dlopen("/data/app-lib/com.ttm.zapp-1/libMagickCore.so") failed: Cannot load library: soinfo_relocate(linker.cpp:975): cannot locate symbol "floor" referenced by "libMagickCore.so"...
D AndroidRuntime: Shutting down VM
W dalvikvm: threadid=1: thread exiting with uncaught exception (group=0x4120f930)
E AndroidRuntime: FATAL EXCEPTION: main
E AndroidRuntime: java.lang.UnsatisfiedLinkError: Cannot load library: soinfo_relocate(linker.cpp:975): cannot locate symbol "floor" referenced by "libMagickCore.so"...

I tried explicitly doing loadLibrary("c") in Java to make libc.so load (not sure if this is necessary; does libc.so get loaded automatically somewhere?) but that didn't fix the issue with it not finding floor.

Any reason for this? Floor is provided by the C library so it should be finding it... I don't understand.

enh commented 7 years ago

floor (like most of <math.h>) is provided by libm, not libc.

rcdailey commented 7 years ago

Ok so that means I do need to explicitly load libc and libm (which I assume is also under /system/lib). No standard system libraries seem to be loaded automatically for me, and I must do it through java?

rcdailey commented 7 years ago

Looks like loading libm doesn't fix it...

D TTMApplication: Loading library: c
D dalvikvm: No JNI_OnLoad found in /system/lib/libc.so 0x41710888, skipping init
D TTMApplication: Loading library: m
D dalvikvm: No JNI_OnLoad found in /system/lib/libm.so 0x41710888, skipping init
D TTMApplication: Loading library: c++_shared
D dalvikvm: Trying to load lib /data/app-lib/com.ttm.zapp-2/libc++_shared.so 0x41710888
D dalvikvm: Added shared lib /data/app-lib/com.ttm.zapp-2/libc++_shared.so 0x41710888
D dalvikvm: No JNI_OnLoad found in /data/app-lib/com.ttm.zapp-2/libc++_shared.so 0x41710888, skipping init
D TTMApplication: Loading library: z
D dalvikvm: No JNI_OnLoad found in /system/lib/libz.so 0x41710888, skipping init
D TTMApplication: Loading library: MagickCore
D dalvikvm: Trying to load lib /data/app-lib/com.ttm.zapp-2/libMagickCore.so 0x41710888
E dalvikvm: dlopen("/data/app-lib/com.ttm.zapp-2/libMagickCore.so") failed: Cannot load library: soinfo_relocate(linker.cpp:975): cannot locate symbol "floor" referenced by "libMagickCore.so"...
D AndroidRuntime: Shutting down VM
W dalvikvm: threadid=1: thread exiting with uncaught exception (group=0x4120f930)
E AndroidRuntime: FATAL EXCEPTION: main
E AndroidRuntime: java.lang.UnsatisfiedLinkError: Cannot load library: soinfo_relocate(linker.cpp:975): cannot locate symbol "floor" referenced by "libMagickCore.so"...

enh commented 7 years ago

both libc and libm will already have been loaded by the zygote. what OS release and architecture is this on? did you pull the libm off the device and check it actually has a floor symbol?

rcdailey commented 7 years ago

@enh

$ "E:\android\android-ndk-r15b\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-readelf.exe" -sW libm.so | grep floor
    26: 0000e358   352 FUNC    GLOBAL DEFAULT    7 floor
    80: 0000e4b8   196 FUNC    GLOBAL DEFAULT    7 floorf
   119: 0000e580   488 FUNC    GLOBAL DEFAULT    7 floorl

DanAlbert commented 7 years ago

That's not the real libm though. adb pull /system/lib/libm.so and check that one.

rcdailey commented 7 years ago

@DanAlbert That's what I did, I used filezilla to copy it over but it's the same one. Sorry for the confusion.

enh commented 7 years ago

what version of Android is this?

rcdailey commented 7 years ago

Jellybean API 17 is the actual device OS. I set up my NDK minimum to API 15 though.

DanAlbert commented 7 years ago

My bad, I actually read that wrong (I just saw the long path into the NDK, but that was for readelf).

Could you readelf -sW libMagickCore.so | grep -w floor? I'm wondering if you're linking against a more modern libm and there's some symbol versioning stuff going on.

rcdailey commented 7 years ago

@DanAlbert

$ "E:\android\android-ndk-r15b\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-readelf.exe" -sW libMagickCore.so | grep floor
   115: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND floor
183616: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND floor

(sorry forgot the -w to grep; but ran it again with it and the results are the same)

DanAlbert commented 7 years ago

So much for that theory.

DanAlbert commented 7 years ago

Might give https://github.com/KeepSafe/ReLinker a try just to rule out any linker weirdness.

rcdailey commented 7 years ago

I'll take a look at that. In the meantime here is one compile & the final *.so link command line output when I build with ninja -v. Not sure if it will help but it's something...

[325/326] E:\android\ndk_72\toolchains\llvm\prebuilt\windows-x86_64\bin\clang.exe --target=armv7-none-linux-androideabi --gcc-toolchain=E:/android/ndk_72/toolchains/arm-linux-androideabi-4.9/prebuilt/windows-x86_64 --sysroot=E:/android/ndk_72/sysroot -DANDROID -DMAGICKCORE_HDRI_ENABLE=0 -DMAGICKCORE_QUANTUM_DEPTH=8 -DMagickCore_EXPORTS -IE:/android/ndk_72/sources/android/cpufeatures -IE:/android/ndk_72/sources/android/native_app_glue -IE:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/ImageMagick-7.0.5-2 -IE:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/tiff-3.9.7/libtiff -IE:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/jpeg-9b -IE:/code/frontend/source/Core/ThirdParty/libpng/source/jni -isystem E:/android/ndk_72/sysroot/usr/include -isystem E:/android/ndk_72/sysroot/usr/include/arm-linux-androideabi -march=armv7-a -mthumb -mfpu=vfpv3-d16 -mfloat-abi=softfp -funwind-tables -no-canonical-prefixes -D__ANDROID_API__=15 -fexceptions -O2 -g -DNDEBUG -fPIC   -Wno-inconsistent-missing-override -Wno-expansion-to-defined -MD -MT Core/ThirdParty/ImageMagick/source/CMakeFiles/MagickCore.dir/jni/ImageMagick-7.0.5-2/MagickCore/quantum-import.c.o -MF Core\ThirdParty\ImageMagick\source\CMakeFiles\MagickCore.dir\jni\ImageMagick-7.0.5-2\MagickCore\quantum-import.c.o.d -o Core/ThirdParty/ImageMagick/source/CMakeFiles/MagickCore.dir/jni/ImageMagick-7.0.5-2/MagickCore/quantum-import.c.o   -c E:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/ImageMagick-7.0.5-2/MagickCore/quantum-import.c
[326/326] cmd.exe /C "cd . && E:\android\ndk_72\toolchains\llvm\prebuilt\windows-x86_64\bin\clang.exe --target=armv7-none-linux-androideabi --gcc-toolchain=E:/android/ndk_72/toolchains/arm-linux-androideabi-4.9/prebuilt/windows-x86_64 --sysroot=E:/android/ndk_72/platforms/android-15/arch-arm -fPIC -march=armv7-a -mthumb -mfpu=vfpv3-d16 -mfloat-abi=softfp -funwind-tables -no-canonical-prefixes -D__ANDROID_API__=15 -fexceptions -O2 -g -DNDEBUG  -Wl,--fix-cortex-a8 -u ANativeActivity_onCreate -shared -Wl,-soname,libMagickCore.so -o output\bin\libMagickCore.so @CMakeFiles/MagickCore.rsp  && cd ."

enh commented 7 years ago

seems like you're not actually asking to link against libm there. is libm.so in the DT_NEEDEDs for your .so? i don't know whether that makes any difference to the JB dynamic linker, but it's worth checking. readelf -aW | grep NEEDED

rcdailey commented 7 years ago

@enh This is what I get:

$ "E:\android\android-ndk-r15b\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-readelf.exe" -aW libMagickCore.so | grep NEEDED
 0x00000001 (NEEDED)                     Shared library: [libz.so]
 0x00000001 (NEEDED)                     Shared library: [libdl.so]
 0x00000001 (NEEDED)                     Shared library: [libc.so]

Seems libm is missing. I'm not even sure how libc is being added; I don't do anything explicitly to link against standard libraries. This must be a CMake-controlled thing. Thoughts? Is this a CMake issue or something?

EDIT: I just realized I am missing -lm on the command line... I have that for my other targets but not third party, I think that's the problem.
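For anyone who wants to automate that check, here is a minimal sketch in C++. It assumes the line format shown in the readelf output above; `needed_libs` and `depends_on` are hypothetical helper names, not part of any NDK tool.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: pulls library names out of readelf lines such as
//   0x00000001 (NEEDED)   Shared library: [libz.so]
std::vector<std::string> needed_libs(const std::vector<std::string>& readelf_lines) {
    std::vector<std::string> libs;
    for (const std::string& line : readelf_lines) {
        if (line.find("(NEEDED)") == std::string::npos) continue;
        const std::size_t open = line.find('[');
        if (open == std::string::npos) continue;
        const std::size_t close = line.find(']', open + 1);
        if (close == std::string::npos) continue;
        // Keep only the text between the brackets.
        libs.push_back(line.substr(open + 1, close - open - 1));
    }
    return libs;
}

// True if `name` appears among the DT_NEEDED entries.
bool depends_on(const std::vector<std::string>& libs, const std::string& name) {
    return std::find(libs.begin(), libs.end(), name) != libs.end();
}
```

With the three NEEDED lines above as input, `depends_on(libs, "libm.so")` is false, which is the missing dependency.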

DanAlbert commented 7 years ago

Looks like it's not building with -Wl,--no-undefined either, which would have caught this at build time. How are you building ImageMagick? Is it with our cmake toolchain, a standalone toolchain, ndk-build, or something else?

rcdailey commented 7 years ago

I'm using CMake 3.9.0-rc5 with a toolchain file:

set( CMAKE_SYSTEM_NAME Android )
set( CMAKE_SYSTEM_VERSION 15 ) # API level
set( CMAKE_ANDROID_ARCH_ABI armeabi-v7a )
set( CMAKE_ANDROID_STL_TYPE c++_shared )
set( CMAKE_ANDROID_NDK_TOOLCHAIN_VERSION clang )

DanAlbert commented 7 years ago

It looks like (based on the command line above, I haven't actually checked the source) the upstream CMake stuff adds neither -Wl,--no-undefined nor -lm by default. The one we ship in the NDK does both of these things.

rcdailey commented 7 years ago

You're right; I had to enable it by doing this explicitly in my CMake script:

set( CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} --no-undefined" )

And now my command line looks like this:

cmd.exe /C "cd . && E:\android\ndk_72\toolchains\llvm\prebuilt\windows-x86_64\bin\clang.exe --target=armv7-none-linux-androideabi --gcc-toolchain=E:/android/ndk_72/toolchains/arm-linux-androideabi-4.9/prebuilt/windows-x86_64 --sysroot=E:/android/ndk_72/platforms/android-15/arch-arm -fPIC -march=armv7-a -mthumb -mfpu=vfpv3-d16 -mfloat-abi=softfp -funwind-tables -no-canonical-prefixes -D__ANDROID_API__=15 -fexceptions -O2 -g -DNDEBUG  -Wl,--fix-cortex-a8 -u ANativeActivity_onCreate --no-undefined -shared -Wl,-soname,libMagickCore.so -o output\bin\libMagickCore.so @CMakeFiles/MagickCore.rsp  && cd ."

DanAlbert commented 7 years ago

You want -Wl,--no-undefined. That flag needs to be passed to ld (at least, I think it does, maybe clang is passing it along for you these days).

With that, does libMagickCore.so fail to build? I would suspect it does if you're not linking libm.

rcdailey commented 7 years ago

Yes, I get failures with just --no-undefined (what does the -W1 part do?). I'll update the command line per your suggestion though (assuming it's OK to have multiple -W1 appearing).

Sample of the failure:

E:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/ImageMagick-7.0.5-2/MagickCore/fx.c:2653: error: undefined reference to 'j0'
E:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/ImageMagick-7.0.5-2/MagickCore/fx.c:2864: error: undefined reference to 'tan'
E:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/ImageMagick-7.0.5-2/MagickCore/fx.c:2396: error: undefined reference to 'acos'

EDIT: I just realized that's an "L" not a "one" after the W...

DanAlbert commented 7 years ago

-Wl (that's a lower case L, btw, not the number one) just tells clang to pass the argument to ld. If the errors show up without it then I suppose it isn't necessary any more though, so never mind.
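The forwarding behaviour can be modelled in a few lines of C++. This is only a sketch of the documented rule (the text after `-Wl,` is split at commas and each piece becomes a separate linker argument), not the driver's actual code; `linker_args_from` is a hypothetical name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Models how a compiler driver treats "-Wl,a,b": the prefix is dropped and
// the remainder is split at commas, each piece becoming its own argument to
// the linker. Returns an empty vector if `arg` is not a -Wl, option.
std::vector<std::string> linker_args_from(const std::string& arg) {
    const std::string prefix = "-Wl,";
    std::vector<std::string> out;
    if (arg.compare(0, prefix.size(), prefix) != 0) {
        return out;
    }
    std::istringstream rest(arg.substr(prefix.size()));
    std::string piece;
    while (std::getline(rest, piece, ',')) {
        if (!piece.empty()) {
            out.push_back(piece);
        }
    }
    return out;
}
```

So `-Wl,-soname,libMagickCore.so` from the link line above reaches ld as the two arguments `-soname` and `libMagickCore.so`.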

Yeah, those errors (tan and acos, anyway, idk what the j0 thing is) are what I'd expect. Add -lm to the ldflags as well and then those should be fixed (as well as your floor issue).

rcdailey commented 7 years ago

Yep, the linker errors are gone and I'm back to the unwind issue again. Looks like I need to keep rebuilding libs until I see it hidden in my *.so files. I can manage that on my own... if I've literally built everything and I'm still getting it, I'll let you know.

There are definitely libraries we haven't rebuilt since NDK r10, maybe earlier. Mostly because they do not depend on anything in STL (as far as direct inclusions go). OpenSSL for example. Would be nice to isolate exactly which library is causing the unwind problem though...

Also thanks for the --no-undefined help, that will assist in finding these linker problems better in the future. I'm also going to adopt ReLinker since that seems superior to the built in one.

alexcohn commented 7 years ago

You definitely want to rebuild OpenSSL after 3 years. There have been some important bug fixes.

rcdailey commented 7 years ago

@DanAlbert When you use --no-undefined, do you ever get this when using native app glue?

sources/android/native_app_glue/android_native_app_glue.c:233: error: undefined reference to 'android_main'

We use JNI_OnLoad as our entry point I believe; not sure what android_main is for.

alexcohn commented 7 years ago

You miss (in ndk-buildish)

$(call import-module,android/native_app_glue)

I am not sure how this is expected to be expressed in cmakeian.

rcdailey commented 7 years ago

Looking at the Android.mk file, looks like I'm doing everything I should. Specifically, I link against android (via -landroid). Output from console:

[703/1476] Linking CXX shared library output\bin\libPrintServer.so
FAILED: output/bin/libPrintServer.so
cmd.exe /C "cd . && E:\android\ndk_72\toolchains\llvm\prebuilt\windows-x86_64\bin\clang++.exe --target=armv7-none-linux-androideabi --gcc-toolchain=E:/android/ndk_72/toolchains/arm-linux-androideabi-4.9/prebuilt/windows-x86_64 --sysroot=E:/android/ndk_72/platforms/android-15/arch-arm -fPIC -march=armv7-a -mthumb -mfpu=vfpv3-d16 -mfloat-abi=softfp -funwind-tables -no-canonical-prefixes -D__ANDROID_API__=15 -fexceptions -frtti -O2 -g -DNDEBUG  -Wl,--fix-cortex-a8 -u ANativeActivity_onCreate -Wl,--no-undefined -shared -Wl,-soname,libPrintServer.so -o output\bin\libPrintServer.so Applications/PrintServer/CMakeFiles/PrintServer.dir/Source/PrinterHAL.cpp.o Applications/PrintServer/CMakeFiles/PrintServer.dir/Source/com_ttm_PrintService.cpp.o  -landroid output/lib/libcpufeatures.a output/lib/libnative_app_glue.a -ljnigraphics -lm -llog -lEGL -lGLESv2 -ldl -landroid  "E:/android/ndk_72/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libc++_shared.so" "E:/android/ndk_72/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libandroid_support.a" && cd ."
E:/android/ndk_72/sources/android/native_app_glue/android_native_app_glue.c:233: error: undefined reference to 'android_main'
clang++.exe: error: linker command failed with exit code 1 (use -v to see invocation)
[709/1476] Linking C shared library output\bin\libMagickCore.so

alexcohn commented 7 years ago

See the Native Activity sample. You must supply this

void android_main(struct android_app* state)

Maybe you have it in a CPP file, and forgot to unmangle it with extern "C".
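A minimal sketch of what that looks like in a C++ source file. The forward declaration stands in for the real `struct android_app` from android_native_app_glue.h, and the empty body is a placeholder, not a working activity.

```cpp
// Forward declaration only; in a real NDK build the full definition comes
// from android_native_app_glue.h.
struct android_app;

// extern "C" keeps the symbol name as plain `android_main`, which is what the
// C code in android_native_app_glue.c refers to. Without it, a C++ compiler
// using the Itanium ABI would emit `_Z12android_mainP11android_app` instead,
// and the glue library's reference would stay undefined.
extern "C" void android_main(struct android_app* state) {
    (void)state;  // A real app would run its event loop here.
}
```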

rcdailey commented 7 years ago

That was the issue, thank you. Normally our common libs provide this, but this particular target did not depend on it so I assumed it was already defined.

enh commented 7 years ago

(again "what does readelf show?" is a useful sanity check whenever you have these kinds of issues... it would have shown you that you didn't have "android_main", you had a C++-mangled function name instead.)

rcdailey commented 7 years ago

@DanAlbert So I officially have all of my third party libs building in CMake in real time with my targets. There is nothing else left over that I'm linking that is pre-built. Should be no traces of GNU left. I still see _Unwind_Resume not hidden:

$ "E:\android\android-ndk-r15b\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-readelf.exe" -sW libzApp.so | grep _Unwind
     9: 00000000     0 FUNC    GLOBAL DEFAULT  UND _Unwind_Resume
2504819: 00000000     0 FUNC    GLOBAL DEFAULT  UND _Unwind_Resume

And my build command below (result of ninja -v):

[3/4] E:\android\ndk_72\toolchains\llvm\prebuilt\windows-x86_64\bin\clang++.exe --target=armv7-none-linux-androideabi --gcc-toolchain=E:/android/ndk_72/toolchains/arm-linux-androideabi-4.9/prebuilt/windows-x86_64 --sysroot=E:/android/ndk_72/sysroot  -DANDROID -DBETTER_ENUMS_STRICT_CONVERSION -DBOOST_ALL_NO_LIB=1 -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_BIND_NO_PLACEHOLDERS -DBOOST_FILESYSTEM_NO_DEPRECATED -DBOOST_SYSTEM_NO_DEPRECATED -DBOOST_THREAD_PROVIDES_EXECUTORS -DBOOST_THREAD_USES_CHRONO -DBOOST_THREAD_VERSION=4 -DBUILD_OGLES2 -DMAGICKCORE_HDRI_ENABLE=0 -DMAGICKCORE_QUANTUM_DEPTH=8 -DNOAUTOLINK_MAGICK -DOPENSSL_NO_ASM -DSTATIC_MAGICK -DZIOSK_ENABLE_ZPAY_DIAGNOSTICS -DZIOSK_MODULE_NAME=\"zApp\" -D_MAGICKLIB_ -DzApp_EXPORTS -IE:/code/frontend/source/Applications/zApp/Source -IE:/code/frontend/source/Core/UI/Source -ICore/UI/Source -IE:/code/frontend/source/Core/ThirdParty/PowerVR/sdk/Include -IE:/code/frontend/source/Core/ThirdParty/PowerVR/tools -IE:/code/frontend/source/Core/ThirdParty/PowerVR/tools/OGLES2 -isystem Core/ThirdParty/boost/source/boost/boost_1_64_0 -IE:/code/frontend/source/Core/ThirdParty/openssl/source/include -ICore/ThirdParty/openssl/source/include -IE:/code/frontend/source/Core/ThirdParty/sqlite/source -IE:/code/frontend/source/Core/ThirdParty/cereal/include -IE:/code/frontend/source/Core/ThirdParty/rapidxml/include -IE:/code/frontend/source/Core/ThirdParty/better-enums/include -IE:/code/frontend/source/Core/ThirdParty/libpng/source/jni -IE:/code/frontend/source/Core/ThirdParty/ImageMagick/source/jni/ImageMagick-7.0.5-2 -IE:/code/frontend/source/Core/ThirdParty/bsp/msr/include -IE:/android/ndk_72/sources/android/cpufeatures -IE:/android/ndk_72/sources/android/native_app_glue -IE:/code/frontend/source/Core/Barcode/Source -IE:/code/frontend/source/Core/ThirdParty/zxing/source/core/src -IE:/code/frontend/source/Applications/DynamicUI/Source -IE:/code/frontend/source/Core/WebServices/Source 
-IE:/code/frontend/source/Applications/OrderEntry/Source -IE:/code/frontend/source/Services/Source -IE:/code/frontend/source/Applications/zPayService/Interface/Source -IE:/code/frontend/source/Applications/PATT/Source -IE:/code/frontend/source/Applications/Loyalty/Source -IE:/code/frontend/source/Applications/Survey/Source -IE:/code/frontend/source/Applications/EmailClub/Source -IE:/code/frontend/source/Applications/SettingsManager/Source -IE:/code/frontend/source/Applications/MessagingModule/Source -IE:/code/frontend/source/Applications/ETM/Source -isystem E:/android/ndk_72/sources/cxx-stl/llvm-libc++/include -isystem E:/android/ndk_72/sources/android/support/include -isystem E:/android/ndk_72/sources/cxx-stl/llvm-libc++abi/include -isystem E:/android/ndk_72/sysroot/usr/include -isystem E:/android/ndk_72/sysroot/usr/include/arm-linux-androideabi -march=armv7-a -mthumb -mfpu=vfpv3-d16 -mfloat-abi=softfp -funwind-tables -no-canonical-prefixes -D__ANDROID_API__=15 -fexceptions -frtti -O2 -g -DNDEBUG -fPIC   -Wno-inconsistent-missing-override -Wno-expansion-to-defined -std=gnu++14 -MD -MT Applications/zApp/CMakeFiles/zApp.dir/Source/ZioskApp.cpp.o -MF Applications\zApp\CMakeFiles\zApp.dir\Source\ZioskApp.cpp.o.d -o Applications/zApp/CMakeFiles/zApp.dir/Source/ZioskApp.cpp.o -c E:/code/frontend/source/Applications/zApp/Source/ZioskApp.cpp
[4/4] cmd.exe /C "cd . && E:\android\ndk_72\toolchains\llvm\prebuilt\windows-x86_64\bin\clang++.exe --target=armv7-none-linux-androideabi --gcc-toolchain=E:/android/ndk_72/toolchains/arm-linux-androideabi-4.9/prebuilt/windows-x86_64 --sysroot=E:/android/ndk_72/platforms/android-15/arch-arm -fPIC -march=armv7-a -mthumb -mfpu=vfpv3-d16 -mfloat-abi=softfp -funwind-tables -no-canonical-prefixes -D__ANDROID_API__=15 -fexceptions -frtti -O2 -g -DNDEBUG  -Wl,--fix-cortex-a8 -u ANativeActivity_onCreate -Wl,--no-undefined -shared -Wl,-soname,libzApp.so -o output\bin\libzApp.so Applications/zApp/CMakeFiles/zApp.dir/Source/ZioskApp.cpp.o
output/lib/libUI.a output/lib/libBarcode.a output/lib/libDynamicUI.a output/lib/libOrderEntry.a output/lib/libPATT.a output/lib/libServices.a output/lib/libSurvey.a output/lib/libEmailClub.a output/lib/libSettingsManager.a output/lib/libLoyalty.a output/lib/libMessagingModule.a output/lib/libETM.a -landroid output/lib/libcpufeatures.a output/lib/libnative_app_glue.a -ljnigraphics -lm -llog -lEGL -lGLESv2 output/lib/libDynamicUI.a output/lib/libPATT.a output/lib/libLoyalty.a output/lib/libDynamicUI.a output/lib/libPATT.a output/lib/libLoyalty.a output/lib/libBarcode.a output/lib/libzxing.a output/lib/libOrderEntry.a output/lib/libServices.a output/lib/libzPayServiceInterface.a output/lib/libServices.a output/lib/libzPayServiceInterface.a output/lib/libWebServices.a output/lib/libUI.a output/lib/libPowerVR.a output/lib/libboost_context.a output/lib/libboost_date_time.a output/lib/libboost_filesystem.a output/lib/libboost_regex.a output/lib/libboost_signals.a output/lib/libboost_thread.a output/lib/libboost_chrono.a output/lib/libboost_system.a output/lib/libssl.a output/lib/libcrypto.a output/lib/libsqlite.a output/lib/libpng.a -lz output/bin/libMagickWand.so output/bin/libMagickCore.so output/lib/libcpufeatures.a -ldl output/lib/libnative_app_glue.a -landroid -ljnigraphics -lm -llog -lEGL -lGLESv2  "E:/android/ndk_72/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libc++_shared.so" "E:/android/ndk_72/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libandroid_support.a" && cd ."

Note that [3/4] is a build of a CPP file, and [4/4] is the link of libzApp.so.

I'm out of ideas...

DanAlbert commented 7 years ago

You're still using the upstream CMake support, right? I don't see any mention of -Wl,--exclude-libs,libgcc.a or -Wl,--exclude-libs,libunwind.a in that link line. Without those flags, a link like the following (which is probably what happens by default under CMake):

$ clang++ foo.o -lbar -lgcc -o libfoo.so -shared

will result in libfoo.so having undefined references to things in libgcc (like the unwind symbols), because the linker thinks it can get them from libbar.
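
For anyone staying on the upstream CMake support, a minimal sketch of adding those two flags by hand might look like the following. Where it goes (a project's top-level CMakeLists.txt, after the project() call) and the choice to touch only the shared and module linker flags are assumptions for illustration; this is not a copy of what the NDK's toolchain file does.

```cmake
# Sketch only: stop shared libraries from re-exporting the unwinder
# symbols that get pulled in from libgcc.a / libunwind.a.
if(CMAKE_SYSTEM_NAME STREQUAL "Android")
  foreach(flags_var CMAKE_SHARED_LINKER_FLAGS CMAKE_MODULE_LINKER_FLAGS)
    # Leading space keeps the new flags separate from any existing ones.
    string(APPEND ${flags_var}
      " -Wl,--exclude-libs,libgcc.a -Wl,--exclude-libs,libunwind.a")
  endforeach()
endif()
```

After relinking, readelf -sW yourlib.so | grep _Unwind should no longer show those symbols as GLOBAL DEFAULT. Presumably every shared library in the APK needs the same treatment, since one library that still exports them can be picked up by the others.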

IMO, just switch to our cmake toolchain file. These sorts of problems are basically the whole reason we have our own.

rcdailey commented 7 years ago

Does your toolchain file use CMake's modern Android NDK integration features (as documented here)? I want to avoid using "ghetto" toolchain files like taka-no-me's, which we had to use in the "old days".

DanAlbert commented 7 years ago

Ours basically is taka-no-me's, but given that your options seem to be "ghetto" and "broken", "ghetto" seems like a good choice.

We're working on integrating ours with the modern CMake approach, but the modern approach didn't exist when we created it and these things take time.

rcdailey commented 7 years ago

Sorry, I didn't mean for my comment to come off as rude or as a complaint. What I'm trying to say is that I'd rather upstream CMake do everything your toolchain does. Brad has constantly told me that the design intent for toolchain files in CMake is for them to be very simple things that do not do any system introspection. taka-no-me's is a violation of that, and a symptom of a larger problem: needing better built-in support for the NDK.

Sure, I do want my stuff working, but another goal of mine is to help contribute these fixes to upstream CMake so that maybe eventually the NDK won't need to package a toolchain file. In the future, it would be great if the NDK developers could work with the CMake devs to help address these issues upstream. Dan, you have a lot of valuable knowledge that I have not been able to find on my own. And it's a huge waste of your expertise and time to have to answer the same questions over and over again (either by dealing with people like me, or by having to code the answers explicitly into a toolchain file).

I'm willing to help (even if there's not too much I can do besides facilitate communication), but we really need to get that knowledge out of your brain and the toolchain file bundled with the NDK, and into upstream CMake. Unless I'm misunderstanding some separation of concerns here, that seems to be the ideal long-term solution.