Guidance on universal "fat" binary dependencies on Mac platforms

conan-io / conan

Conan - The open-source C and C++ package manager

https://conan.io

MIT License


Guidance on universal "fat" binary dependencies on Mac platforms #16745

Open cjserio opened 1 month ago

cjserio commented 1 month ago

What is your question?

Hi! I've seen a lot written about this in various channels but the dates are old and they usually are open-ended without a definite resolution.

What's the best guidance to-date when dealing with multi-gen environments like Xcode? Our project currently depends on universal/"fat" binaries. I've been building separate platforms and "lipo"ing them together by hand for years. It's one of the main reasons I'd like to switch to something like Conan.

I've run 'conan install' for each architecture and end up with binaries and cmake output for each architecture. Great!

But the problem is that while Xcode can handle setting linker options "per architecture" (so that one could specify libfoo_arm64.a for arm64 builds and libfoo_x86_64.a for x86_64 builds), CMake's target_link_libraries does not seem to know HOW to do that, nor does it accept an architecture as an input.

So it seems like I either need a way to convince CMake to set up the linker properly with each binary for its respective target architecture, or I need a way for Conan to 'lipo' the binaries together AND update the CMakeDeps output to point to that new fat binary instead of the single-arch cached one.

I've seen some custom-commands around here that do a lipo after a full_deploy and that seemed promising but cmake isn't included in on the joke and therefore the project would have to be manually configured to use the binaries. I'm hoping for a more end to end solution than that.


memsharded commented 1 month ago

Hi @cjserio

Thanks for your question.

I probably don't know enough about fat libraries and Mac ecosystem, so first some clarifications, then I might need help from someone else.

> But the problem is that while Xcode can handle setting linker options "per architecture" (so that one could specify libfoo_arm64.a for arm64 builds and libfoo_x86_64.a for x86_64 builds), CMake's target_link_libraries does not seem to know HOW to do that, nor does it accept an architecture as an input.

Conan can specify different tools.build:cxxflags, and these flags can be defined at the conan install level, with different values for each conan install. In the Conan CMakeToolchain integration, it is possible to define different output folders and CMake presets based on tools.cmake.cmake_layout:build_folder_vars = ["settings.arch", ...], for example.

Different architectures are still different CMake configured projects, so it wouldn't be an issue to have them side by side with different linker options, and if using presets, then selecting the right preset including the architecture will automatically handle that.

Does this make sense to you? If not, could you please elaborate a bit more on the linker options? Are there other options to set, or is it just the libraries to link?

cjserio commented 1 month ago

We're not using CMakeToolchain, only CMakeDeps. The Xcode project has multiple targets across multiple configurations (debug, release, debug with some optimization etc etc) across multiple architectures. It's in Xcode that we pick which combinations of each axis we need for the moment. So cmake needs to have everything laid out in advance to properly form the Xcode project.

As far as how Xcode linker settings work. It seems that cmake's target_link_libraries works by simply placing the full path to a static library in the "Other Linker Flags" setting in Xcode. That setting exists per-build-type but also per-architecture. If you set it at a high level, the setting applies to ALL variants underneath it. But you could optionally set it for specific settings. Here's an example:

// This applies to ALL build_types and archs.
OTHER_LDFLAGS = (/path/to/libfoo.a)

// This overrides the generic one above and specializes the setting for arm64 only
"OTHER_LDFLAGS[arch=arm64]" = (/path/to/libfoo_arm64.a);

// This overrides the generic one above and specializes the setting for x86_64 only
"OTHER_LDFLAGS[arch=x86_64]" = (/path/to/libfoo_x86_64.a);

So now when Xcode decides to build the application, it'll pick the proper library at the proper time. So Xcode is capable of this but cmake doesn't seem to know how to automatically do that with the data from target_link_libraries. Cmake seems to only ever touch "OTHER_LDFLAGS" directly.

I think that means I cannot have separate static libraries by architecture. I need them to be "fat" and joined together so that I can say:

OTHER_LDFLAGS = (/path/to/libfoo_fat_arm64_and_x86_64.a)

But that means that conan needs to both lipo the two separate binaries together AND CMakeDeps needs to point to the proper path of the lipo'ed binary.

memsharded commented 1 month ago

> So now when Xcode decides to build the application, it'll pick the proper library at the proper time. So Xcode is capable of this but cmake doesn't seem to know how to automatically do that with the data from target_link_libraries. Cmake seems to only ever touch "OTHER_LDFLAGS" directly.

Yes, to my knowledge this is a CMake limitation: by default it can only generate multi-configuration projects along the configuration axis, that is, for different Debug/Release/... configurations, but not for different architectures.

Have you considered using the XcodeDeps generator? https://docs.conan.io/2/reference/tools/apple/xcodedeps.html

I am still not sure about the overall setup, because it seems you have Xcode projects with Xcode configuration, but also using CMake?

> I think that means I cannot have separate static libraries by architecture. I need them to be "fat" and joined together so that I can say:

As always in computer science, everything is a trade-off. What exactly is the issue? Is it developer convenience, so that they can use the Xcode IDE and switch between builds for different architectures from the IDE? In the worst case, following the classic CMake behavior, what developers often do (independently of using Conan or not) is run a separate cmake configure step for each architecture. This is in fact the standard approach in Linux development.

Conan has some modeling for multi-architecture binaries, via the -s="arch=armv8|x86_64" syntax. But while CMake might be able to create multi-architecture packages, most other build systems, including the autotools builds used by tons of dependencies in ConanCenter, cannot create multi-architecture binaries. This is an intrinsic limitation of the build system; it would require extraordinary hacking to create multi-architecture binaries in ConanCenter. And since many open source third-party libraries, even those using CMake, transitively depend on other libraries built with autotools, it can be almost impossible to get Conan packages with universal binaries on Mac. So I am trying to understand the constraints and the setup a bit better in order to help.

cjserio commented 1 month ago

Hi! Thanks for your reply. I've not considered the XcodeDeps generator. We have one cmake for the entire application Windows/Mac/Linux and another for Android/iOS and I'd like to keep the logic together and not have it split up by platform/IDE. It's another reason we're only using CMakeDeps. We just want to sneak Conan binaries into the existing process without disturbing the process much.

I recognize that the way Xcode works is the exception, not the rule.

Xcode's set up this way because in Apple's ecosystem, this is the way to build universal apps. Our application is incredibly complex and when you add in codesigning, frameworks, dependencies, entitlements and everything else involved, once it works properly you back away slowly and try not to ever mess with it again! :)

For a production build, the IDE knows how to build all relevant architectures and then it combines them together into a fat binary and codesigns it. Any dependencies you provide it need to either be broken out by architecture or already fat.

As I said, cmake doesn't seem to know how to break the dependencies up by architecture, so that means I need the binaries from Conan to be fat. I don't expect Conan to do this automatically. I don't expect ConanCenter to have them prebuilt. I expect it to be a post-process. I already see a lipo deploy command that works well, but as I said CMakeDeps isn't in on the joke, so I end up with a pile of binaries but none of the "find_library" support that CMakeDeps provides. The only hacking I think I would need is to have CMakeDeps point to the fat binaries generated in a temporary path instead of in the Conan caches.

Maybe you're not familiar with the lipo command? It's a single command on mac that takes multiple binaries of varying architectures and combines them into a single fat binary. So the build process doesn't need to change to get universal binaries, it just needs to be a post-processing step.
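As a minimal sketch of that post-processing step, the snippet below only builds the argument list for `lipo -create` and prints it, so it can be inspected on any platform. The helper name `lipo_create_command` and all paths are hypothetical; actually running the command requires macOS with the Xcode command line tools.

```python
from typing import List, Sequence


def lipo_create_command(thin_libs: Sequence[str], fat_lib: str) -> List[str]:
    """Build the argv for `lipo -create`, merging thin libraries into one fat one."""
    if len(thin_libs) < 2:
        raise ValueError("need at least two single-architecture inputs")
    # lipo -create <input>... -output <fat binary>
    return ["lipo", "-create", *thin_libs, "-output", fat_lib]


# Hypothetical per-architecture build outputs of the same library.
cmd = lipo_create_command(
    ["build/arm64/libfoo.a", "build/x86_64/libfoo.a"],
    "build/universal/libfoo.a",
)
print(" ".join(cmd))
```

On a Mac, the list could be handed to `subprocess.run(cmd, check=True)` once the output directory exists.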

As annoying as this all is, this is the way the Apple ecosystem works and until x86_64 goes away, we're stuck having to support multiple architectures. And once it does I'm sure apple will switch chipsets again anyway. :)

memsharded commented 1 month ago

> As I said, cmake doesn't seem to know how to break the dependencies up by architecture, so that means I need the binaries from Conan to be fat. I don't expect Conan to do this automatically. I don't expect ConanCenter to have them prebuilt. I expect it to be a post-process. I already see a lipo deploy command that works well, but as I said CMakeDeps isn't in on the joke, so I end up with a pile of binaries but none of the "find_library" support that CMakeDeps provides. The only hacking I think I would need is to have CMakeDeps point to the fat binaries generated in a temporary path instead of in the Conan caches.

The approach I mentioned above with -s="arch=armv8|x86_64" is pretty experimental, and exclusively for CMake, but it allows creating Conan packages that have a fat binary inside. I think this test is worth a look: https://github.com/conan-io/conan/blob/develop2/test/functional/toolchains/cmake/test_universal_binaries.py

The Conan CMakeToolchain integration is able to handle that and create fat binaries.

> Maybe you're not familiar with the lipo command? It's a single command on mac that takes multiple binaries of varying architectures and combines them into a single fat binary. So the build process doesn't need to change to get universal binaries, it just needs to be a post-processing step.

Yes, I am not an expert, but I know how it works. My concern above is not so much about the consumer side as about the dependencies. If you are going to manage all dependencies yourself, with your own conanfile.py, and you are doing the lipo inside the recipes, that is good. The problem is with the existing ConanCenter recipes, which are impossible to turn into "fat" binaries without very heavy modifications that go beyond what can be done in ConanCenter.

So, to summarize, I think there are 2 different possible approaches:

1. Keep the Conan package binaries single-architecture, and try to "lipo" them in a post-deploy step. Installing the multiple architectures and doing the lipo would be relatively doable; the main problem is that CMakeDeps would no longer be able to represent those binaries. It might be possible for you to create a custom generator, CMakeFatDeps or something like that, that tries to mimic what CMakeDeps does, but for the fat binaries. That might be more challenging, as the CMakeDeps generator is the most complex one.
2. If you manage all the dependency recipes, you can try to make the Conan packages build and store universal binaries.

I'd probably suggest to try to do a small proof of concept based on https://github.com/conan-io/conan/blob/develop2/test/functional/toolchains/cmake/test_universal_binaries.py, and let me know how it goes.

cjserio commented 1 month ago

Hi! Again, thanks for taking the time to dig into this. I know it's a weird use case. Your experimental setting of -s="arch=armv8|x86_64" works great for dependencies that build with cmake. It's exactly what we needed. I'll call this Dependency Type 1. Dependency Type 2 are dependencies that use autotools to build. I found that autotoolstoolchain.py was spitting out an error and not even trying to do universal binaries. I removed the guard at the top of the class and manually added "-arch x86_64 -arch arm64" to the CXXFLAGS and CFLAGS and those too "just worked" and I ended up with fat binaries. Then I found Dependency Type 3. These are dependencies that have architecture-specific public headers and/or assembly code that only expects one architecture at a time. And this is the set of dependencies that killed me. (libjpeg is a good example of Type 3.)
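To make the Type 2 workaround concrete, here is a small sketch of how a combined Conan arch value could be turned into the compiler flags described above. The helper `apple_arch_flags` and the mapping table are assumptions for illustration, not part of Conan; the sketch only assumes that Conan's `armv8` corresponds to Apple's `arm64`, as the flags in the comment imply.

```python
from typing import List

# Assumed mapping from Conan arch names to the names Apple clang expects after -arch.
CONAN_TO_APPLE_ARCH = {"armv8": "arm64", "x86_64": "x86_64"}


def apple_arch_flags(conan_arch: str) -> List[str]:
    """Split a value such as 'armv8|x86_64' into one '-arch <name>' pair per entry."""
    flags: List[str] = []
    for arch in conan_arch.split("|"):
        flags += ["-arch", CONAN_TO_APPLE_ARCH[arch.strip()]]
    return flags


# These are the flags that would be appended to CFLAGS and CXXFLAGS.
print(" ".join(apple_arch_flags("armv8|x86_64")))
```

This does nothing for Type 3 dependencies: passing both `-arch` flags in a single compile is exactly what breaks when headers or assembly are generated for one architecture only.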

So after thinking this through, I think what I need to do is this:
1) All of our platforms (Windows/Mac/Linux) will use Conan "the right way": compile x86_64 and arm64 binaries completely separately and make no use of lipo in the recipe/install phase.
2) Windows/Linux will use cmake "the right way" by doing find_library and target_link_libraries, using CMakeDeps output to help.
3) Mac will need to be special. The Mac process will need a post-process pass that does a "lipo deploy" to a temporary build products location, and then we'll have to manually add target_link_libraries calls to our cmake, on Mac only, to point to these manually created libraries and header locations.
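For step 3, the "little bit of cmake" could be generated rather than written by hand. The sketch below renders a CMake IMPORTED static library target pointing at a lipo'ed binary. The function name, the target name `foo::foo`, and the paths are all hypothetical; only `add_library(... STATIC IMPORTED)`, `IMPORTED_LOCATION`, and `INTERFACE_INCLUDE_DIRECTORIES` are standard CMake.

```python
def imported_static_target(target: str, lib_path: str, include_dir: str) -> str:
    """Render CMake code declaring an IMPORTED static library for a fat binary."""
    return "\n".join([
        f"add_library({target} STATIC IMPORTED)",
        f"set_target_properties({target} PROPERTIES",
        f'    IMPORTED_LOCATION "{lib_path}"',
        f'    INTERFACE_INCLUDE_DIRECTORIES "{include_dir}")',
    ])


# Hypothetical deploy location produced by the lipo post-process pass.
snippet = imported_static_target(
    "foo::foo",
    "/tmp/fat_deps/foo/lib/libfoo.a",
    "/tmp/fat_deps/foo/include",
)
print(snippet)
```

The generated file could then be pulled in with `include()` on Mac only, after which `target_link_libraries(app PRIVATE foo::foo)` works without knowing the library is fat.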

So my question to you is, is there a good way to know whether or not a dependency is a header-only one that would NOT need a lipo deploy? Because if it's header-only, we can just use find_library on it normally. But if it's actually got a binary, I need to do a lipo-deploy and then generate a little bit of cmake.

memsharded commented 1 month ago

> So my question to you is, is there a good way to know whether or not a dependency is a header-only one that would NOT need a lipo deploy? Because if it's header-only, we can just use find_library on it normally. But if it's actually got a binary, I need to do a lipo-deploy and then generate a little bit of cmake.

If you mean while you are iterating the dependencies of a given package or a given dependency graph, you can query the package_type, as it is part of the public interface. (I have just realized we need to update https://docs.conan.io/2/reference/conanfile/methods/generate.html#conan-conanfile-model-dependencies to include it.) It is there, and you can do self.dependencies["mydep"].package_type == "header-library", for example, or iterate with for req, dep in self.dependencies.host.items(): if dep.package_type == ... (see https://docs.conan.io/2/reference/conanfile/methods/generate.html#iterating-dependencies).
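The check described above can be sketched without Conan installed by working on a plain mapping of dependency name to package type. The function `split_by_lipo_need` and the dependency names are hypothetical; in a real recipe the mapping would be filled while iterating `self.dependencies.host.items()` and reading `dep.package_type`. Treating every type other than "header-library" as needing lipo is a simplifying assumption.

```python
from typing import Dict, List, Tuple


def split_by_lipo_need(package_types: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (needs_lipo, header_only) dependency names, each sorted by name."""
    needs_lipo: List[str] = []
    header_only: List[str] = []
    for name, package_type in sorted(package_types.items()):
        if package_type == "header-library":
            header_only.append(name)  # no binary, usable as-is on every arch
        else:
            needs_lipo.append(name)   # has a binary that must be merged per arch
    return needs_lipo, header_only


# Stand-in data; names and types are made up for illustration.
deps = {
    "zlib": "static-library",
    "libpng": "static-library",
    "myheaders": "header-library",
}
needs_lipo, header_only = split_by_lipo_need(deps)
print("lipo:", needs_lipo)
print("header-only:", header_only)
```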

cjserio commented 1 month ago

Slight tangent for a new idea. What do you think of this? We plan on using artifactory anyway so that our developers and CI/CD runners don't have to recompile the dependencies. What if for mac, the process was like this:

1) Build the dependency for armv8, then build it again for x86_64.
2) Run lipo-deploy on the build output for those two and create a pack with includes, fat libraries, resources etc.
3) Upload this pack to Artifactory as the dependency name/version with an arch of armv8|x86_64, so that when our dependencies ask for it, it'll exist and not need to be rebuilt.

The "win" for this method over the others is that as far as Conan is concerned, it's a legitimate universal package so cmake works properly and I can use find_library on all platforms.

If this seems sane, what's the best way to implement step #3? I did a trial of this with libpng. I took the libpng recipe and added an export_sources step that copies the pack over as a whole. I removed the build step. I replaced the package step with a simple copy of that pack to the package folder. This seems to work but it requires a lot of manual steps. Is there a smarter way to automate this?

memsharded commented 1 month ago

> Upload this pack to Artifactory as the dependency name/version with an arch of armv8|x86_64, so that when our dependencies ask for it, it'll exist and not need to be rebuilt.

Yes, I think this is the intended usage of the armv8|x86_64 feature. Also, since CMake can build universal binaries more automatically than other build systems, your steps 1 & 2 might be unnecessary for CMake-based dependencies.

> If this seems sane, what's the best way to implement step #3? I did a trial of this with libpng. I took the libpng recipe and added an export_sources step that copies the pack over as a whole. I removed the build step. I replaced the package step with a simple copy of that pack to the package folder. This seems to work but it requires a lot of manual steps. Is there a smarter way to automate this?

If you are building in user space (user folders), then the best option could be to use the conan export-pkg feature, which is intended to package directly from binaries in user folders. So you can keep build() and the other methods exactly the same, with no need to remove them, and only package() should take the different cases into account.
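A rough sketch of what that invocation might look like when driven from a script: the snippet only assembles and prints the command line. The recipe directory `.` and the helper name `export_pkg_command` are assumptions; the `-s` settings argument and the combined arch value come from the discussion above.

```python
import shlex
from typing import List


def export_pkg_command(recipe_dir: str, arch: str) -> List[str]:
    """Build the argv for `conan export-pkg` with a (possibly combined) arch setting."""
    return ["conan", "export-pkg", recipe_dir, "-s", f"arch={arch}"]


cmd = export_pkg_command(".", "armv8|x86_64")
# shlex.join quotes the '|' so the line is safe to paste into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run` avoids shell quoting issues with the `|` character altogether.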