Misdetection of early 64-bit Intel Macs as 32-bit

ryandesign commented 4 years ago

Meson misidentifies early 64-bit Macs as 32-bit. This causes the build systems of other programs that use meson (such as babl) to misbehave or fail on those old systems.

The very first Intel Macs used 32-bit Intel Core processors. On these Macs, uname -m returns i386 as you might expect.

Then Apple switched to 64-bit Intel Core 2 processors, but continued to boot them with a 32-bit kernel. Here, uname -m returns i386 to denote the kernel architecture, though they are happy to run 64-bit user programs. It is these systems on which meson misbehaves.

Later, Apple switched to a 64-bit kernel. In this case, uname -m returns x86_64. I presume it is these Macs on which meson was tested.

On some Intel Macs, whether the 32-bit or the 64-bit kernel is used depends on the macOS version or on user configuration.

Meson's detect_cpu and detect_cpu_family begin by using Python's platform.machine, which gets its value from os.uname().machine, which comes from uname -m.
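The chain described above can be observed from Python directly. This is a minimal sketch, assuming a POSIX system where os.uname exists; the helper name kernel_arch_sources is made up for illustration and is not part of Meson.

```python
import os
import platform


def kernel_arch_sources():
    """Collect the machine string from both Python APIs mentioned above.

    Hypothetical helper for illustration only; not part of Meson.
    """
    sources = {'platform.machine': platform.machine()}
    # os.uname() only exists on POSIX systems, so guard the lookup.
    if hasattr(os, 'uname'):
        sources['os.uname().machine'] = os.uname().machine
    return sources


if __name__ == '__main__':
    for name, value in kernel_arch_sources().items():
        print(f'{name}: {value}')
```

On the affected Macs both values would be expected to report the kernel architecture, which is the point of this report.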

The problem is that most people are using meson to build regular user software, not kernel extensions. For building regular user software, it is not useful to know the kernel architecture (what uname -m returns); one only needs to know the user architecture.

I don't know how to determine the user architecture using a single command. But, having gotten to the point of knowing that you are on an "i386" processor running Darwin, you could then run sysctl -n hw.cpu64bit_capable; if the answer is 1, it's really an x86_64 processor. Do allow for the possibility that this might fail, since sysctl -n hw.cpu64bit_capable is only available in Mac OS X 10.5 and up. But on 10.5 and earlier, the default build architecture was 32-bit, so misidentifying 64-bit 10.4 machines as 32-bit isn't a huge problem.
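A rough sketch of that check, tolerant of sysctl being missing or failing. The function names cpu64bit_capable and user_arch are hypothetical, and the injectable run/capable parameters exist only so the logic can be exercised without a Mac.

```python
import subprocess


def cpu64bit_capable(run=subprocess.run):
    """Return True only if `sysctl -n hw.cpu64bit_capable` prints 1.

    Any failure (sysctl missing, unknown key, non-zero exit) counts as
    "not known to be 64-bit capable".
    """
    try:
        proc = run(['sysctl', '-n', 'hw.cpu64bit_capable'],
                   stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                   universal_newlines=True)
    except OSError:
        return False
    return proc.returncode == 0 and proc.stdout.strip() == '1'


def user_arch(system, kernel_arch, capable=cpu64bit_capable):
    """Upgrade a Darwin 'i386' kernel arch to 'x86_64' when the CPU allows it."""
    if system == 'Darwin' and kernel_arch == 'i386' and capable():
        return 'x86_64'
    return kernel_arch


if __name__ == '__main__':
    import platform
    print(user_arch(platform.system(), platform.machine()))
```

Falling back to the kernel architecture on failure matches the reasoning above that misidentifying very old systems as 32-bit is tolerable.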

scivision commented 4 years ago

This might be a challenge to maintain inside Meson, since it requires a long-unsupported macOS version (the current oldest supported macOS is 10.13) on old hardware.

It looks like you might have been able to work around the issue from within meson.build?

ryandesign commented 4 years ago

If you are referring to the change that the reporter of our ticket proposed, I explained there why the change is incorrect.

I understand that support for older systems is more challenging, but it is not going to be feasible to explain this problem to the developers of every project that uses meson's CPU detection capabilities and expect them to work around it in their meson.build. The fix should be in meson. Meson needs to document whether its CPU detection functions detect the kernel arch or the user arch. There needs to be a way to detect the user arch. If anyone uses meson to build kernel extensions, then there also needs to be a way to detect the kernel arch. Maybe meson needs separate functions for these or a flag to differentiate the two cases.

macOS 10.13 is 2 years old. It is unreasonable for meson not to support earlier systems. Meson is a build system. Build systems need to be compatible with the widest possible range of operating systems and versions. Meson builds fine on Mac OS X 10.6 and later in MacPorts (haven't tested earlier than that).

scivision commented 4 years ago

Could you please check if darwin_arch branch has the suitable changes for detecting the CPU as desired? https://github.com/mesonbuild/meson/compare/darwin_arch

ryandesign commented 4 years ago

Thanks!

It's actually the output of sysctl -n hw.cpu64bit_capable, not its return/exit code, that would be 1 on a 64-bit CPU.
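To illustrate the distinction, here is a small sketch that uses a stand-in command (a Python child process that prints 1) instead of sysctl, so that it runs anywhere. The names STAND_IN_CMD and exit_code_and_output are made up for this example.

```python
import subprocess
import sys

# Stand-in for `sysctl -n hw.cpu64bit_capable`: a child process that prints
# "1" and exits successfully. This is an assumption made for portability.
STAND_IN_CMD = [sys.executable, '-c', 'print(1)']


def exit_code_and_output(cmd):
    """Run cmd and return (exit code, stripped standard output)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL, universal_newlines=True)
    return proc.returncode, proc.stdout.strip()


if __name__ == '__main__':
    code, output = exit_code_and_output(STAND_IN_CMD)
    print('exit code:', code)  # 0 only says that the command succeeded
    print('output:', output)   # this is the value to compare against '1'
```

A successful run exits with 0 whether or not the CPU is 64-bit capable, so only the printed value carries the answer.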

Also, re the comments, it's not that the OS is 32-bit; the OS is 64-bit, but in some situations, the 32-bit kernel gets used.

scivision commented 4 years ago

OK thanks I updated c8bfa6b

kencu commented 4 years ago

Thank you for making this effort. I'll be happy to test it.

kencu commented 4 years ago

I have had some issues with the suggested change so far. I will return once I have a better recommendation.

ryandesign commented 4 years ago

OK thanks I updated c8bfa6b

Thanks. That almost worked: sysctl's 1 output was appearing on the terminal instead of being returned into the variable. The following additional patch fixed it:

--- mesonbuild/environment.py.orig  2019-11-26 00:46:12.000000000 -0600
+++ mesonbuild/environment.py   2019-11-26 00:53:22.000000000 -0600
@@ -302,7 +302,7 @@
     """
     try:
         ret = subprocess.run(['sysctl', '-n', 'hw.cpu64bit_capable'],
-                             universal_newlines=True, stderr=subprocess.DEVNULL).stdout
+                             universal_newlines=True, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL).stdout
         trial = 'x86_64' if ret.strip() == '1' else 'x86'
     except Exception:
         # very old MacOS version with implicit 32-bit CPU due to calling if-elif stack

This allowed me to build orc on a 64-bit Snow Leopard machine (which had been failing with the same "error: instruction requires: Not 64-bit mode" error as babl).
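The effect of the missing stdout=subprocess.PIPE can be reproduced with a stand-in command in place of sysctl (an assumption so the sketch runs on any platform; the function names are made up). Without the pipe, the child's output goes to the terminal and the .stdout attribute is None.

```python
import subprocess
import sys

# Stand-in for the sysctl call: a child process that prints "1".
CMD = [sys.executable, '-c', 'print(1)']


def run_without_pipe():
    """Output is inherited by the terminal; nothing is captured."""
    return subprocess.run(CMD, universal_newlines=True,
                          stderr=subprocess.DEVNULL).stdout


def run_with_pipe():
    """Output is captured and returned as a string."""
    return subprocess.run(CMD, universal_newlines=True,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL).stdout


if __name__ == '__main__':
    print(repr(run_without_pipe()))  # the child's "1" appears on the terminal
    print(repr(run_with_pipe()))
```

As far as the diff context shows, calling .strip() on None would raise AttributeError inside the try block, so the except Exception branch would take over rather than the comparison against '1'.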

However, I'm still not convinced this is the correct type of fix. I'm just not familiar enough with meson to understand the purpose of this function. babl and orc both seem to be using this function to determine what architecture they should be building for, but that is not the question the function is currently answering. On Mac OS X 10.5 and earlier, the default compiler arch was 32-bit, even on 64-bit systems; a user could supply a flag like -arch x86_64 or -arch ppc64 or -m64 to switch to 64-bit building. Conversely, on Mac OS X 10.6 and later, the default compiler arch was x86_64 on 64-bit systems, and a user could supply a flag like -arch i386 or -arch ppc or -m32 to switch to 32-bit building. A user (or a package manager like MacPorts) may be supplying such flags when they run meson. If this function should answer the question of what arch we are building for, then the only way I know of to determine that is to actually use the compiler to compile something with the user-supplied flags and see what arch it ends up being.
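One way to ask the compiler, sketched under assumptions: GCC and Clang accept -dM -E to dump predefined macros, so the user-supplied flags can be passed along and the result inspected. The helper names and the macro-to-arch table below are illustrative choices, not Meson's implementation.

```python
import subprocess

# Assumed mapping from predefined macros to arch names; 64-bit entries are
# checked first. This table is an illustration, not an exhaustive list.
MACRO_TO_ARCH = (
    ('__x86_64__', 'x86_64'),
    ('__i386__', 'x86'),
    ('__ppc64__', 'ppc64'),
    ('__ppc__', 'ppc'),
)


def parse_defines(dump):
    """Parse `#define NAME VALUE` lines into a dict."""
    defines = {}
    for line in dump.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == '#define':
            defines[parts[1]] = parts[2] if len(parts) == 3 else ''
    return defines


def arch_from_defines(defines):
    """Return the first arch whose macro is defined, or None."""
    for macro, arch in MACRO_TO_ARCH:
        if macro in defines:
            return arch
    return None


def compiler_target_arch(cc, flags):
    """Ask the compiler which arch it targets with the given flags."""
    try:
        proc = subprocess.run([cc, *flags, '-dM', '-E', '-x', 'c', '/dev/null'],
                              stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                              universal_newlines=True)
    except OSError:
        return None  # compiler not found or not runnable
    if proc.returncode != 0:
        return None
    return arch_from_defines(parse_defines(proc.stdout))


if __name__ == '__main__':
    print(compiler_target_arch('cc', []))
    print(compiler_target_arch('cc', ['-m32']))
```

This answers "what are we building for" rather than "what kernel is running", at the cost of one compiler invocation.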

kencu commented 4 years ago

Ryan, good fix; that is the exact error I was seeing as well.

Meson uses a "cross file" when building for a different arch than the build system. This may not easily fit into the Apple/MacPorts method of doing cross, or +universal, builds.

ryandesign commented 4 years ago

Even leaving aside the issue of universal builds, just consider the case of Mac OS X 10.5 running on a 64-bit processor. The default compiler build arch is 32-bit. What answer does meson's detect_cpu / detect_cpu_family give? What answer should it give? What answer do projects' meson build files expect them to give? What questions are those functions supposed to be answering?

Does using a "cross file" change the return value of detect_cpu / detect_cpu_family?

I searched the source code of orc and babl. Neither of them mentions these functions, so they must be using some other meson functionality that internally calls those functions.

kencu commented 4 years ago

I believe, from my reading so far, that meson's premise is that it is building for the current machine, and doing its best to automatically detect that, as the common use case.

If you want to build for anything but the current machine, you are meant to use a "cross file" and specify all those parameters yourself if they deviate from the automatically detected parameters.
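For readers who have not seen one, a cross file is a small INI-style text file. The sketch below renders a minimal one for a 32-bit Intel build on macOS; the function name is hypothetical, and the chosen values (clang with -arch i386, cpu i686) are assumptions for illustration rather than a tested configuration.

```python
def render_cross_file(cpu_family, cpu, c_compiler):
    """Render a minimal Meson cross file as text.

    c_compiler is a list such as ['clang', '-arch', 'i386']; Meson accepts
    a list for entries in [binaries].
    """
    compiler = ', '.join(f"'{part}'" for part in c_compiler)
    return '\n'.join([
        '[binaries]',
        f'c = [{compiler}]',
        '',
        '[host_machine]',
        "system = 'darwin'",
        f"cpu_family = '{cpu_family}'",
        f"cpu = '{cpu}'",
        "endian = 'little'",
        '',
    ])


if __name__ == '__main__':
    print(render_cross_file('x86', 'i686', ['clang', '-arch', 'i386']))
```

The resulting file would be passed to meson with the --cross-file option when setting up the build directory.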

ryandesign commented 4 years ago

I believe, from my reading so far, that meson's premise is that it is building

Building what? Something that runs in user space, or kernel space?

for the current machine,

For what architecture? The one the OS-supplied compiler uses by default? If so:

and doing its best to automatically detect that, as the common use case.

How does meson detect that? I don't think it does.

kencu commented 4 years ago

I think you have it sorted out as far as I know it in your comment here https://trac.macports.org/ticket/59864.

kencu commented 4 years ago

This does detect the processor's arch capability properly with Ryan's tweak, and it does fix the build of 64bit software on systems with 32bit kernels.

However this is not yet a full fix. It is quite possible that someone would want to build 32bit on a system that is 64bit capable, and this is commonly done (although it's true that Apple's very latest OS release no longer supports 32bit compiling).

Looking through environment.py, I see that a similar issue arose and was dealt with for the case of building 32bit on 64bit windows, in def detect_windows_arch. In that case, it was decided that requiring a cross-file for this was a poor user experience. The issue was resolved by using several compiler tests to see if 32bit compiling had been requested, and if so, returning 'x86' as the system architecture.

Similar logic might be applied to the macOS situation. Compilers on Apple define __LP64__ if building 64 bit, and that define is the recommended one to use for code path branches for 64bit vs 32bit specific code. If __LP64__ is defined, then 64bit compiling is being requested. There might be other defines, like __i386__ that could be used; I haven't investigated fully.

So we could branch on that, returning trial = x86_64 if __LP64__ is defined, and x86 if not. Alternatively, we could combine this with the processor test above and return x86_64 only if the processor supports it and 64bit compiling is requested.
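The combined variant amounts to a two-input decision. A minimal sketch, where darwin_x86_trial is a made-up name and both inputs are assumed to have been determined elsewhere (for example from the compiler's predefined macros and from sysctl):

```python
def darwin_x86_trial(lp64_requested, cpu_64bit_capable):
    """Return 'x86_64' only when 64-bit is both requested and possible."""
    if lp64_requested and cpu_64bit_capable:
        return 'x86_64'
    return 'x86'


if __name__ == '__main__':
    for lp64 in (False, True):
        for capable in (False, True):
            print(f'__LP64__={lp64!s:5} capable={capable!s:5} ->',
                  darwin_x86_trial(lp64, capable))
```

Whether the "requested but not capable" case should quietly fall back to x86 or be reported as an error is left open here.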

A related issue that I am thinking through is the Apple compiler's multi-arch capability. Both clang and several versions of gcc on Apple can accept multiple -arch flags, and if there is more than one architecture build being requested, the compiler compiles once for each architecture and then melds them together into a multi-arch binary with "lipo". I am not sure how that multi-arch building might work with this logic -- in MacPorts, we have a way around this using separate arch builds that we lipo together in the destrooting phase, and this might be the only option for meson to use.
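The separate-builds-then-lipo approach can be outlined as command lines. This sketch only constructs the argument lists and runs nothing; the function names and file naming scheme are hypothetical, and it is not how MacPorts itself is implemented.

```python
def per_arch_build_commands(cc, source, output, archs):
    """One compiler invocation per arch, each producing a thin binary."""
    return {
        arch: [cc, '-arch', arch, source, '-o', f'{output}.{arch}']
        for arch in archs
    }


def lipo_create_command(thin_binaries, output):
    """Merge thin binaries into one multi-arch binary with lipo."""
    return ['lipo', '-create', *thin_binaries, '-output', output]


if __name__ == '__main__':
    archs = ['i386', 'x86_64']
    builds = per_arch_build_commands('clang', 'hello.c', 'hello', archs)
    for arch in archs:
        print(' '.join(builds[arch]))
    thin = [f'hello.{arch}' for arch in archs]
    print(' '.join(lipo_create_command(thin, 'hello')))
```

With a configure-time build system, each per-arch step would be a full configure and build in its own directory rather than a single compiler call.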

barracuda156 commented 2 years ago

Then Apple switched to 64-bit Intel Core 2 processors, but continued to boot them with a 32-bit kernel. Here, uname -m returns i386 to denote the kernel architecture, though they are happy to run 64-bit user programs. It is these systems on which meson misbehaves.

@ryandesign @kencu Is it a similar case to the following? https://github.com/mesonbuild/meson/pull/10442

ryandesign commented 2 years ago

@barracuda156 Sorry I don't have any advice for you about your issue. My understanding of meson has not improved since filing this issue and the questions I asked here about the intended operation of meson were never answered.

barracuda156 commented 1 year ago

@ryandesign Indeed, two years and issue still pending.

eli-schwartz commented 1 year ago

I don't know much about how macOS works, but I do know that mesonbuild/environment.py is generally speaking responsible for detecting the CPU, and we explicitly handle several quirks like this, including the case of x86_64 when a/the compiler has defined __i386__.

The point here becomes, that we can tell from the detected compiler that this manifestly is a 32-bit machine, we can tell because it's a native build using a 32-bit compiler and that's a perfectly reasonable definition of "is a 32-bit machine" even if the kernel itself is 64-bit.

The cpu/cpu_family is supposed to reflect the detected facts of the build, not dictate what the build shall be.

We even acknowledge this with the code comment: "Add more quirks here as bugs are reported."

On Mac OS X 10.5 and earlier, the default compiler arch was 32-bit, even on 64-bit systems; a user could supply a flag like -arch x86_64 or -arch ppc64 or -m64 to switch to 64-bit building. Conversely, on Mac OS X 10.6 and later, the default compiler arch was x86_64 on 64-bit systems, and a user could supply a flag like -arch i386 or -arch ppc or -m32 to switch to 32-bit building. A user (or a package manager like MacPorts) may be supplying such flags when they run meson. If this function should answer the question of what arch we are building for, then the only way I know of to determine that is to actually use the compiler to compile something with the user-supplied flags and see what arch it ends up being.

This sounds not dissimilar to using CC='gcc -m32' to compile on a 64-bit Linux with multi-arch. Meson detects this quite well for me, as:

Build type: native build

Host machine cpu family: x86
Host machine cpu: i686

kencu commented 1 year ago

When we last looked at this as above, on macOS meson basically just looked at the output of uname -m and set the build to that. That was broken on a bunch of systems that built 64bit on 32bit kernels, so MacPorts hacked in a test to see if the machine could be one of those, and forced the build to 64bit if so.

That actually works quite well. I added crossfiles for all non-native arch builds when building "universal" which MacOS likes to do.

Things in environment.py have evolved somewhat over the past three years. Someone with these machines and interest will need to sit down and sort through it.

AFAIK, very few to nobody on the meson dev team use Macs, and of those, none would have these ancient systems that show the issues, so it's either PRs from interested users or live with it.

kencu commented 1 year ago

The best approach for macOS (as I indicated above in 2019 but never implemented) is probably to first find out the host machine cpu family (three options at present) and then test for __LP64__, which will indicate if you're in the 32bit or 64bit version of that family.

That sounds closer to something that is resilient and has no fancy stuff.
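As a sketch of that two-step idea: the table below assumes the "three options" are x86, ppc and arm (my reading, not stated above) and maps each to Meson's 32-bit and 64-bit cpu_family names. The names DARWIN_FAMILIES and darwin_cpu_family are hypothetical.

```python
# Assumed base families on macOS, each mapped to
# (32-bit cpu_family, 64-bit cpu_family).
DARWIN_FAMILIES = {
    'x86': ('x86', 'x86_64'),
    'ppc': ('ppc', 'ppc64'),
    'arm': ('arm', 'aarch64'),
}


def darwin_cpu_family(base_family, lp64_defined):
    """Pick the 32-bit or 64-bit member of a family from __LP64__."""
    narrow, wide = DARWIN_FAMILIES[base_family]
    return wide if lp64_defined else narrow


if __name__ == '__main__':
    for family in DARWIN_FAMILIES:
        print(family, darwin_cpu_family(family, False),
              darwin_cpu_family(family, True))
```

The __LP64__ input would come from the compiler with the user's flags applied, so a request such as -arch i386 or -m32 is respected.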

The only other macOS issue that then comes up is that meson's configure-time arch detection and code branching (rather than using build-time arch detection and code branching in the source files) means that MacOS's ability to build universal binaries in one go (by using multiple arch flags) will pretty much never work properly with meson.

Therefore workarounds like this one I assisted with here:

https://github.com/XQuartz/XQuartz/blob/57c5d8aa4df91483ada70f98bf2e3a54a0b07cf5/compile.sh#L379

will always be needed.

eli-schwartz commented 1 year ago

The only other macOS issue that then comes up is that meson's configure-time arch detection and code branching (rather than using build-time arch detection and code branching in the source files) means that MacOS's ability to build universal binaries in one go (by using multiple arch flags) will pretty much never work properly with meson.

I'm not a macOS person, and maybe that's why I don't even understand what this means.

I thought universal binaries was a thing where you build two binaries, then crudely glue them together so that all users install software that's twice as big instead of choosing which download is correct for your platform. AFAICT nothing about my impression is directly contradicted by what you said, but there seems to be some sort of implication that instead, projects themselves can edit their source files with some sort of macOS-specific magic to branch and compile some of it under one arch, and some of it under another arch? (Not entirely clear to me why this helps.)

Nevertheless, why doesn't this work with Meson, at least assuming the software sources are written to work that way? From Meson's side of things, it should really only care that the compiler you defined can compile valid binaries. I guess stuff like cc.find_library() could theoretically be edge cases, but at least that much probably works, with handholding -- we run mesonlib.darwin_get_object_archs() on the file to check that it's valid for the currently detected machine, which I suppose is one of the multiple you're compiling for? Maybe you would sometimes find the wrong ones, so the build fails at runtime?

If Meson just needs to be taught how to detect stuff like this and not fail to configure, then there's no reason it would "pretty much never [be able to] work properly", it just takes someone who is invested in the problem and can put in that effort. I do admit that that is a non-trivial investment to learn how some fairly complicated Meson internals work. :(

Anyway, all of Meson's own configure-time detection of anything has relevance to whether Meson detects a compiler, finds that libraries or other dependencies are available, and configures successfully... but it has no bearing on the project's own source code branching, other than that projects are permitted to check host_machine.cpu() and script their own logic. If it is possible for any build system to get it right, then it should be possible (with enough work) to make Meson get it right.

kencu commented 1 year ago

When meson detects it is configuring on an x86_64 system, for example, there is a lot of software that will branch on that in meson.build to include x86_64-specific assembly files, or options, etc. And as that is more prevalent, I see more and more software doing that. That is "configure-time" branching.

In other builds, for example ffmpeg, the assembly files will test to see what the processor is, using some defined macro for example, and will either include this or that bit of code depending on the macro. That is "build time" branching.

Apple software tries as much as possible to be processor agnostic... so that one downloaded binary will work on the Apple system you have, whether that be Monterey-Intel, or Monterey-arm64. The compilers are familiar with this idea, going back to forever ago -- 2000 or so, at least. Now I didn't decide that users might not know what processor they have -- but this is what Apple does, and the whole software community embraces it.

So for a macOS build, you would do something like: CC="clang -arch x86_64 -arch arm64"

The compiler takes that, does some magic, and your resulting binary would work on all such systems natively. That goes to the App Store, or into a DMG, marked "MySoftware.app" and users will get a good experience. Yes, there is some bloat.

So when meson tests for the build arch, and then the meson.build includes specific files based on the build arch like it often does -- that completely breaks the plan. OTOH, when software branches in the source files based on macro tests, then 'multi-arch' one-pass building works fine.

I see a lot of software including different files based on the detect arch in their meson.build files... more and more.

jpakkane commented 1 year ago

So for a macOS build, you would do something like: CC="clang -arch x86_64 -arch arm64"

This is typical Apple behaviour of "works if you only care about a 100% pure Xcode setup and fails on every cross platform project in the world so sucks to be you, don't support other platforms". Most projects require some sort of a configuration header (like config.h). The contents of said file would be different for an x86_64 and arm64. That can not work, you can't have the same file name point to different physical files without editing said header (sometimes heavily).

FWICT Apple does provide a tool that can take two DMG files built for different platforms and merge them into one so the end result is the same as building with that dual setup.

eli-schwartz commented 1 year ago

So it sounds like this isn't actually a problem with Meson, but rather a problem with projects that choose to do this, and Meson simply offers utility functions that can be used to implement that project choice?

If those projects used autotools, they would do exactly the same checks, write out the same config.h or define the same macros, and have the same problem.

But other projects could just use meson and not do CPU checks in meson.build, and this macOS thing would just work?

In other words, it's a culture issue, not a technical one, although Meson does communicate to a certain extent the claim that the "checks in meson.build" approach is reasonable.

kencu commented 1 year ago

Autotools projects don't generally branch on the build arch in configure. Neither do cmake projects, for the most part. (There are examples of some that do, of course, for each.)

Yes, it's a culture thing, agreed.

I was just pointing out that this (multi arch one-go building) will likely never work right with meson, primarily because a lot of projects look at the build arch in meson.build and configure themselves differently there.

We can (and do) work around it with manually doing separate builds and using a system tool to merge them.

eli-schwartz commented 1 year ago

autotools projects don't generally branch on the build arch in configure. Neither do cmake projects, for the most part.

"Neither do meson projects, for the most part."

Sure, some projects branch on this, but some of those projects were ported from autotools to meson, and... they used to do the same before they ported.

Projects that have different asm source code files on different build arches are also reasonably common... well, for projects that use asm at all, it's a likely need. That is a... pretty darn common use of configure time branching. And they certainly do that with cmake and autotools as well.

I was just pointing out that this (multi arch one-go building) will likely never work with meson.

And your wording was confusing. It's still confusing, but I think I now understand what you meant to say.

You meant to say "will likely never work with common meson projects".

kencu commented 1 year ago

re: multiple arch flags, cf https://github.com/mesonbuild/meson/issues/8206

rjdbcm commented 1 year ago

I have been using this as a workaround for early 686 processors. Note: __LP64__ is a meson bool and not the actual compiler-defined macro.

cc = meson.get_compiler('c')
# Always define the variable: true when 'long' is 8 bytes (LP64), false otherwise.
__LP64__ = cc.sizeof('long') == 8

mesonbuild / meson

Misdetection of early 64-bit Intel Macs as 32-bit #6187