Xpra-org / gtk-osx-build

Build setup to help building the Mac OS X port of GTK+
http://gtk-osx.sourceforge.net/

investigate building Python `./configure --with-system-ffi` #27

Open totaam opened 1 year ago

totaam commented 1 year ago

Details in https://github.com/jralls/gtk-osx-build/pull/81#issuecomment-1451364002

totaam commented 1 year ago

Now hitting this issue on amd64:

*** Building python3-py2app *** [60/138]
python3 setup.py build --build-base /Users/macos/.cache/jhbuild/build/py2app-0.28.5
/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/__init__.py:84: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
!!

        ********************************************************************************
        Requirements should be satisfied by a PEP 517 installer.
        If you are using pip, you can try `pip install --use-pep517`.
        ********************************************************************************

!!
  dist.fetch_build_eggs(dist.setup_requires)
WARNING: The wheel package is not available.
Traceback (most recent call last):
  File "/Users/macos/gtk/source/py2app-0.28.5/setup.py", line 284, in <module>
    setup(
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/__init__.py", line 107, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 147, in setup
    _setup_distribution = dist = klass(attrs)
                                 ^^^^^^^^^^^^
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/dist.py", line 486, in __init__
    _Distribution.__init__(
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 283, in __init__
    self.finalize_options()
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/dist.py", line 924, in finalize_options
    for ep in sorted(loaded, key=by_order):
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/setuptools/dist.py", line 923, in <lambda>
    loaded = map(lambda e: e.load(), filtered)
                           ^^^^^^^^
  File "/Users/macos/gtk/inst/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/macos/gtk/inst/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/Users/macos/gtk/source/py2app-0.28.5/py2app/build_app.py", line 23, in <module>
    import macholib.dyld
  File "/Users/macos/gtk/inst/lib/python3.11/site-packages/macholib/dyld.py", line 5, in <module>
    import ctypes
  File "/Users/macos/gtk/inst/lib/python3.11/ctypes/__init__.py", line 8, in <module>
    from _ctypes import Union, Structure, Array
ImportError: dlopen(/Users/macos/gtk/inst/lib/python3.11/lib-dynload/_ctypes.cpython-311-darwin.so, 2): Library not loaded: /Users/macos/gtk/inst/lib/libffi.7.dylib
  Referenced from: /Users/macos/gtk/inst/lib/python3.11/lib-dynload/_ctypes.cpython-311-darwin.so
  Reason: image not found

libffi.7.dylib does not exist in $JHBUILD_PREFIX.

$ otool -L /usr/lib/libffi.dylib 
/usr/lib/libffi.dylib:
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)

(not helpful!)
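
For comparing several binaries at once, the `otool -L` output can be reduced to just the libffi entries. This is a minimal sketch that parses the text pasted above; the helper names `parse_otool_deps` and `ffi_deps` are made up for illustration and are not part of any tool mentioned here.

```python
import re

# Sample text copied from the `otool -L /usr/lib/libffi.dylib` output above.
OTOOL_OUTPUT = """\
/usr/lib/libffi.dylib:
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 1.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
"""

# One dependency line: install name, then the two version numbers in parentheses.
DEP_RE = re.compile(
    r"^\s+(?P<path>\S+) \(compatibility version (?P<compat>[\d.]+), "
    r"current version (?P<current>[\d.]+)\)$"
)


def parse_otool_deps(text):
    """Return (install_name, compatibility_version, current_version) tuples."""
    deps = []
    for line in text.splitlines():
        match = DEP_RE.match(line)
        if match:
            deps.append((match["path"], match["compat"], match["current"]))
    return deps


def ffi_deps(text):
    """Keep only the entries whose file name starts with 'libffi'."""
    return [d for d in parse_otool_deps(text)
            if d[0].rsplit("/", 1)[-1].startswith("libffi")]


print(ffi_deps(OTOOL_OUTPUT))
```

The header line (the file being inspected) is skipped because it is not indented; only the indented dependency lines are matched.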

totaam commented 1 year ago

Reading the cffi source:

/* Define the following macro ONLY if you trust libffi's version of
 * ffi_closure_alloc() more than the code in malloc_closure.h.
 * IMPORTANT: DO NOT ENABLE THIS ON LINUX, unless you understand exactly
 * why I recommend against it and decide that you trust it more than my
 * analysis below.
 *
 * There are two versions of this code: one inside libffi itself, and
 * one inside malloc_closure.h here.  Both should be fine as long as the
 * Linux distribution does _not_ enable extra security features.  If it
 * does, then the code in malloc_closure.h will cleanly crash because
 * there is no reasonable way to obtain a read-write-execute memory
 * page.  On the other hand, the code in libffi will appear to
 * work---but will actually randomly crash after a fork() if the child
 * does not immediately call exec().  This second crash is of the kind
 * that can be turned into an attack vector by a motivated attacker.
 * So, _enabling_ extra security features _opens_ an attack vector.
 * That sounds like a horribly bad idea to me, and is the reason for why
 * I prefer CFFI crashing cleanly.
 *
 * Currently, we use libffi's ffi_closure_alloc() on NetBSD.  It is
 * known that on the NetBSD kernel, a different strategy is used which
 * should not be open to the fork() bug.
 *
 * This is also used on macOS, provided we are executing on macOS 10.15 or
 * above.  It's a mess because it needs runtime checks in that case.
 */

And ffi_closure_alloc is exactly where we are seeing the crash, from the backtrace in #22:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   _objc.so                        0x00000001106670eb PyObjC_ffi_closure_alloc + 331
1   _objc.so                        0x00000001106a2388 PyObjC_RegisterStructType + 1656
...

totaam commented 1 year ago

Instead of fighting too many losing battles with opaque toolchains, I raised the minimum requirements to MacOS 12: 7dd24486ea880fa942033430723c8c954e379c92. This is also documented here: https://github.com/Xpra-org/xpra/wiki/Platforms

That seems to work.

cpatulea commented 10 months ago

With 5.0.4 on a M1 I'm still unable to run Xpra:

$ brew install xpra
Running `brew update --auto-update`...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Casks
creality-print              heynote                     keyboard-cowboy             lw-scanner                  roam

You have 37 outdated formulae and 1 outdated cask installed.

==> Upgrading 1 outdated package:
xpra 5.0.3,0 -> 5.0.4,1
==> Upgrading xpra
==> Downloading https://www.xpra.org/dists/osx/arm64/Xpra-arm64-5.0.4-r1.pkg
####################################################################################################################################### 100.0%
==> Uninstalling packages with sudo; the password may be necessary:
org.xpra.pkg
Password:
==> Removing files:
/Applications/Xpra.app
/usr/local/bin/Xpra*
==> Running installer for xpra with sudo; the password may be necessary.
installer: Package name is Xpra 5.0.4
installer: Installing at base path /
installer: The install was successful.
==> Purging files for version 5.0.3,0 of Cask xpra
🍺  xpra was successfully upgraded!
~$ xpra
grep: /usr/share/locale/locale.alias: No such file or directory
xpra main error:
Traceback (most recent call last):
  File "/Applications/Xpra.app/Contents/Resources/lib/python/xpra/scripts/main.py", line 121, in main
    return run_mode(script_file, cmdline, err, options, args, mode, defaults)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/Xpra.app/Contents/Resources/lib/python/xpra/scripts/main.py", line 455, in run_mode
    return do_run_mode(script_file, cmdline, error_cb, options, args, mode, defaults)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Applications/Xpra.app/Contents/Resources/lib/python/xpra/scripts/main.py", line 599, in do_run_mode
    return gui.main(cmdline)
           ^^^^^^^^^^^^^^^^^
  File "/Applications/Xpra.app/Contents/Resources/lib/python/xpra/gtk_common/gui.py", line 263, in main
    from xpra.platform.darwin.gui import wait_for_open_handlers
  File "/Applications/Xpra.app/Contents/Resources/lib/python/xpra/platform/darwin/gui.py", line 15, in <module>
    import objc                         #@UnresolvedImport
    ^^^^^^^^^^^
  File "objc/__init__.pyc", line 6, in <module>
  File "objc/_objc.pyc", line 14, in <module>
  File "objc/_objc.pyc", line 10, in __load
  File "imp.pyc", line 343, in load_dynamic
ImportError: dlopen(/Applications/Xpra.app/Contents/Resources/lib/python/lib-dynload/objc/_objc.so, 0x0002): symbol not found in flat namespace '_ffi_find_closure_for_code_np'

Is there something I'm doing wrong / something else I could try?

totaam commented 10 months ago

@cpatulea someone needs to figure out what to do with libffi / python / ctypes / pyobjc. I've already spent too much time on this and I don't have the hardware to test.

totaam commented 9 months ago

My guess is that we're hitting a variation of https://bugs.python.org/issue44556#msg406626. One of the libraries (pyobjc?) ends up picking up the wrong headers.

See also:

cpatulea commented 9 months ago

Just some notes as I'm looking into this issue. If it's too much noise let me know.

Trying to reproduce by just loading objc module:

/Applications/Xpra.app/Contents$ ./Resources/bin/Python
Python 3.11.6 (main, Oct  3 2023, 10:59:38) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import objc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "objc/__init__.pyc", line 6, in <module>
  File "objc/_objc.pyc", line 14, in <module>
  File "objc/_objc.pyc", line 10, in __load
  File "imp.pyc", line 343, in load_dynamic
ImportError: dlopen(/Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/objc/_objc.so, 0x0002): Library not loaded: @executable_path/../Frameworks/libffi.8.dylib
  Referenced from: <FBEF7DE6-D18A-39E2-94D7-1A6AEC9D6221> /Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/objc/_objc.so
  Reason: tried: '/Applications/Xpra.app/Contents/Resources/Frameworks/libffi.8.dylib' (no such file), '/usr/local/lib/libffi.8.dylib' (no such file), '/usr/lib/libffi.8.dylib' (no such file, not in dyld cache)

libffi does seem to be packaged:

/Applications/Xpra.app/Contents$ find . -name '*libffi*'
./Resources/lib/libffi.8.dylib

so I think ./MacOS/PythonExecWrapper is responsible for setting DYLD_LIBRARY_PATH so that the packaged libffi is used. And indeed, this reproduces the original error message:

$ DYLD_LIBRARY_PATH=./Resources/lib ./Resources/bin/Python -m objc
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "objc/__init__.pyc", line 6, in <module>
  File "objc/_objc.pyc", line 14, in <module>
  File "objc/_objc.pyc", line 10, in __load
  File "imp.pyc", line 343, in load_dynamic
ImportError: dlopen(/Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/objc/_objc.so, 0x0002): symbol not found in flat namespace '_ffi_find_closure_for_code_np'

Here's where pyobjc references this symbol: https://github.com/ronaldoussoren/pyobjc/blob/0be0e969ce29b6c08cd68e709593d1605a8cd937/pyobjc-core/Modules/objc/libffi_extra.m#L79

Indeed, packaged libffi does not have the symbol:

$ nm -g ./Resources/lib/libffi.8.dylib | grep -c find_closure
0
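
The same check can be scripted when several libraries need comparing. This is a rough sketch that classifies `nm -g` lines into defined and undefined symbols; the sample input is illustrative only (the addresses are invented, not taken from a real libffi), and `defined_symbols` is a hypothetical helper name.

```python
def defined_symbols(nm_output):
    """Return the set of symbols that `nm -g` reports as defined (type is not 'U')."""
    defined = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three fields (address, type, name);
        # undefined ones have only two (type 'U', name).
        if len(parts) == 3 and parts[1] != "U":
            defined.add(parts[2])
    return defined


# Illustrative input only: the addresses are made up.
SAMPLE = """\
0000000000001a2c T _ffi_call
0000000000002f10 T _ffi_closure_alloc
                 U _malloc
"""

symbols = defined_symbols(SAMPLE)
print("_ffi_closure_alloc" in symbols)
print("_ffi_find_closure_for_code_np" in symbols)
```

A library that needs a symbol lists it with type `U`; only a library that lists it with another type letter can satisfy that reference at load time.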

Here's the Apple libffi which has the symbol: https://github.com/apple-oss-distributions/libffi/blob/fbc0bca21c2649677d64bfaa19acdf91d1756dec/darwin/include/ffi.h#L396

[edit] The "system" libffi is not a real file on the filesystem (/usr/lib/libffi.dylib) but loaded from the "built-in dynamic linker cache" (https://ladydebug.com/blog/2021/04/16/big-sur-and-built-in-dynamic-linker-cache/).

CommandLineTools has ffi.h, which does have the symbol:

$ find / -xdev -name ffi.h
...
/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/ffi/ffi.h

$ grep ffi_find_closure_for_code_np /Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk/usr/include/ffi/ffi.h
ffi_closure * ffi_find_closure_for_code_np(void *code);

/Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk$ ag -u ffi_find_closure_for_code_np
usr/include/ffi/ffi.h
396:ffi_closure * ffi_find_closure_for_code_np(void *code);

usr/lib/libffi.tbd
12:    symbols:         [ _ffi_call, _ffi_closure_alloc, _ffi_closure_free, _ffi_find_closure_for_code_np,

I think Homebrew libffi does not have the needed symbol:

$ brew install libffi
...
==> Summary
🍺  /opt/homebrew/Cellar/libffi/3.4.4: 17 files, 724.8KB

$ nm -g /opt/homebrew/opt/libffi/lib/libffi.8.dylib | grep ffi_find_closure_for_code_np

Here's an issue describing a very similar problem (just a different symbol), which suggests a workaround by setting some ldflags: https://github.com/ffi/ffi/issues/946#issuecomment-1241374279

I might try to build MacOS binaries (https://github.com/Xpra-org/xpra/blob/master/docs/Build/MacOS.md) to see if I can reproduce the issue with my own build.

gtk-osx-setup.sh worked, fetched jhbuildrc-custom, but jhbuild update is failing because www.cairographics.org is down (!?):

*** Checking out pixman *** [15/137]
curl --continue-at - -L https://www.cairographics.org/releases/pixman-0.42.2.tar.gz -o /Users/catalinp/gtk/source/pkgs/pixman-0.42.2.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to www.cairographics.org port 443 after 72 ms: Couldn't connect to server
*** Error during phase checkout of pixman: ########## Error running curl --continue-at - -L https://www.cairographics.org/releases/pixman-0.42.2.tar.gz -o /Users/catalinp/gtk/source/pkgs/pixman-0.42.2.tar.gz *** [15/137]

Tried to just skip that, now hitting another problem with libfreetype being built for x86_64:

*** Building freetype-no-harfbuzz *** [16/137]
ninja
[43/44] Linking C shared library libfreetype.6.20.1.dylib
FAILED: libfreetype.6.20.1.dylib
: && /Library/Developer/CommandLineTools/usr/bin/gcc -arch x86_64...
ld: warning: ignoring file '/opt/homebrew/Cellar/libpng/1.6.40/lib/libpng16.16.dylib': found architecture 'arm64', required architecture 'x86_64'
ld: Undefined symbols:
  _png_create_info_struct, referenced from:
      _Load_SBit_Png in sfnt.c.o
...
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Working on setting up a GitHub workflow for the Mac OS build, to make it more reproducible and easier to quickly test changes. It's getting stuck on the same issue as before (www.cairographics.org down so can't fetch pixman): https://github.com/cpatulea/gtk-osx-build/actions/runs/7391404577/job/20108152043

cpatulea commented 9 months ago

Trying to install pyobjc using pip, as recommended by the documentation (https://pyobjc.readthedocs.io/en/latest/install.html#installation-using-pip):

$ pip3 install pyobjc
...
$ which python3
/opt/homebrew/bin/python3
$ python3
Python 3.11.5 (main, Aug 24 2023, 15:09:45) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import objc

$ lsof -n -p 59103 |grep objc
Python  59103 catalinp  txt    REG               1,15  2113288            11879288 /opt/homebrew/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so

$ nm -g /opt/homebrew/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so|grep _ffi_find_closure_for_code_np
                 U _ffi_find_closure_for_code_np

$ otool -L /opt/homebrew/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so
...
/opt/homebrew/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so (architecture arm64):
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 32.0.0)

$ DYLD_PRINT_LIBRARIES=1 python3 -m objc
...
dyld[59441]: <D44889E1-B36E-39D3-9347-81FB797821D2> /usr/lib/libffi.dylib

/usr/lib/libffi.dylib is not a real file on the filesystem, it comes from the "built-in dynamic linker cache" (https://ladydebug.com/blog/2021/04/16/big-sur-and-built-in-dynamic-linker-cache/).

cpatulea commented 9 months ago

Tried a simple hack to disable the packaged libffi.8.dylib:

/Applications/Xpra.app/Contents/Resources/lib$ sudo mv libffi.8.dylib libffi.8.dylib.disabled

$ DYLD_LIBRARY_PATH=./Resources/lib ./Resources/bin/Python -m objc
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "objc/__init__.pyc", line 6, in <module>
  File "objc/_objc.pyc", line 14, in <module>
  File "objc/_objc.pyc", line 10, in __load
  File "imp.pyc", line 343, in load_dynamic
ImportError: dlopen(/Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/objc/_objc.so, 0x0002): Library not loaded: @executable_path/../Frameworks/libffi.8.dylib
  Referenced from: <FBEF7DE6-D18A-39E2-94D7-1A6AEC9D6221> /Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/objc/_objc.so
  Reason: tried: './Resources/lib/libffi.8.dylib' (no such file), '/Applications/Xpra.app/Contents/Resources/Frameworks/libffi.8.dylib' (no such file), '/usr/local/lib/libffi.8.dylib' (no such file), '/usr/lib/libffi.8.dylib' (no such file, not in dyld cache)

$ otool -L /Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/objc/_objc.so
...
    @executable_path/../Frameworks/libffi.8.dylib (compatibility version 10.0.0, current version 10.2.0)

but I guess due to the versioned dependency, that won't work.
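
To make the lookup order concrete, here is a simplified model that reproduces the "tried:" list from the error above. It is only a sketch inferred from those two error messages, not dyld's real algorithm, and `candidate_paths` is a made-up name.

```python
import posixpath


def candidate_paths(install_name, executable, dyld_library_path=""):
    """Model the lookup order suggested by the 'tried:' lists in the errors above.

    Simplified sketch, not dyld's real algorithm.
    """
    leaf = posixpath.basename(install_name)
    candidates = []
    # 1. Each DYLD_LIBRARY_PATH directory, combined with the library's file name.
    for directory in filter(None, dyld_library_path.split(":")):
        candidates.append(posixpath.join(directory, leaf))
    # 2. The install name itself, with @executable_path expanded.
    exe_dir = posixpath.dirname(executable)
    expanded = install_name.replace("@executable_path", exe_dir, 1)
    candidates.append(posixpath.normpath(expanded))
    # 3. Fallback directories seen in the error message.
    for directory in ("/usr/local/lib", "/usr/lib"):
        candidates.append(posixpath.join(directory, leaf))
    return candidates


paths = candidate_paths(
    "@executable_path/../Frameworks/libffi.8.dylib",
    "/Applications/Xpra.app/Contents/Resources/bin/Python",
    dyld_library_path="./Resources/lib",
)
for path in paths:
    print(path)
```

Every candidate ends in `libffi.8.dylib`, because the file name comes from the install name recorded in `_objc.so`. The system library is named `libffi.dylib`, so it never appears in the list, which fits the failure above.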

cpatulea commented 9 months ago

FYI Homebrew Python also appears to use system libffi:

/opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11$ otool -L lib/python3.11/lib-dynload/_ctypes.cpython-311-darwin.so
lib/python3.11/lib-dynload/_ctypes.cpython-311-darwin.so:
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 30.0.0)

though I don't find --with-system-ffi in the formula.

Note from CPython upstream, --with-system-ffi has been removed (https://github.com/python/cpython/commit/25590eb5dee5176f3ac60916b19450f8198e7ffc), with the default being "always use system provided libffi".

cpatulea commented 9 months ago

Managed to set up a GitHub workflow (Mac OS x86_64) that successfully runs gtk-osx-setup.sh, jhbuild update and jhbuild bootstrap-gtk-osx: https://github.com/cpatulea/gtk-osx-build/actions/runs/7394059836

Todo list for next time:

totaam commented 9 months ago

See also https://github.com/orgs/Xpra-org/discussions/4091#discussioncomment-8003248

@cpatulea do you want to make PRs for some of these useful changes?

cpatulea commented 9 months ago

Yes eventually I intend to submit PR.

Thanks for looking through the changes and suggesting ones to submit.

I plan to keep working until I get a full build, then show you everything and see what you want to pick.

cpatulea commented 9 months ago

OK, so after all the adventures with the Mac OS build in https://github.com/Xpra-org/xpra/issues/4017 I'm back to this, which was to test --with-system-ffi:

Before:

$ jhbuild run 'python3 -m objc'
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "/Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/__init__.py", line 6, in <module>
    from . import _objc
ImportError: dlopen(/Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so, 0x0002): symbol not found in flat namespace '_ffi_find_closure_for_code_np'

After:

$ sh gtk-osx-setup.sh
pyenv: /Users/catalinp/.new_local/share/pyenv/versions/3.11.5 already exists
continue with installation? (y/N) y
...
checking for --with-system-ffi... yes
$ rm -fr ~/.cache/jhbuild/build/pyobjc-core-10.0/
$ jhbuild buildone -f python3-pyobjc-core
$ otool -L /Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so
/Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so:
...
  /Users/catalinp/gtk/inst/lib/libffi.8.dylib (compatibility version 10.0.0, current version 10.2.0)
$ jhbuild run 'python3 -m objc'
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "/Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/__init__.py", line 6, in <module>
    from . import _objc
ImportError: dlopen(/Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so, 0x0002): symbol not found in flat namespace '_ffi_find_closure_for_code_np'

So I think Python's configure --with-system-ffi is probably not sufficient on its own. Maybe the libffi module still gets picked up first by all the other modules (e.g. pyobjc)?

cpatulea commented 9 months ago

I nuked the libffi in gtk install root:

$ rm ~/gtk/inst/lib/libffi.*

$ rm -fr ~/.cache/jhbuild/build/pyobjc-core-10.0/
$ jhbuild buildone -f python3-pyobjc-core
...
$ otool -L /Users/catalinp/gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so
...
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 32.0.0)
$ jhbuild run 'python3 -m objc'
/Users/catalinp/gtk/inst/bin/python3: No module named objc.__main__; 'objc' is a package and cannot be directly executed

and now we are good!

That "No module named" message is a good result: it means the package itself was imported successfully and only lacks a `__main__` module.
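
The difference between the two outcomes of `python3 -m <package>` can be reproduced without pyobjc. This is a small sketch using a throwaway package called `demo_pkg` (a hypothetical name); the `ImportError` it raises is only a stand-in for the dlopen failure seen earlier.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_as_module(package_init_source):
    """Create a throwaway package named 'demo_pkg' and run `python -m demo_pkg`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(package_init_source)
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp,
            env=dict(os.environ, PYTHONPATH=tmp),  # make sure the package is found
            capture_output=True,
            text=True,
        )
        return result.stderr


# Package that imports cleanly but has no __main__.py (like objc after the fix).
ok_stderr = run_as_module("VALUE = 1\n")
# Package whose import fails (stand-in for the dlopen failure).
broken_stderr = run_as_module("raise ImportError('symbol not found')\n")

print(ok_stderr.strip().splitlines()[-1])
print(broken_stderr.strip().splitlines()[-1])
```

In the first case the interpreter only complains that the package cannot be executed directly; in the second it prints a traceback ending in the `ImportError`, as in the "Before" output.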

totaam commented 9 months ago

OK, when I looked at this again I thought that maybe we really wanted to ship our own libffi since we can't control which version is going to be present on the system we run on (and the compatibility issues that this seems to create).

So instead of using --with-system-ffi, I was going to try to force --without-system-ffi and then try to build all the libraries with my system ffi headers nuked or at least hidden.

Can we really build something stable without shipping a known-good version of libffi to link against?

cpatulea commented 9 months ago

I see your point. I don't know enough about libffi stability to have an opinion on that.

Note that Python upstream is removing --with-system-ffi:

Removed the ``--with-system-ffi`` ``configure`` option; ``libffi`` must now
always be supplied by the system on all non-Windows platforms.  The option
has had no effect on non-Darwin platforms for several releases, and in 3.11
only had the non-obvious effect of invoking ``pkg-config`` to find
``libffi`` and never setting ``-DUSING_APPLE_OS_LIBFFI``.  Now on Darwin
platforms ``configure`` will first check for the OS ``libffi`` and then fall
back to the same processing as other platforms if it is not found.

so perhaps packaging one might stop being an option (or might become difficult to support a different configuration than upstream).

For reference, Homebrew pyobjc also links against system libffi:

$ otool -L /opt/homebrew/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so
...
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 32.0.0)

dgarnier commented 8 months ago

I'm not sure if this is the right issue to describe this, but while looking into why it was crashing for me on an M2 MBA with MacOS 14.2, I decided that, since glib no longer uses a fallback for libintl in versions after 2.76, I would disable the glib-2.78.1 patch. This seems to work for me and I'm no longer getting a crash on launch.

cpatulea commented 8 months ago

https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/ !!

totaam commented 8 months ago

Cool, now we "just" need to make it work reliably!

totaam commented 8 months ago

I'm going to try this: https://github.com/libffi/libffi/releases/tag/v3.4.6. Lots of aarch64 fixes since 3.4.4, and 3.4.5 says: "Fixes for AIX, loongson, MIPS, power, sparc64, and x86 Darwin."

cpatulea commented 7 months ago

I had done some builds with --with-system-ffi and on the surface it seemed like the build worked. But now looking more closely, there's a problem during the building of Python:

https://github.com/cpatulea/gtk-osx-build/actions/runs/8273499988/job/22637378808

2024-03-14T00:28:23.5995850Z clang -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I_ctypes/darwin -I/Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/usr/include/ffi -I./Include -I. -I/opt/homebrew/opt/readline/include -I/opt/homebrew/include -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -I/private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240314002650.3136/Python-3.11.7/Include -I/private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240314002650.3136/Python-3.11.7 -c /private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240314002650.3136/Python-3.11.7/Modules/_ctypes/callbacks.c -o build/temp.macosx-14.2-arm64-3.11/private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240314002650.3136/Python-3.11.7/Modules/_ctypes/callbacks.o -DUSING_MALLOC_CLOSURE_DOT_C=1 -DMACOSX -DHAVE_FFI_PREP_CIF_VAR=1 -DHAVE_FFI_PREP_CLOSURE_LOC=1 -DHAVE_FFI_CLOSURE_ALLOC=1
2024-03-14T00:28:23.7126080Z /private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240314002650.3136/Python-3.11.7/Modules/_ctypes/callbacks.c:438:18: error: call to undeclared function 'ffi_prep_closure'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
2024-03-14T00:28:23.7172330Z         result = ffi_prep_closure(p->pcl_write, &p->cif, closure_fcn, p);
2024-03-14T00:28:23.7173100Z                  ^
2024-03-14T00:28:23.7173690Z 1 error generated.
2024-03-14T00:28:24.1751690Z 
2024-03-14T00:28:24.1753070Z Failed to build these modules:
2024-03-14T00:28:24.1780620Z _ctypes      

which later causes multiple problems (pipenv install fails because typogrify won't install properly; and because of that, jhbuild of librsvg fails).

So there are still some issues with --with-system-ffi to be ironed out.

cpatulea commented 7 months ago

Discussion at pyenv: https://github.com/pyenv/pyenv/issues/2137

cpatulea commented 7 months ago

Hmm... at commit https://github.com/cpatulea/gtk-osx-build/commit/55f7d98d16328e9c66ed558df9175691140cb2d1 I'm observing the following:

First build (pyenv) build log:

2024-03-12T20:57:52.0474320Z /Applications/Xcode_15.0.1.app/Contents/Developer/usr/bin/gcc -DNDEBUG -g -fwrapv -O3 -Wall -arch arm64 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -mmacosx-version-min=12 -arch arm64 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -mmacosx-version-min=12 -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden -I/Users/runner/gtk/source/Python-3.11.8/Include/internal -I_ctypes/darwin -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/ffi -I. -IObjects -IPython -I/Users/runner/gtk/inst/include -I/Users/runner/gtk/source/Python-3.11.8/Include -I/Users/runner/.cache/jhbuild/build/Python-3.11.8 -c /Users/runner/gtk/source/Python-3.11.8/Modules/_ctypes/callbacks.c -o build/temp.macosx-12.0-arm64-3.11/Users/runner/gtk/source/Python-3.11.8/Modules/_ctypes/callbacks.o -DUSING_MALLOC_CLOSURE_DOT_C=1 -DMACOSX **-DUSING_APPLE_OS_LIBFFI=1** -DHAVE_FFI_PREP_CIF_VAR=1 -DHAVE_FFI_PREP_CLOSURE_LOC=1 -DHAVE_FFI_CLOSURE_ALLOC=1

second Python build (jhbuild):

2024-03-12T20:34:16.7568590Z clang -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I_ctypes/darwin -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/ffi -I./Include -I. -I/opt/homebrew/opt/readline/include -I/opt/homebrew/include -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -I/private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240312203259.1771/Python-3.11.7/Include -I/private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240312203259.1771/Python-3.11.7 -c /private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240312203259.1771/Python-3.11.7/Modules/_ctypes/callbacks.c -o build/temp.macosx-14.2-arm64-3.11/private/var/folders/1k/qq3pcbf12vb6vyblh81736p40000gn/T/python-build.20240312203259.1771/Python-3.11.7/Modules/_ctypes/callbacks.o -DUSING_MALLOC_CLOSURE_DOT_C=1 -DMACOSX **-DUSING_APPLE_OS_LIBFFI=1** -DHAVE_FFI_PREP_CIF_VAR=1 -DHAVE_FFI_PREP_CLOSURE_LOC=1 -DHAVE_FFI_CLOSURE_ALLOC=1

and check out this snippet of Python setup.py:

https://github.com/python/cpython/blob/fa7a6f23036537567592647d15f043722c7144ad/setup.py#L1382

        if (not sysconfig.get_config_var("LIBFFI_INCLUDEDIR") and MACOS):
            self.use_system_libffi = True
        else:
            self.use_system_libffi = '--with-system-ffi' in sysconfig.get_config_var("CONFIG_ARGS")

System libffi (Apple libffi) is already the default on Mac OS.
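
The branch above is easier to reason about as a pure function. This sketch only mirrors the quoted snippet; the function name and the example include path are hypothetical, and how `LIBFFI_INCLUDEDIR` ends up set in a given build is an open question here.

```python
def use_system_libffi(libffi_includedir, config_args, is_macos):
    """Mirror the branch from the setup.py snippet above as a pure function."""
    if not libffi_includedir and is_macos:
        return True
    return "--with-system-ffi" in config_args


# macOS with no LIBFFI_INCLUDEDIR: system libffi is used regardless of the flag.
print(use_system_libffi("", "", is_macos=True))
# macOS with LIBFFI_INCLUDEDIR set (hypothetical path): the flag decides.
print(use_system_libffi("/some/prefix/include", "", is_macos=True))
print(use_system_libffi("/some/prefix/include", "'--with-system-ffi'", is_macos=True))
```

So on Mac OS the flag only matters once `LIBFFI_INCLUDEDIR` is non-empty; otherwise the first branch wins.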

Compare this to the build where I specifically set --with-system-ffi, the pyenv (first) Python does show "checking for --with-system-ffi... yes", but then the clang command-line lacks -DUSING_APPLE_OS_LIBFFI=1 (and fails). Very confusing.

cpatulea commented 7 months ago

It looks like Python is built with system libffi:

$ otool -L ./gtk/inst/lib/python3.11/lib-dynload/_ctypes.cpython-311-darwin.so
./gtk/inst/lib/python3.11/lib-dynload/_ctypes.cpython-311-darwin.so:
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 32.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

but objc is built with libffi from source (jhbuild):

$ otool -L ./gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so
./gtk/inst/lib/python3.11/site-packages/objc/_objc.cpython-311-darwin.so:
    /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 2048.1.255)
    /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 2048.1.255)
    /Users/runner/gtk/inst/lib/libffi.8.dylib (compatibility version 10.0.0, current version 10.4.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)

That doesn't seem right..

I guess when making the app (Xpra) the rpaths get modified, so it 'looks' like _ctypes depends on the bundled libffi:

$ otool -L /Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/_ctypes.so
/Applications/Xpra.app/Contents/Resources/lib/python3.11/lib-dynload/_ctypes.so:
    @executable_path/../Frameworks/libffi.8.dylib (compatibility version 10.0.0, current version 10.2.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)

but that's not how it was originally built.

totaam commented 7 months ago

@cpatulea Great bug hunting!

I'm not sure I would be able to figure out the install_name_tool shenanigans that misidentified the dylib to link against.

Can we just build everything against the system libffi and not have one at all in $JHBUILD_PREFIX? This may require changes to only two modules:

(that still leaves the compatibility problems between versions... but with MacOS 12+, not as bad?) Surely it's not as simple as not building libffi ourselves? (one can dream) pyobjc should then use the system one?

cpatulea commented 6 months ago

I tried that, glib then fails to find libffi: https://github.com/cpatulea/gtk-osx-build/actions/runs/8412947646/job/23034441279

I'm planning to spend some time trying to figure out how to get glib to pick up system libffi.

cpatulea commented 6 months ago

On my local machine, I needed to manually remove traces of libffi from ~/gtk/inst/, and then it was easy to reproduce:

Run-time dependency libffi found: NO (tried pkgconfig, framework and cmake)
Not looking for a fallback subproject for the dependency libffi because:
Use of fallback dependencies is disabled.

../../../../gtk/source/glib-2.78.1/meson.build:2129:13: ERROR: Dependency 'libffi' is required but not found.

Homebrew glib does use system libffi:

$ otool -L /opt/homebrew/Cellar/glib/2.78.4/lib/libgobject-2.0.dylib
/opt/homebrew/Cellar/glib/2.78.4/lib/libgobject-2.0.dylib:
...
    /usr/lib/libffi.dylib (compatibility version 1.0.0, current version 30.0.0)

I think Apple doesn't provide a libffi.pc but Homebrew packages their own (?):

$ find / -xdev -name libffi.pc 2>/dev/null
...
/opt/homebrew/Library/Homebrew/os/mac/pkgconfig/10.15/libffi.pc
/opt/homebrew/Library/Homebrew/os/mac/pkgconfig/12/libffi.pc
/opt/homebrew/Library/Homebrew/os/mac/pkgconfig/13/libffi.pc
...

But on my machine at least, pkg-config does seem to find it out of the box:

$ pkg-config --cflags libffi
-I/Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/include/ffi

... so why did Meson fail to find it?

Ah, because it uses the jhbuild hermetic pkg-config:

# from meson-log.txt
Determining dependency 'libffi' with pkg-config executable '/Users/catalinp/gtk/inst/bin/pkg-config'
...
$ /Users/catalinp/gtk/inst/bin/pkg-config --cflags libffi
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libffi' found

$ which pkg-config
/opt/homebrew/bin/pkg-config
$ pkg-config --debug --cflags libffi
...
Reading 'libffi' from file '/opt/homebrew/Library/Homebrew/os/mac/pkgconfig/13/libffi.pc'

Looks like we can tell the jhbuild hermetic pkg-config to use Homebrew pc stub files:

$ PKG_CONFIG_PATH=/opt/homebrew/Library/Homebrew/os/mac/pkgconfig/13 /Users/catalinp/gtk/inst/bin/pkg-config --modversion libffi
3.4-rc1

Seems like there is some precedent of using PKG_CONFIG_PATH to point pkg-config to additional places: https://github.com/Xpra-org/gtk-osx-build/blob/80f8b457f5029f0f776689dc3ae0ea8d99beb5b8/jhbuildrc-gtk-osx#L641
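
Since jhbuildrc files are Python, the extra search directory could be added to the environment programmatically. Here is a minimal sketch, assuming a plain dict-like environment; `append_pkg_config_dir` is a hypothetical helper, not something jhbuild provides, and the Homebrew directory is the one found on my machine, so it will differ per macOS version.

```python
import os


def append_pkg_config_dir(env: dict[str, str], extra_dir: str) -> dict[str, str]:
    """Return a copy of env with extra_dir added to PKG_CONFIG_PATH.

    The directory is appended, so entries that are already present keep
    priority, and adding the same directory twice has no effect.
    """
    updated = dict(env)
    current = updated.get("PKG_CONFIG_PATH", "")
    parts = [part for part in current.split(os.pathsep) if part]
    if extra_dir not in parts:
        parts.append(extra_dir)
    updated["PKG_CONFIG_PATH"] = os.pathsep.join(parts)
    return updated


# Assumption: path taken from the `find` output above; adjust per machine.
HOMEBREW_STUBS = "/opt/homebrew/Library/Homebrew/os/mac/pkgconfig/13"
build_env = append_pkg_config_dir(dict(os.environ), HOMEBREW_STUBS)
```

Appending rather than prepending is a design choice: any .pc file already on the path still wins, and the stub directory only fills in what is missing.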

cpatulea commented 6 months ago

Hah, foiled by #42

https://github.com/cpatulea/gtk-osx-build/actions/runs/8502895427/job/23287664257

totaam commented 2 months ago

cffi just got updated to 1.17.0: 3fb8628f11a9dbf0534e0ef25d352b7b7ac39a06

@cpatulea any idea how I would go about hiding the system's libffi from every build made via jhbuild? If I have to (temporarily) hide or remove the header file, I'm ok with that too. I want everything built against the library we're bundling, not the undefined mishmash we currently have.

cpatulea commented 2 months ago

Sorry for the delay. I'm not sure how to hide system libffi.

Given that Homebrew builds with system libffi (https://github.com/Xpra-org/gtk-osx-build/issues/27#issuecomment-1874527225) and that Python has moved to always building with system libffi (https://github.com/Xpra-org/gtk-osx-build/issues/27#issuecomment-1890486692), I was pursuing the opposite direction: to point all jhbuild packages to build with system libffi.

I ran into an obstacle (https://github.com/Xpra-org/gtk-osx-build/issues/27#issuecomment-2029044145) but perhaps with more time, it could be made to work.

totaam commented 1 month ago

> I was pursuing the opposite direction: to point all jhbuild packages to build with system libffi.

I don't understand how that can be a good idea seeing that this ticket is about binary incompatibilities between libffi versions and some of the packages that link to it. (cffi, pyobjc, etc)

It seems that rebuilding everything from scratch (deleting the whole gtk directory - and starting from the bootstrap) breaks less than doing incremental builds - which is a shame. This is an acceptable extra time-consuming constraint when doing releases, I guess.

cpatulea commented 1 month ago

My understanding is the problem is not "package is linked with system libffi". It's "package is compiled with vendored headers and linked with system libffi". Header vs binary mismatch.

I think either of these options should work in theory:

  1. Vendored header + vendored binary
  2. System header + system binary

To choose between 1 or 2, I have been looking at what other major projects do. Since both Python and Homebrew seem to have chosen 2, I figured it would be a reasonable choice for Xpra too.
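
The rule behind the two options can be written down as a small check. This is only a sketch: `FfiBuild` and `mismatched` are hypothetical names, and the entries in the usage example are placeholders, not measurements of real packages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FfiBuild:
    """Where a package got libffi from at compile time and at link time."""

    name: str
    header: str  # "system" or "vendored"
    binary: str  # "system" or "vendored"


def is_consistent(build: FfiBuild) -> bool:
    """Options 1 and 2 both mean: header and binary come from the same place."""
    return build.header == build.binary


def mismatched(builds: list[FfiBuild]) -> list[str]:
    """Names of packages compiled against one libffi and linked to another."""
    return [build.name for build in builds if not is_consistent(build)]


# Placeholder data for illustration only.
example = [
    FfiBuild("pkg-a", header="vendored", binary="vendored"),  # option 1
    FfiBuild("pkg-b", header="system", binary="system"),      # option 2
    FfiBuild("pkg-c", header="vendored", binary="system"),    # the problem case
]
print(mismatched(example))
```

Which of the two consistent options to pick is a separate question; the check only flags the mixed case.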

totaam commented 1 month ago

Makes sense! My bad. I thought the headers (build time - controlled by us) needed to match the runtime binary (provided by the system) as that's what normally happens.