pulsar-edit / pulsar

A Community-led Hyper-Hackable Text Editor
https://pulsar-edit.dev
Other
3.24k stars 137 forks source link

[superstring] Building on newer macOS requires Apple’s version of GNU `libiconv` #1045

Closed savetheclocktower closed 2 months ago

savetheclocktower commented 2 months ago

Thanks in advance for your bug report!

What happened?

I'm going to try to knowledge-dump into this issue, since our superstring repo doesn't have issues enabled:

  1. macOS 13 and newer ships a version of libiconv that has API incompatibilities with what superstring expects.
  2. This appears to affect both Intel and Apple Silicon, but we're encountering it first in Apple Silicon because CircleCI was the first to force us onto macOS 13 as a minimum version.
  3. We thought we could get around this via brew install libiconv, then adding the appropriate paths to LDFLAGS and CPPFLAGS before running npm install
  4. …but this only works on the local machine, since it's dynamically linking to libiconv. The built binary isn't portable because it requires libiconv.dylib to be present at the exact path we specified during the build process.
  5. Indeed, I was able to reproduce this locally by temporarily renaming the directory into which Homebrew had installed libiconv. As soon as the library wasn't present where Pulsar expected it to be, it failed to start properly.
  6. We need to statically link to libiconv — i.e., include it in the build. I think I was able to make this happen by copying the syntax from this binding.gyp file I found via a GitHub code search. When I change my local node_modules/superstring/binding.gyp and run node-gyp rebuild — and then go back to the project root and run yarn build to trigger an electron-rebuild — I end up with a local build of Pulsar that works just fine even if I rename my local libiconv directory.

So at this point, I believe I know the solution. I'd like to do a bit more research on the best way to represent this complexity in the binding.gyp file. I imagine it should be an opt-in thing with a config flag, but the bottom line is that, in order to build superstring on macOS >=13, you'll have to provide the path to a compatible version of libiconv so that we can statically link against it.

Right now I don't know:

My tentative plan to get us unblocked is to have superstring’s binding.gyp check the value of an environment variable. That's probably not the best long-term solution, but it feels like it would work pretty well in CI and would allow us to get Apple Silicon binaries out the soonest.

Pulsar version

(all)

Which OS does this happen on?

🍎 macOS

OS details

=13

Which CPU architecture are you running this on?

None

What steps are needed to reproduce this?

(be on macOS >=13)

Additional Information:

No response

savetheclocktower commented 2 months ago

OK, here's my research so far.

How do we fix this?

Does this replace the existing conditional binding.gyp configuration on macOS?

It certainly seems to. I can comment out that entire section and not have it affect the outcome of my rebuilding of superstring, nor its inclusion into Pulsar.

How can we prove this works in the superstring repo?

Not sure yet.

The build:node task works just fine, but the test:native task gets stuck on another issue:

SOLINK_MODULE(target) Debug/superstring.node
CXX(target) Debug/obj.target/tests/test/native/test-helpers.o
CXX(target) Debug/obj.target/tests/test/native/tests.o
In file included from ../test/native/tests.cc:5:
In file included from ../vendor/catch.hpp:76:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/sstream:273:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/istream:170:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/ostream:172:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__system_error/error_code.h:18:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__system_error/error_category.h:15:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/string:622:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/string_view:1059:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/algorithm:1893:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__algorithm/ranges_sample.h:13:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__algorithm/sample.h:18:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__random/uniform_int_distribution.h:237:5: error: static assertion failed due to requirement '__libcpp_random_is_valid_urng<Catch::RandomNumberGenerator, void>::value':
  static_assert(__libcpp_random_is_valid_urng<_URNG>::value, "");
  ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__algorithm/shuffle.h:154:35: note: in instantiation of function template specialization 'std::uniform_int_distribution<long>::operator()<Catch::RandomNumberGenerator>' requested here
          difference_type __i = __uid(__g, _Pp(0, __d));
                                ^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__algorithm/shuffle.h:166:14: note: in instantiation of function template specialization 'std::__shuffle<std::_ClassicAlgPolicy, std::__wrap_iter<Catch::TestCase *>, std::__wrap_iter<Catch::TestCase *>, Catch::RandomNumberGenerator &>' requested here
(void)std::__shuffle<_ClassicAlgPolicy>(
           ^
../vendor/catch.hpp:6483:18: note: in instantiation of function template specialization 'std::shuffle<std::__wrap_iter<Catch::TestCase *>, Catch::RandomNumberGenerator &>' requested here
          std::shuffle( vector.begin(), vector.end(), rng );
               ^
../vendor/catch.hpp:6501:44: note: in instantiation of function template specialization 'Catch::RandomNumberGenerator::shuffle<std::vector<Catch::TestCase>>' requested here
                  RandomNumberGenerator::shuffle( sorted );
                                         ^
1 error generated.
make: *** [Debug/obj.target/tests/test/native/tests.o] Error 1

I think this is another symptom of a library incompatibility: the test helpers use #include <random>, but it looks like the built-in version of random has some API incompatibilities. Not sure how to fix this one.

How can we know that this fixes the issue in Pulsar?

From the Pulsar repo’s root directory, I was able to

and get Pulsar to run.

I quit the app, renamed my Homebrew libiconv directory, then launched the app again; it worked fine. It was not flummoxed by the new libiconv location.

(I did have to wait until after yarn build before I could rename libiconv.)

My next step would’ve been to use npm link to prove that my local changes to superstring could be adopted in Pulsar just by bumping our fork’s version and using telling Pulsar to use our superstring fork in package.json. Annoyingly, I can’t get npm link superstring to work correctly; here’s the relevant failure deep into the linking process:

npm ERR! code 1
npm ERR! git dep preparation failed
npm ERR! command /Users/andrew/.asdf/installs/nodejs/16.19.0/bin/node /Users/andrew/.asdf/installs/nodejs/16.19.0/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/Users/andrew/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
npm ERR! > @mdn/content@1.0.0 postinstall
npm ERR! > node scripts/update-history.js
npm ERR!
npm ERR! [ 'rev-parse', '--show-toplevel' ]
npm ERR! Error running git rev-parse,--show-toplevel
npm ERR! npm WARN using --force Recommended protections disabled.
npm ERR! npm WARN deprecated npmlog@2.0.4: This package is no longer supported.
npm ERR! npm WARN deprecated inflight@1.0.6: This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.
npm ERR! npm WARN deprecated glob@7.2.3: Glob versions prior to v9 are no longer supported
npm ERR! npm WARN deprecated gauge@1.2.7: This package is no longer supported.
npm ERR! npm WARN deprecated are-we-there-yet@1.1.7: This package is no longer supported.
npm ERR! npm WARN deprecated uuid@3.4.0: Please upgrade  to version 7 or higher.  Older versions may use Math.random() in certain circumstances, which is known to be problematic.  See https://v8.dev/blog/math-random for details.
npm ERR! <Buffer 66 61 74 61 6c 3a 20 6e 6f 74 20 61 20 67 69 74 20 72 65 70 6f 73 69 74 6f 72 79 20 28 6f 72 20 61 6e 79 20 6f 66 20 74 68 65 20 70 61 72 65 6e 74 20 ... 19 more bytes>
npm ERR! file:///Users/andrew/.npm/_cacache/tmp/git-cloneGWgpC5/scripts/utils.js:39
npm ERR!     throw new Error(
npm ERR!           ^
npm ERR!
npm ERR! Error: git command failed: fatal: not a git repository (or any of the parent directories): .git
npm ERR!
npm ERR!     at execGit (file:///Users/andrew/.npm/_cacache/tmp/git-cloneGWgpC5/scripts/utils.js:39:11)
npm ERR!     at getRootDir (file:///Users/andrew/.npm/_cacache/tmp/git-cloneGWgpC5/scripts/utils.js:47:10)
npm ERR!     at execGit (file:///Users/andrew/.npm/_cacache/tmp/git-cloneGWgpC5/scripts/utils.js:20:27)
npm ERR!     at file:///Users/andrew/.npm/_cacache/tmp/git-cloneGWgpC5/scripts/update-history.js:12:16
npm ERR!     at ModuleJob.run (node:internal/modules/esm/module_job:193:25)
npm ERR!     at async Promise.all (index 0)
npm ERR!     at async ESMLoader.import (node:internal/modules/esm/loader:530:24)
npm ERR!     at async loadESM (node:internal/process/esm_loader:91:5)
npm ERR!     at async handleMainPromise (node:internal/modules/run_main:65:12)
npm ERR! npm ERR! code 1
npm ERR! npm ERR! path /Users/andrew/.npm/_cacache/tmp/git-cloneGWgpC5
npm ERR! npm ERR! command failed
npm ERR! npm ERR! command sh -c -- node scripts/update-history.js
npm ERR!
npm ERR! npm ERR! A complete log of this run can be found in:
npm ERR! npm ERR!     /Users/andrew/.npm/_logs/2024-07-01T20_47_48_949Z-debug-0.log

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/andrew/.npm/_logs/2024-07-01T20_44_39_165Z-debug-0.log

This doesn’t seem to be related to superstring at all. I hope it’s not an issue when I try to bump superstring for real, but we’ll find out. I’ll first try it locally by pointing the superstring dependency to a SHA on my remote fork.

How does this affect the “building Pulsar” instructions?

Instructions for building Pulsar on Windows and Linux would be unaffected.

Instructions for building Pulsar on macOS would be changed as follows:

On macOS version 13 and later, the default implementation of libiconv is incompatible with superstring, our core text-editing library. You may either

  • Install Homebrew, then install its version of libiconv: brew install libiconv; or
  • Use another installation method for libiconv, then define an environment variable called SUPERSTRING_LIBICONV_PATH. The value should be the path in which the iconv header files live; it will be passed to your compiler via the standard -I switch.

The Homebrew approach is recommended for ease of use, but the environment variable takes precedence, so it can be used if you truly know what you’re doing. The yarn install and yarn build commands will run a script that tries to infer a libiconv include path using these techniques. If you use the environment variable, make sure that both these commands can “see” it.

Rough plan

savetheclocktower commented 2 months ago

I should also point out that our eventual fix here could be to vendorize libiconv, just like we do for the Windows version. But I don't know much about how that's done.

savetheclocktower commented 2 months ago

Crashing and burning at this point; everything worked great until I tried to move my libiconv directory, at which point things failed. Even a binary I built with yarn dist failed.

I think that I might also need to migrate text-buffer’s superstring dependency to my fork, but for some reason I can't get it working right on a fork of text-buffer. It gets pretty far, but then node-gyp bails with

npm ERR! /bin/sh: script/find-libiconv-include.sh: No such file or directory

because my script directory isn't present. I don't understand why.

I'll probably have to put this away for the day, but I'll try to return to it tomorrow.

savetheclocktower commented 2 months ago

None of what I said above seems to work anymore and I'm not entirely sure why it ever worked in the first place.


There is a method that seems to work locally, though I haven't tried to build binaries with it yet.

What's frustrating about our situation is that we have a copy of the library we want macOS to use. But because it's a dynamic library, it wants to link against it and leave it where it is. We, on the other hand, want it to claim that library for its own and carry it around, since we can't guarantee that the user's machine will have that version.

This should be easy. It's a file, right? We need to reference its headers when we compile superstring — otherwise it won't know what <iconv.h> means — but once it's built, we just need to carry around libiconv.dylib with us. (Actually libiconv.2.dylib, since that's what libiconv.dylib is symlinked to.)

But macOS wants to reuse dynamic libraries aggressively. This was fine when we were willing to use the built-in libiconv.dylib, but is working against us now that we want our own copy. macOS indexes dynamic libraries by an “install name” that, by default, is set to the absolute path of its original/intended installation location. Hence the libiconv.dylib given to us by Homebrew has an install name of /usr/local/opt/homebrew/lib/libiconv.2.dylib.

We've got two copies of superstring.node at different places in our node_modules folder. We've got one as a direct dependency (since we require('superstring') in a couple of places to use its Node API directly) and one as a transitive dependency of text-buffer. (These shouldn't be two different things, but we'll cross that bridge later.) Each of them wants to use libiconv.dylib, but we want them to find our copy — not to go looking at an absolute path on the end user's machine.

For each one, we can run otool -L to see the libraries it references and their install names:

$ otool -L ./node_modules/superstring/build/Release/superstring.node
./superstring.node:
    /usr/local/opt/libiconv/lib/libiconv.2.dylib (compatibility version 9.0.0, current version 9.1.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

The version of libiconv we installed from Homebrew? Suppose we stole its libiconv.2.dylib and placed it in ./node_modules/superstring/vendor/libiconv/lib/. This won't work on its own because no amount of monkeying with binding.gyp (that I've seen so far) would allow us to specify where superstring.node should go looking for this library at runtime.

But we could then run this command:

$ install_name_tool -change /usr/local/opt/libiconv/lib/libiconv.2.dylib @loader_path/../../vendor/libiconv/lib/libiconv.2.dylib ./node_modules/superstring/build/Release/superstring.node

The format here is install_name_tool -change [old] [new] [target].

@loader_path is a magical token that lets us resolve a library relative to the thing that's asking for it — in this case, relative to superstring.node. Now we see:

$ otool -L ./node_modules/superstring/build/Release/superstring.node
./superstring.node:
    @loader_path/../../vendor/libiconv/lib/libiconv.2.dylib (compatibility version 9.0.0, current version 9.1.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

The other copy of superstring.node is deeper and therefore more annoying:

$ install_name_tool -change /usr/local/opt/libiconv/lib/libiconv.2.dylib @loader_path/../../../../../superstring/vendor/libiconv/lib/libiconv.2.dylib ./node_modules/text-buffer/node_modules/superstring/build/Release/superstring.node

(If we update both superstring dependencies to a version that includes a vendorized libiconv.2.dylib, then we could have each one point to “its own” libiconv.2.dylib, but this should also work.)

Once I do all of these things, it works. I can run pulsar --dev with ATOM_DEV_RESOURCE_PATH set and it doesn't go looking in my Homebrew directory for libiconv. I can rename that directory, quit and relaunch Pulsar, and have it all keep working.


Remember: we thought we had fixed this a few nights ago, but then an Apple Silicon user reported that they had encountered this error on first launch of a newly-produced Apple Silicon build:

index.js:83 Error: dlopen(/var/folders/3j/kgxdtpv96wz6wjkt8_m3mzwc0000zw/T/.dev.pulsar-edit.pulsar.OzB36c, 0x0001): Library not loaded: /opt/homebrew/opt/libiconv/lib/libiconv.2.dylib
  Referenced from: <E8389490-8E9A-3BAD-8896-D0AF570DB39D> /private/var/folders/3j/kgxdtpv96wz6wjkt8_m3mzwc0000zw/T/.dev.pulsar-edit.pulsar.OzB36c
  Reason: tried: '/opt/homebrew/opt/libiconv/lib/libiconv.2.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libiconv/lib/libiconv.2.dylib' (no such file), '/opt/homebrew/opt/libiconv/lib/libiconv.2.dylib' (no such file)

The things it “tried” are file paths. But they are also install names. I realized as much when I was screwing up the install_name_tool syntax; if I got the “new” install name wrong, and named it foo instead of what I meant to name it, the error would come back: Reason: tried: 'foo' (no such file). When I used @loader_path as part of the new name, the error message helpfully expanded it for me, helping me realize exactly where it was starting its search on the filesystem.

The good news, theoretically, is that none of the work we did the other night needs to be backed out; we could write a script that fixes the install names for both instances of superstring.node. Or we could possibly do this step as a post-install task on the superstring repo itself; that's probably the “correct” thing to do here, but would take longer to figure out.


This is insane and I'm still 95% sure that there's a simpler way to do this. But this exists and is documented and I have proven that it works when running Pulsar with ATOM_DEV_RESOURCE_PATH set. I have not tried to build binaries with this approach yet, but I'll give it a shot tomorrow. My goal is not “let's add a ridiculous thing we barely understand to our build process for eternity”; it's just to keep this beautiful project spinning and shipping working binaries.

savetheclocktower commented 2 months ago

This won't work on its own because no amount of monkeying with binding.gyp (that I've seen so far) would allow us to specify where superstring.node should go looking for this library at runtime.

Once again, GitHub code search solves a problem for me. There are many examples of binding.gyp files in the wild that invoke install_name_tool -change as a post-build task, so that's probably the best way to go here.

savetheclocktower commented 2 months ago

I should also point out that our eventual fix here could be to vendorize libiconv, just like we do for the Windows version. But I don't know much about how that's done.

This is practically what we're doing now, though we'll probably switch to a more proper implementation of this with a git subtree like the Windows version.

The Homebrew version of libiconv complained about not recognizing utf8 as a valid encoding name — which, I mean, fair point! It's not! But we sought out Apple's version instead just so it could be a drop-in replacement.

Then, when faced with random crashes in Linux CI, we got paranoid and created a branch on our superstring fork that perfectly matched the last version published to NPM. Then we added on the macOS support, the vendorization of win_iconv, and the CI changes — none of which could've themselves been the causes of crashing. That got us to stability.

That means we'll be able to build binaries for all platforms — even Intel Mac if GitHub forces us onto a newer macOS version. But it also means we should proceed more cautiously with changes to superstring! Even a change that passes superstring's tests could cause problems in Pulsar, so we should probably add some Pulsar smoke tests to superstring’s CI process.

Ideally, as we try to migrate it to N-API, we'd be able to run CI against both Pulsar-on-Electron-12 and Pulsar-on-latest-Electron.

savetheclocktower commented 2 months ago

We fixed this!

My musings from the last comment have been captured in superstring#14.