pulsar-edit / superstring

Native core components for Pulsar
MIT License
1 stars 5 forks source link

Vendorize `libiconv` on macOS #11

Closed savetheclocktower closed 3 months ago

savetheclocktower commented 3 months ago

After a long struggle with lots of wrong turns, I think this is how we fix things.

The root problem

On macOS 13 (or 14; I've lost track now), the system version of libiconv switched from the GNU implementation to the FreeBSD implementation. There are API incompatibilities between the two; this presents a problem for superstring, which previously was happy to rely on the ambient libiconv.dylib already present on the default library load path.

When I first noticed this problem back in January, I didn't investigate the root cause; it wasn't affecting our binaries yet because we were pinned to older macOS versions in CI. I just knew that I could fix it locally (i.e., when running from source via ATOM_DEV_RESOURCE_PATH) by installing Homebrew's version of libiconv and making sure its paths were present in my LDFLAGS and CPPFLAGS environment variables.

So when CirrusCI automatically upgraded our macOS images and our Apple Silicon builds started failing with the same iconv errors, I figured the solution was the same. We modified the CI job to run brew install libiconv, modify LDFLAGS and CPPFLAGS, and proceed as usual. That was enough to get Pulsar to compile, so we thought the job was done.

The second problem

Except we hadn't solved anything. We'd gotten Pulsar to compile by telling it about a dynamic library that was present on the CI machine; but all that did was lead Pulsar to assume that that library would be present on everyone's machine. So we ended up producing a version of Pulsar that would refuse to run unless you had libiconv present at /opt/local/opt/libiconv. (That isn't even the standard brew --prefix, so a local Homebrew installation of libiconv would still probably fail this test.)

The fix

A day's worth of flailing ensued, interrupted by occasional research. The most fruitful results were this blog post (which summarized the dilemma rather well and hinted at the eventual solution), followed by this blog post (exposing me to this strange install_name_tool utility and its ability to resolve relative paths).

Then, the breakthrough: searching GitHub for ways that other libraries use binding.gyp and install_name_tool. (That search query isn't working right now, so I'm glad it was working last night.)

Here's the short version:

So our first task is to vendorize libiconv. We were already doing this for Windows; now we're doing it for macOS as well.

Since the vendored files we need are platform-specific, we can copy the files we need from the user's Homebrew installation or from a path that the user provides via an environment variable. We do this as a pre-compile task on macOS only.

In our case, the thing that needs libiconv.dylib is the built superstring.node module. So as a post-compile task, we make sure that the module we just built knows where to find our vendorized copy of libiconv.dylib. The @loader_path magic token is what tells it to search relative to itself.

This should get superstring working on macOS again — not just compiling but also running in a self-contained manner. I've been testing this by temporarily moving the Homebrew version of libiconv to a different place on my local machine just to confirm it doesn't confuse superstring. And the effects of install_name_tool seem to survive a trip through electron-rebuild, so this isn't even something that the pulsar repo needs to be aware of.

How we proceed

I'm fine with landing this however we want to. We can land it on master and tag a new release; we can land it on a branch; doesn't matter much. In the short term, the goal is to be able to modify Pulsar’s package.json to point superstring to a tag/SHA on our repo instead of a version number on NPM. If we do that — and modify the resolutions so that text-buffer uses that same copy for its superstring dependency — then everything should work, and we'll be able to build Apple Silicon binaries again on macOS.

(The same problem would've eventually struck us on the Intel Mac side, too. This way we get out ahead of it.)

Testing

For sanity's sake, I'd love your help testing this no matter which platform you're on. If you're on Windows or Linux, it'd be great to know for sure that I haven't broken anything (though CI should help there).

Any macOS users can try the steps below. If you're on an Intel Mac… well, so am I, but it'd still be good to have another data point. If you're on Apple Silicon, I'm especially interested in your experience.

savetheclocktower commented 3 months ago

It'd certainly be nice to get CI working on macOS, but that's a pre-existing issue and is probably less urgent than our need to get binaries building again.

savetheclocktower commented 3 months ago

Just realized that I've probably vendorized some things that would actually need to be built on a per-platform basis. We could get around this by continuing to rely on the presence of Homebrew libiconv, but running some sort of pre-build action that finds said Homebrew installation, copies the files to where we want them to be, then runs install_name_tool to rename the library file.

I'll look at the Homebrew recipe and try to understand it a bit better. We could vendorize libiconv in unbuilt form and then build it ourselves for the target platform, but then we're basically doing Homebrew's job, except worse.

savetheclocktower commented 3 months ago

I did a test PR on the pulsar repo with the superstring dependency changed to point to the head of this PR. I was able to produce seemingly working builds for both Intel Mac and Apple Silicon.

I tested the Intel Mac binary and it worked fine; I say “seemingly” here because CI produces an unsigned build for PRs. I have no reason to doubt that code signing would work.

@meadowsys is our guinea pig for Apple Silicon builds, so I'll update when I hear back from them. I did notice this in the CirrusCI build logs…

warning: changes being made to the file will invalidate the code signature in: /opt/homebrew/opt/libiconv/lib/libiconv.2.dylib

…and I believe it would've happened when we used install_name_tool to change that .dylib’s install name. If this ends up causing problems, I think I could manage to work around this and make that step unnecessary.

But then the code signing of the app itself appeared to work properly, so I'm hoping that's all we need. The post-build Playwright smoke tests also ran smoothly.

(Edit: and if we do need to re-sign the .dylib, it might be as simple as this.)

DeeDeeG commented 3 months ago

I can vouch that, on my macOS machine (macOS Catalina 10.15.7, intel x86_64 processor), the steps under the Testing section of the PR text above worked as the instructions said they should.

As for getting CI working, I've got a couple of commits here to make a bit of progress on that: https://github.com/DeeDeeG/superstring/commits/472c6db1c05879cad9600a357fda1a2082bdf056 (Gets Python setuptools installed again successfully.)

Would also need to modify the test matrix so Node 14 isn't tested on macos-latest runners... (which are implicitly macos-14 now, which for free tier always runs on ARM64/"Apple Silicon" hosts in GitHub Actions...) since Node 14 never got compatibility with "Apple Silicon" (ARM64 macOS), if I understand correctly. We can run Node 14 tests on macos-12 or macos-13 runners for now.

savetheclocktower commented 3 months ago

I addressed some feedback on Discord by making sure we copy libiconv’s license file and README to the vendor/libiconv directory.

@meadowsys reported on Discord that the Apple Silicon binary works fine, which… was a bit surprising to me? I was pretty sure it was going to complain about the code signature on libiconv.2.dylib — so sure that I had started to adopt a different approach to avoid changing that library's install name. I won't mess with that we've got if it's working, but I will check that approach into the repo just in case we need it in the future.

savetheclocktower commented 3 months ago

Anyway, here's how we could proceed:

At that point, it'd be great to use this whole ordeal as an excuse to revisit #9 and #10 — neither of which is crucial in the short-term, but both of which would be nice to address. I think it's a good idea to start inching closer to the parts of Pulsar that make us nervous: superstring, other native modules formerly owned by the Atom team, and finally text-buffer.

savetheclocktower commented 3 months ago

Closing in favor of #15.