Closed DavisVaughan closed 1 year ago
I was able to reproduce the issue on my arm64 macOS machine when building through Rosetta; the issue was apparently resolved when I switched to use an x86_64 Rust toolchain. Maybe we need to do that on CI?
We are already doing that on CI via RUST_TARGET.
Did you do something more than that locally?
Ah, I think I know what the missing piece is -- we need to make sure that the x86_64 build of R is on the PATH, or do something to make sure that it gets selected by the libR-sys build script.
Otherwise, libR-sys tries to link against the arm64 build of R, and that silently fails as we're linking with -undefined dynamic_lookup. Unfortunately, the linker warning is swallowed into the ether (or perhaps it decided to run away from home and live in the woods) so we don't get any indication that the thing we've built isn't functional.
See also: https://github.com/extendr/libR-sys/blob/master/build.rs#L176-L192
The simplest fix might be to set R_HOME while building ark, since libR-sys seems to use that.
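A minimal sketch of that idea, assuming ark is built directly with cargo and that an x86_64 R is installed somewhere on the host. The variable X86_64_R_HOME and both function names are hypothetical, and the actual build script may invoke cargo differently. The guard checks for lib/libR.dylib first, because a wrong R_HOME would otherwise fail silently at link time:

```shell
# Hypothetical helper: succeed only if the given directory looks like an R
# home that ships lib/libR.dylib (the layout seen in macOS R installations).
r_home_has_libr() {
    [ -n "$1" ] && [ -f "$1/lib/libR.dylib" ]
}

# Hypothetical wrapper around the build. X86_64_R_HOME is an assumed variable
# that you would point at an x86_64 R installation on the build host.
build_ark_x86_64() {
    if ! r_home_has_libr "$X86_64_R_HOME"; then
        echo "no libR.dylib under '$X86_64_R_HOME/lib'" >&2
        return 1
    fi
    # R_HOME is set only for this cargo invocation.
    R_HOME="$X86_64_R_HOME" cargo build --release --target x86_64-apple-darwin
}
```

Failing early here trades a little convenience for a loud error, which seems preferable given that the linker warning is otherwise lost.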
Oh, that would make sense!
There is, in fact, no x86_64 build of R on the macOS build host. This all but confirms Kevin's theory that the problem is that the x86_64 build of ark is not linking to the x86_64 build of R.
I followed Kevin's excellent instructions to set up x86_64 Homebrew and install R with it. Unfortunately, it is only possible to get R 4.3.0 using this method, and we cannot compile against R 4.3.0 due to https://github.com/rstudio/positron/issues/566.
Next I tried installing an older R from CRAN using its .pkg based installer. Unfortunately, installing the .pkg without root access fails without an error, even when the install target is CurrentUserHomeDirectory.
Finally, I tried getting an older R version from another build host we have (la962) that had an older R installed via homebrew a while back (R 4.1). Unfortunately, this did not work either, because a great number of paths inside scripts, dynamic linker paths embedded in executables, etc. referred to the paths on the other host. It's probably not impossible to patch it up to work on our current host, but it's a large task.
We should consider this issue to be blocked on the resolution of https://github.com/rstudio/positron/issues/566. Once that issue is resolved, we can use R 4.3.0 from Homebrew x86_64 along with R_HOME.
I guess rig doesn't "just work" for this? It may also need sudo, not sure
brew tap r-lib/rig
brew install --cask rig
rig add -a x86_64 release # or rig add -a x86_64 4.2
Yes, you need root privilege to install rig (i.e. the second step prompts for the sudo password). :-(
A fifth option might be to build R from source right on the machine...
A sixth option: switch from MacinCloud to MacinCloset? Or some other host that gives us a dedicated macOS server instance where we have root access? We've already burned enough engineering time on MacinCloud, and the hacks we keep accumulating do not seem sustainable long-term.
My recollection is that the IDE team was working with IT on this, since they effectively have the same problem. I'll ping them to get a status on https://github.com/rstudio/rstudio-pro/issues/3068, which was the tracking issue.
Okay, now we can use R 4.3 thanks to @romainfrancois.
The next roadblock is that I can't install R on x86_64 Homebrew -- its dependency libx11 is broken. It has to be built from source (since the homebrew prefix differs from the precompiled builds), and compiling it from source results in:
/Users/user229818/homebrew-x86_64/Cellar/xtrans/1.5.0/include/X11/Xtrans/Xtranslcl.c:81:11: fatal error: 'stropts.h' file not found
# include <stropts.h>
^~~~~~~~~~~
1 error generated.
make[3]: *** [xim_trans.lo] Error 1
It looks like someone else also reported this issue with libx11 in Homebrew: https://stackoverflow.com/questions/76419113/unable-to-install-libx11-on-macos
and indeed it has been reported upstream in Homebrew: https://github.com/Homebrew/homebrew-core/pull/132483
I stumbled across this in my googling about this issue https://github.com/maxim-belkin/homebrew-xorg/issues/453
With this patch https://github.com/bluemage650/homebrew-xorg/commit/0153e00ea5c9cfb037215820b660b7d64dcd38c5
May or may not be useful
Homebrew finally fixed this yesterday:
https://github.com/Homebrew/homebrew-core/commit/38c920d6de4882a2c1791a5ba84e16154d752aad
so I got our side of the work done and merged. However, I don't have an Intel mac so could not test those bits; I'm marking this as ready for review so someone who does have an Intel mac can take a look.
To test, you need a build > 2023.06.0-1505
I downloaded Positron-2023.06.0-1514-darwin-universal.zip and am seeing this same message on startup. Do I need to update anything on my end re:homebrew, etc?
Oddly, the text under console is just "Use" and the "Start Interpreter" button has no response. I tried to go through the "Start Extension Bisect" and "Restart Extension Host" workflows offered, but that didn't seem to change anything.
It looks like our universal build of ark is still missing the link to x86_64 R:
$ otool -L ark
ark (architecture x86_64):
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1953.255.0)
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 1228.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.36.0)
/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0)
/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
ark (architecture arm64):
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1953.255.0)
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 1228.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.36.0)
@rpath/libR.dylib (compatibility version 4.2.0, current version 4.2.2)
/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0)
/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
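Since the missing link is easy to overlook in a listing like the one above, a check along these lines could flag it automatically. This is only a sketch: the function name missing_libr_archs is hypothetical, and it assumes the usual otool -L layout where each section header ends with a colon and each dependency line does not.

```shell
# Hypothetical helper: read `otool -L` output on stdin and print the header
# of every section that does not list libR.dylib among its dependencies.
missing_libr_archs() {
    awk '
        # Print the current section header if libR.dylib was never seen in it.
        function report() { if (section != "" && !found) print section }
        # Section headers end with a colon, e.g. "ark (architecture x86_64):".
        /:$/ {
            report()
            section = $0
            sub(/:$/, "", section)
            found = 0
            next
        }
        /libR\.dylib/ { found = 1 }
        END { report() }
    '
}

# Example use on macOS:
#   otool -L ark | missing_libr_archs
```

For the output shown above, this would report the x86_64 section and stay silent about arm64. A CI step could fail the build whenever the function prints anything.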
In the logs of the release build workflow I see:
Installing dependencies in extensions/positron-r...
$ yarn --ignore-engines
yarn install v1.22.19
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
$ ts-node scripts/post-install.ts
yarn run v1.22.19
warning positron-r@0.0.2: The engine "vscode" appears to be invalid.
$ ts-node scripts/compile-kernel.ts
Finished release [optimized] target(s) in 0.71s
Done in 1.34s.
Done in 4.90s.
where "Finished release [optimized] target(s) in 0.71s" seems suspiciously fast for compiling amalthea and its deps.
Could this be using a cached version of libR-sys?
Interestingly, if I download the Universal app from https://github.com/rstudio/positron/releases/tag/2023.06.0-1576 then on my Intel Mac I see:
davis@daviss-mbp-2 bin % otool -L ark
ark:
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1953.255.0)
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 1228.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.36.0)
/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0)
/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
But if I download the x86 artifact from the same commit hash here https://github.com/rstudio/positron/actions/runs/5450186302 then I see:
davis@daviss-mbp-2 bin % otool -L ark
ark:
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1953.255.0)
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 1228.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1300.36.0)
@rpath/libR.dylib (compatibility version 4.3.0, current version 4.3.1)
/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0)
/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
But even with the x86-specific build I see "extension host terminated unexpectedly", which makes me think that may be a separate issue.
I managed to reproduce "Extension host terminated 3 times..." with the positron-r and positron-python extensions disabled.
I then tried removing the jupyter-adapter extension and then it goes away, which tells me that we also have issues here.
In the developer tools, I saw this in the release builds, but struggled to get a backtrace for it
After much effort, I managed to reproduce this in a dev build of Positron on my ARM Mac by running yarn in an x86_64 zsh shell with an x86_64 version of node installed, and then hitting |> Positron. I believe that rebuilt all of the node modules with x86_64.
Then I was able to set a breakpoint in Positron after the extension host started up, but before the problematic extension was started, and from there I used Help -> Open Process Explorer to figure out which PID went with the extension host.
Once I had the PID, I started a second shell with lldb attached to that PID, then hit "continue" in Positron. Once it crashed, I finally got more details about that backtrace
So this tells me it is zeromq's random_open() function that is causing the error - and if we look into it, we see that it calls sodium_init(): https://github.com/zeromq/libzmq/blob/ecc63d0d3b0e1a62c90b58b1ccdb5ac16cb2400a/src/random.cpp#L39-L63
This checks out, because jupyter-adapter is where our version of zeromq lives.
zeromq itself is built with x86_64 (I checked), but we use a custom version of it that tries to statically link to a local copy of libsodium. https://github.com/zeromq/zeromq.js/compare/master...kevinushey:zeromq.js:master
It seems that the version of libsodium it finds on my machine is ARM based, causing the missing symbol error.
I imagine this is also happening on CI, so it seems like we need a way to tell zeromq to link to a version of libsodium built with the right architecture, but I'm not sure how to do that.
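One way to at least detect the mismatch before linking is to check which architectures the libsodium file on disk contains. The sketch below is an assumption-laden illustration, not the fix that was eventually merged: has_arch is a hypothetical helper, and the libsodium path in the usage comment is a placeholder. It relies on lipo -archs, which prints the architecture names contained in a file.

```shell
# Hypothetical helper: succeed if the wanted architecture (first argument)
# appears among the remaining arguments, such as the words printed by
# `lipo -archs <file>`.
has_arch() {
    wanted="$1"
    shift
    for arch in "$@"; do
        if [ "$arch" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example use on macOS (the libsodium path is a placeholder, not a real location):
#   has_arch x86_64 $(lipo -archs /path/to/libsodium.a) || echo "no x86_64 slice"
```

Running such a check before the zeromq build would turn the runtime missing-symbol crash into an early, readable failure.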
I believe I have all of this fixed in https://github.com/rstudio/positron/pull/820. As seen in https://github.com/rstudio/positron/pull/820#issuecomment-1631388275, the release builds that can be generated from this PR successfully start on an Intel Mac!
@isabelizimm Could you review this since you have an Intel mac?
FWIW I have resurrected my 2015 MacBook Pro, running macOS Monterey (12.6.7), with the CRAN installation of R 4.3.1, and I successfully installed and launched the artifact positron-darwin-universal-archive
from https://github.com/rstudio/positron/actions/runs/5523651576. I can interact with the R interpreter.
OK, good enough for us to close this. Thanks @jennybc!
If I download the universal (or x86 specific) daily build, it opens Positron but I immediately get ark failures where Positron states that the extension failed to open. Notably, if I build Positron locally myself, then everything "just works".
In particular I downloaded this build: https://github.com/rstudio/positron/releases/tag/2023.05.0-1413
We think the problem is that I am on an Intel Mac, while the rest of the dev team is on an ARM Mac, and something is wrong related to that.
When I open Positron I see:
And in the R console I see:
Which suggests some linking issue?
Kevin had the idea to run otool -L ark on the release build and compare to my local build:
Release:
Local:
Note that /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libR.dylib is in my local build but not in the release build.