Closed zbentley closed 8 months ago
It’s been a while since I worked on this, so I could have the details wrong. Also, I don’t have an ARM Mac to test this on, which makes it harder to track down the issue myself.
I think the issue is the rustup
wrappers, e.g. ~/.cargo/bin/cargo
, are being installed using the “host” architecture, i.e. x84 because it’s running in Rosetta. The toolchains should still be installed for the architecture you specified, though the symptoms you report suggest that might not be happening — I would assume that if something running under Rosetta ran an ARM binary, then the ARM binary ran a universal binary, that it would choose the ARM version.
What does file ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/*
give you?
Probably the best way to solve the rustup
architecture problem is to allow choosing a specific rustup-init
(see Other ways to install rustup
). The downside is that it doesn’t make sense to use that on the main path — reimplementing the installation shell script that figures out the architecture string would be too fragile.
Thanks for the replies!
if something running under Rosetta ran an ARM binary, then the ARM binary ran a universal binary, that it would choose the ARM version.
I'd expect that too! But it does not seem to be occurring; it seems like if anything up the chain has been invoked as x86, even without an explicit arch
override to do so, then future universal binaries will be launched as x86 even if there are intermediate ARM binaries launched (because they're non-universal and only support ARM). I really don't like that behavior from Rosetta/Apple, it seems quite counterintuitive, but ... what can you do 🤷
What does file ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* give you?
When rustup-init
is called by this module via x86 bolt, all binaries in that folder are x86.
When rustup-init
is run by me (on a clean/non-initted system) from an ARM Bash/Zsh shell binary, all binaries in that folder are ARM.
Probably the best way to solve the rustup architecture problem is to allow choosing a specific rustup-init
I don't think that'll change anything, sadly. I think that, so long as Bolt's x86-only (which it is for now), regardless of how rustup-init
is run, unless whatever runs it wraps it in arch -arch $native_arch rustup-init ...
, this issue will reoccur.
I don’t have an ARM Mac to test this on, which makes it harder to track down the issue myself.
I'll take a stab at a PR (my Ruby is rusty though) in the coming weeks and you can decide if the arch -arch
approach seems feasible or not.
What does file
~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/*
give you?When rustup-init is called by this module via x86 bolt, all binaries in that folder are x86.
That sounds like a bug in this module or maybe in rustup
itself (unlikely, I think) — the toolchain should always be in the architecture in its name. For example, on my Intel Mac:
❯ rustup toolchain install --no-self-update --force-non-host nightly-aarch64-apple-darwin
. . .
❯ cd ~/.rustup/toolchains
❯ file nightly-*-apple-darwin/bin/cargo
nightly-aarch64-apple-darwin/bin/cargo: Mach-O 64-bit executable arm64
nightly-x86_64-apple-darwin/bin/cargo: Mach-O 64-bit executable x86_64
Similarly, if I install the stable-x86_64-pc-windows-gnu
toolchain, then the corresponding toolchain binaries are Windows executables.
Probably the best way to solve the rustup architecture problem is to allow choosing a specific
rustup-init
I don't think that'll change anything, sadly. I think that, so long as Bolt's x86-only (which it is for now), regardless of how
rustup-init
is run, unless whatever runs it wraps it inarch -arch $native_arch rustup-init ...
, this issue will reoccur.
I figure that the architecture of rustup-init
is what determines the architecture of ~/.cargo/bin/cargo
. That binary should just determine which toolchain is desired, then exec
the correct cargo
binary in ~/.rustup/toolchains/
. Since ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/cargo
is x86_64 for you, I don't think that’s the (immediate) problem.
It still might be a problem later once the toolchains binaries get fixed.
All that said, it does seem to be working on my machine (not really a surprise, I suppose). Can you run it with the environment variable RUSTUP_TRACE=1
and the -v
flag? That will show exactly what commands it runs.
For example, on my local machine:
❯ RUSTUP_TRACE=1 puppet apply -ve 'rustup { daniel: toolchains => ["nightly-aarch64-apple-darwin"], targets => [] }'
Info: Loading facts
Info: Loading facts
Notice: Compiled catalog for marlow.local in environment production in 0.19 seconds
Info: Using environment 'production'
Info: Applying configuration version '1707364703'
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup show
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup toolchain list
Notice: /Stage[main]/Main/Rustup[daniel]/Rustup_internal[daniel]/toolchains: toolchains changed [
{
'ensure' => 'present',
'name' => 'stable-x86_64-apple-darwin'
},
{
'ensure' => 'present',
'name' => 'nightly-x86_64-apple-darwin'
},
{
'ensure' => 'present',
'name' => '1.64-x86_64-apple-darwin'
},
. . .
{
'ensure' => 'present',
'name' => '1.69.0-x86_64-apple-darwin'
}] to [
{
'ensure' => 'present',
'name' => 'nightly-aarch64-apple-darwin',
'profile' => 'default'
}]
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain stable-x86_64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain nightly-x86_64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain 1.64-x86_64-apple-darwin
. . .
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain 1.69.0-x86_64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup toolchain install --no-self-update --force-non-host --profile default nightly-aarch64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup toolchain list
Notice: Applied catalog in 13.45 seconds
❯ file ~/.rustup/toolchains/nightly-*-apple-darwin/bin/cargo
/Users/daniel/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/cargo: Mach-O 64-bit executable arm64
/Users/daniel/.rustup/toolchains/nightly-x86_64-apple-darwin/bin/cargo: Mach-O 64-bit executable x86_64
My output looks exactly like yours when I run puppet apply
, but the issue is the cargo
binary being used. The ones in my toolchain directories do match the toolchain's arch. However, the cargo
in ~/.cargo/bin/cargo
matches the arch of the program that installed it, and that's the cargo
on my PATH (is that bad? Should I change that?).
If I install it via puppet apply
, it's ARM. If I do it via bolt
, it's x86, because bolt
ix x86. Unfortunately, bolt
seems to be swallowing debug output related to RUSTUP_TRACE; here's my command and output:
> RUSTUP_TRACE=1 bolt apply --verbose --log-level debug --target localhost --run-as root --execute 'rustup { "zac": toolchains => ["nightly-aarch64-apple-darwin"], targets => ["aarch64-apple-darwin"] }'
<snip, lots of facts>
Finished: task apply_helpers::custom_facts with 0 failures in 12.62 sec
Finished: install puppet and gather facts with 0 failures in 12.89 sec
Finished: install puppet and gather facts with 0 failures in 12.89 sec
Starting: apply catalog on localhost
Starting: apply catalog on localhost
Applying manifest block on ["localhost"]
Running task 'apply_helpers::apply_catalog' on localhost
Started on localhost...
[{"target":"localhost","action":"apply","object":null,"status":"success","value":{"report":{"host":"atropos.local","time":"2024-02-08T10:57:32.795239000-05:00","configuration_version":1707407840,"transaction_uuid":null,"report_format":12,"puppet_version":"7.26.0","status":"changed","transaction_completed":true,"noop":false,"noop_pending":false,"environment":null,"logs":[{"level":"warning","message":"Private key for 'atropos.local' does not exist","source":"Puppet","tags":["warning"],"time":"2024-02-08T10:57:32.846257000-05:00","file":null,"line":null},{"level":"warning","message":"Client certificate for 'atropos.local' does not exist","source":"Puppet","tags":["warning"],"time":"2024-02-08T10:57:32.846321000-05:00","file":null,"line":null},{"level":"notice","message":"created","source":"/Stage[main]/Main/Rustup[zac]/Rustup_internal[zac]/ensure","tags":["notice","rustup_internal","zac","rustup","class"],"time":"2024-02-08T10:57:34.107290000-05:00","file":"/Users/zac/Desktop/Projects/Personal/zbox/.modules/rustFinished on localhost:
up/manifests/init.pp","line":81},{"level":"notice","message":"Applied catalog in 13.23 seconds","source":"Puppet","tags":["notice"],"time":"2024-02-08T10:57:46.068321000-05:00","file":null,"line":null}],"metrics":{"resources":{"name":"resources","label":"Resources","values":[["total","Total",1],["skipped","Skipped",0],["failed","Failed",0],["failed_to_restart","Failed to restart",0],["restarted","Restarted",0],["changed","Changed",1],["out_of_sync","Out of sync",1],["scheduled","Scheduled",0],["corrective_change","Corrective change",0]]},"time":{"name":"time","label":"Time","values":[["rustup_internal","Rustup internal",13.223539],["config_retrieval","Config retrieval",0],["transaction_evaluation","Transaction evaluation",13.226376999984495],["catalog_application","Catalog application",13.231105000013486],["total","Total",13.273145]]},"changes":{"name":"changes","label":"Changes","values":[["total","Total",1]]},"events":{"name":"events","label":"Events","values":[["total","Tot Notice: /Stage[main]/Main/Rustup[zac]/Rustup_internal[zac]/ensure: created
al",1],["failure","Failure",0],["success","Success",1]]}},"resource_statuses":{"Rustup_internal[zac]":{"title":"zac","file":"/Users/zac/Desktop/Projects/Personal/zbox/.modules/rustup/manifests/init.pp","line":81,"resource":"Rustup_internal[zac]","resource_type":"Rustup_internal","provider_used":"default","containment_path":["Stage[main]","Main","Rustup[zac]","Rustup_internal[zac]"],"evaluation_time":13.223539,"tags":["rustup_internal","zac","rustup","class"],"time":"2024-02-08T10:57:32.839099000-05:00","failed":false,"failed_to_restart":false,"changed":true,"out_of_sync":true,"skipped":false,"change_count":1,"out_of_sync_count":1,"events":[{"audited":false,"property":"ensure","previous_value":"absent","desired_value":"present","historical_value":null,"message":"created","name":"rustup_internal_created","status":"success","time":"2024-02-08T10:57:32.839219000-05:00","redacted":null,"corrective_change":false}],"corrective_change":false}},"corrective_change":false,"catalog_uuid":"1d5863e1-6e89-4578-80ff-3f9caa22af36","cached_catalog_status":"not_used"},"_output":"changed: 1, failed: 0, unchanged: 0 skipped: 0, noop: 0","_sensitive":"Sensitive [value redacted]"}}]
changed: 1, failed: 0, unchanged: 0 skipped: 0, noop: 0
Finished: apply catalog with 0 failures in 26.74 sec
Finished: apply catalog with 0 failures in 26.74 sec
Successful on 1 target: localhost
Ran on 1 target in 39.67 sec
This is easier to reproduce without Puppet entirely. If I brew install rustup
, I get an ARM rustup-init
binary:
∴ file $(which rustup-init)
/opt/homebrew/bin/rustup-init: Mach-O 64-bit executable arm6
If I run rustup-init
from an ARM shell (default answers), once it's done I have an ARM ~/.cargo/bin/cargo
:
zac@atropos ~ ∴ which cargo
/Users/zac/.cargo/bin/cargo
zac@atropos ~ ∴ file $(which cargo)
/Users/zac/.cargo/bin/cargo: Mach-O 64-bit executable arm64
But if I run it via this module in an x86 bolt
, things malfunction.
The linked PR does seem to work, and always installs the requisite base/bootstrap toolchain as the native architecture.
Great, thanks! Merged, with only a bit of struggling with CI.
I imagine that fixes the problem with the wrappers being the wrong arch, but I would not have expected this to fix the issue where ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* are x86_64 instead of ARM. Can you confirm that?
I imagine that fixes the problem with the wrappers being the wrong arch, but I would not have expected this to fix the issue where ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* are x86_64 instead of ARM. Can you confirm that?
Sorry, I could have been clearer in the earlier comment: the issue was only ever with ~/.cargo/bin/cargo
; the architectures in ~/.rustup/toolchains
were always the correct ones for their respective toolchains. If that should change, I can apply a similar fix to their bootstrap code, just let me know!
Ah, okay. That makes a lot more sense.
I’ve just cut a release that includes this. Do you think anything else needs to be done before closing this?
Thanks again for report and the PR!
Just tested the new release, and my test succeeded on my laptop, but failed on a blank MacOS VM.
That's entirely my fault. My facter
environment downcases the OS family for .... frankly silly reasons (which routinely cause me issues) that I should undo. The code in the PR I submitted is a no-op on all Macs except mine.
The fix is simple and will be linked right after this comment; upcase the "d" in "Darwin".
So sorry, I should have tested on a fresh install before I sent that in and made you spend the time.
Please don't feel pressured to cut a second release any time soon; this is a fairly rare issue and I'm happy to pin to master until whenever's convenient for you.
Ha ha, no problem. I should have caught that in review.
Tested on main
just now and this is confirmed fix. Sorry for all the back-n-forth, and thanks for authoring and maintaining this module!
Ok, I cut a new release with an additional fix for macOS (in rustup::global
).
Context/Setup
brew install --cask puppetlabs/puppet/puppet-bolt
, the suggested install method.arm64
, and only provides anx86_64
installer:https://downloads.puppet.com/mac/puppet-tools/12/x86_64/puppet-bolt-3.27.4-1.osx12.dmg
at the time of this writing. This means thatbolt
will run via Rosetta in x86 emulation mode.rustup self uninstall
followed byrm -rf ~/.rustup ~/.cargo
should reset things to this state, albeit somewhat destructively).To reproduce
$USER
):file $(which cargo)
.Expected behavior:
cargo
andrustc
report file types matching the host architecture. For an ARM mac that would look likeObserved behavior
Why this matters
So what happens if
cargo
is a non-native architecture?cargo
's build behavior. It andrustc
will still compile/cross-compile Rust code normally according to the targets configured inrustup
and will produce binaries of the selected architecture.cargo
andrustc
will run unusually slowly on MacOS due to the presence of binary emulation translating the x86_64 binaries into arm64 code.cargo
and their subprograms will prefer to launch x86 programs on MacOS. In other words, ifcargo run
or some Cargo plugin runs a universal binary on the system (like, saysudo
), that subprogram will run in x86 mode. Even if spawned subprograms do not support x86 execution (in which case cargo will spawn them as arm64), those programs' spawned subprocesses will still run preferring x86 execution.That last one can cause problems. I encountered one such issue where the
cargo flamegraph
plugin supplied byflamegraph-rs
runs its innerdtrace
process as x86, causingdtrace
to fail when it traced arm64 processes. The bug report for that issue is here, and I blogged about the discovery process of this issue here.Given that rustup/
cargo
installations can persist untouched for long periods of time, and given that the main compiler suite still works, I think it's likely that incorrect-architecture installations of cargo/rustc by this module persist without users' knowledge in many cases, silently causing slowdowns and occasionally causing louder issues like the above.Suggested resolution
When this module bootstraps
cargo
with therustup
resource, or when it installs a per-toolchaincargo
via therustup::toolchain
resource, it should allow user specification of the architecture to those resources.The new
architecture =>
ortoolchain_architecture =>
parameter that users can specify should, if unspecified, default to the invoking machine's native architecture via the "architecture" fact.When this module invokes subprograms internally, they should be wrapped in the arch command to force the architecture as requested by the user.
This behavior need only be present on MacOS, as other platforms do not do multiarch/binfmt translation in the same way (and the issue is much more common on MacOS than on free operating systems due to the frequently limited availability of source builds, as is the case with
puppet-bolt
in this case: only an x86_64 binary installable is provided, and build steps for creating a correct executable are not readily available).