Open jtilahun opened 1 year ago
In testing something unrelated, I was able to reproduce similar behavior. By running `pants` in an uninitialized directory and responding `n`, I got a similar error.
~/devel$ pants
No Pants configuration was found at or above /home/nathanael/devel.
Would you like to configure /home/nathanael/devel as a Pants project? (Y/n): n
Error: Isolates your Pants from the elements.
Please select from the following boot commands:
scie-pants
bootstrap-tools
pants
pants-debug
update
You can select a boot command by passing it as the 1st argument or else by setting the SCIE_BOOT environment variable.
ERROR: Failed to establish atomic directory /home/nathanael/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/locks/configure-38caab2c120194c12f9617ad3a9ed1c094483156c068196f12097cc18bf6ac39. Population of work directory failed: Boot binding command failed: exit status: 1
Hm, the contents of the log aren't very insightful 🤔
2023-08-30 16:30:52,614 ERROR] root: Install failed: Command '['/home/jtilahun/tools/bin/pants']' returned non-zero exit status 1.
More information can be found in the log at: /home/jtilahun/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/bindings/logs/record-scie-pants-info.log
Traceback (most recent call last):
File "/home/jtilahun/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/bindings/pex_root/venvs/557963c6782fafb82fc618ff05bb5998dafccd3c/f8df9e2cb55d2d123e1c6f4f3701f3010386f4bb/pex", line 284, in <module>
sys.exit(func())
File "/home/jtilahun/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/bindings/pex_root/venvs/557963c6782fafb82fc618ff05bb5998dafccd3c/f8df9e2cb55d2d123e1c6f4f3701f3010386f4bb/lib/python3.9/site-packages/conscript/main.py", line 105, in main
return ep.load()()
File "/home/jtilahun/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/bindings/pex_root/venvs/557963c6782fafb82fc618ff05bb5998dafccd3c/f8df9e2cb55d2d123e1c6f4f3701f3010386f4bb/lib/python3.9/site-packages/scie_pants/record_scie_pants_info.py", line 37, in main
version = subprocess.run(
File "/home/jtilahun/.cache/nce/2b6e146234a4ef2a8946081fc3fbfffe0765b80b690425a49ebe40b47c33445b/cpython-3.9.16+20230507-x86_64-unknown-linux-gnu-install_only.tar.gz/python/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/jtilahun/tools/bin/pants']' returned non-zero exit status 1.
What version of `scie-pants` do you have installed? (`PANTS_BOOTSTRAP_VERSION=report pants`)
Huh, it's unfortunate that the log isn't insightful.
I have `scie-pants` version 0.10.0 installed.
jtilahun@JTN86G3:~/devel/monorepo$ PANTS_BOOTSTRAP_VERSION=report pants
0.10.0
Thanks @jtilahun ... that's the latest, so there's definitely something to look at here.
Here are some general questions that might narrow things down somewhat:
What happens outside the `~/devel/monorepo` repo? (i.e. somewhere that doesn't have `pants.toml`)
Can you run `SCIE_BOOT=update strace -v -e trace=execve -e verbose=execve --follow-forks --string-limit=300 pants 2> strace.log` and upload the log? You may need to install https://strace.io (this traces all of the subprocess invocations by recording the `execve` syscalls, so we can hopefully narrow down exactly which part of the processing fails)

Here are answers to those questions:
I installed `scie-pants` by using the `get-pants.sh` script referenced in the Pants installation documentation (link). I've also attached the exact version of the `get-pants.sh` script I've been using for completeness: get-pants.sh
It also fails outside the `~/devel/monorepo` repo. For example, if I run it in `~/tools`, which doesn't have `pants.toml`, it fails:
jtilahun@JTN86G3:~/tools$ SCIE_BOOT=update pants
Error: Isolates your Pants from the elements.
Please select from the following boot commands:
scie-pants bootstrap-tools pants pants-debug update
You can select a boot command by passing it as the 1st argument or else by setting the SCIE_BOOT environment variable.
ERROR: Failed to expand home dir in path ~/.nce Install failed: Command '['/home/jtilahun/bin/pants']' returned non-zero exit status 1. More information can be found in the log at: /home/jtilahun/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/bindings/logs/record-scie-pants-info.log
Error: Isolates your Pants from the elements.
Please select from the following boot commands:
scie-pants bootstrap-tools pants pants-debug update
You can select a boot command by passing it as the 1st argument or else by setting the SCIE_BOOT environment variable.
ERROR: Failed to establish atomic directory /home/jtilahun/.cache/nce/65aa4f2a6c1f9bac672c0df94ae34c7170e5c071cda35e9b725945831905c122/locks/scie-pants-info-a5078db971917d69fb5962395c17cd62b53c7a229697b2227627a3c28242f7d7. Population of work directory failed: Boot binding command failed: exit status: 1
4. I ran `SCIE_BOOT=update strace -v -e trace=execve -e verbose=execve -f --string-limit=300 pants 2> strace.log` in `~/devel/monorepo`. Notice that I replaced `--follow-forks` with `-f` because my `strace` does not recognize the `--follow-forks` option but does recognize the `-f` option. My `man` page for `strace(1)` seems to indicate that it traces child processes created by `fork(2)`:
-f Trace child processes as they are created by currently traced processes as a result of the fork(2), vfork(2) and clone(2) system calls. Note that -p PID -f will attach all threads
of process PID if it is multi-threaded, not only thread with thread_id = PID.
Here's the log that you requested I upload: [strace.log](https://github.com/pantsbuild/scie-pants/files/12533687/strace.log)
Thanks.
It looks like the error is during a recursive invocation, https://github.com/pantsbuild/scie-pants/blob/4df586c25f8698a1734dedf0c8351af249c1f2d3/tools/src/scie_pants/record_scie_pants_info.py#L37-L43 which is invoked by scie / lift https://github.com/pantsbuild/scie-pants/blob/4df586c25f8698a1734dedf0c8351af249c1f2d3/package/scie-pants.toml#L152-L165
The error almost certainly comes from https://github.com/a-scie/jump/blob/b7b1efbc9ca276da759e1b2b74e3ecd7d5bbaffc/jump/src/context.rs#L37-L38. That error occurring suggests https://docs.rs/dirs/5.0.1/dirs/fn.home_dir.html is returning `None`, which seems like it can only happen in limited conditions on Linux: `$HOME` is not set and `getpwuid_r` doesn't return useful info.
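For reference, the lookup order being described can be sketched roughly like this in Python. This is only an illustration under the assumption that `$HOME` is used when it is set and non-empty and the passwd database is consulted otherwise; it is not the `dirs` crate's actual code, and `lookup_home_dir` is a made-up name.

```python
import os
import pwd  # Unix-only; wraps the libc passwd lookups


def lookup_home_dir(environ=None):
    """Return the home directory as a string, or None if it cannot be found.

    Hypothetical helper: assumes HOME wins when set and non-empty, with the
    passwd entry for the current uid as the fallback.
    """
    environ = os.environ if environ is None else environ

    home = environ.get("HOME")
    if home:  # set and not the empty string
        return home

    try:
        # Raises KeyError when the uid has no passwd entry.
        return pwd.getpwuid(os.getuid()).pw_dir or None
    except KeyError:
        return None


if __name__ == "__main__":
    print("with HOME set:  ", lookup_home_dir({"HOME": "/home/example"}))
    print("with HOME unset:", lookup_home_dir({}))
```

With this order, `None` only comes back when both sources fail, which is the "limited conditions" case above.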
The strace log explicitly shows `HOME=/home/jtilahun` in the first two `execve` calls, but not in the third one, which is the one that fails. That last one is just (env vars are the third parameter):
[pid 90871] execve("/home/jtilahun/bin/pants", ["/home/jtilahun/bin/pants"], ["PANTS_BOOTSTRAP_VERSION=report"]) = 0
This aligns with the `env={...}` parameter in `record_scie_pants_info.py`, and suggests a fix would be ensuring that call inherits `os.environ` too: `env={**os.environ, "PANTS_BOOTSTRAP_VERSION": "report"}` to ensure that `HOME` is set.
@jtilahun do you feel like submitting a pull request with that change?
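To illustrate why the `env=` argument matters, here's a small standalone sketch (not the actual `record_scie_pants_info.py` code). It spawns a child Python process instead of `pants`, and `run_child` is a made-up helper. Passing a dict as `env` replaces the child's whole environment, so `HOME` only reaches the child if `os.environ` is merged in.

```python
import os
import subprocess
import sys

# Child program: report whether HOME is visible in its environment.
CHILD = "import os; print(os.environ.get('HOME', '<unset>'))"


def run_child(env):
    """Run the child with exactly the given environment and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,  # replaces the inherited environment, it does not extend it
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


# Same shape as the failing call: only the one variable is passed through.
replaced = run_child({"PANTS_BOOTSTRAP_VERSION": "report"})

# Proposed shape: start from the parent's environment, then add the variable.
merged = run_child({**os.environ, "PANTS_BOOTSTRAP_VERSION": "report"})

print("env replaced: HOME =", replaced)
print("env merged:   HOME =", merged)
```

The first call matches the third `execve` in the strace log, where the environment contains nothing but `PANTS_BOOTSTRAP_VERSION=report`.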
It's a bit weird to me that this is the first observation of this failure, with `SCIE_BOOT=update pants` working on other systems (e.g. my mac). My theory is that `getpwuid_r` usually works (so things have been working fine without `HOME` set), but @jtilahun's user account is configured in a way that doesn't work as smoothly with `getpwuid_r`? I don't personally know enough about Linux user management to know where to start there, though!
@engnatha I think that's a separate issue, which I filed as https://github.com/pantsbuild/scie-pants/issues/266. Thanks for flagging.
Hmm, there's something going on that I haven't grasped quite yet.
I tried isolating this to a minimum reproducible example of `dirs::home_dir` failing. Here's what I have:
src/main.rs
fn main() {
    match dirs::home_dir() {
        Some(path) => println!("Your home directory, probably: {}", path.display()),
        None => println!("Impossible to get your home dir!"),
    }
}
Cargo.toml
[package]
name = "monorepo"
version = "0.1.0"
edition = "2021"
[[bin]]
edition = "2021"
name = "main"
path = "src/main.rs"
[dependencies]
dirs = "4.0"
I built the binary with `cargo build --bin main`. Manually tinkering with `$HOME`, I've convinced myself that in the absence of `$HOME`, `dirs::home_dir` goes somewhere else to find my home directory and does so successfully:
jtilahun@JTN86G3:~/devel/monorepo/target/debug$ ./main
Your home directory, probably: /home/jtilahun
jtilahun@JTN86G3:~/devel/monorepo/target/debug$ HOME="" ./main
Your home directory, probably: /home/jtilahun
jtilahun@JTN86G3:~/devel/monorepo/target/debug$ HOME=" " ./main
Your home directory, probably:
jtilahun@JTN86G3:~/devel/monorepo/target/debug$ HOME="not_a_real_home_directory" ./main
Your home directory, probably: not_a_real_home_directory
I don't feel like I understand what's happening. I don't want to submit a pull request until I feel like I have a better understanding of what's happening.
Yes, I agree with investigating more given my theory doesn't seem to hold. Thanks for checking!
What happens if you run it without any env vars at all: `env -i ./main`?
If I run it without any env vars at all, it's still able to find my home directory:
jtilahun@JTN86G3:~/devel/monorepo/target/debug$ env -i ./main
Your home directory, probably: /home/jtilahun
Hm, I note that you've set `dirs = "4.0"` there, but scie-pants uses `5.0.1`. It doesn't look like there are significant changes between the versions, but there's a chance that might be the difference... could you try with a newer dirs and dirs-sys?
I tried with a newer `dirs` and `dirs-sys`, but no difference.
jtilahun@JTN86G3:~/devel/monorepo/target/debug$ env -i ./main
Your home directory, probably: /home/jtilahun
Here's my `Cargo.toml` file now:
[package]
name = "monorepo"
version = "0.1.0"
edition = "2021"
[[bin]]
edition = "2021"
name = "main"
path = "src/main.rs"
[dependencies]
dirs = "5.0"
Note that `scie-pants` sets `"dirs = 5.0"` in `Cargo.toml`:
https://github.com/pantsbuild/scie-pants/blob/4df586c25f8698a1734dedf0c8351af249c1f2d3/Cargo.toml#L29
Here's my `Cargo.lock` file for sanity checking: Cargo.lock
Note that my example package uses `"5.0.1"`.
I didn't know where `~/.nce` comes from, given that no such file or directory exists for me:
jtilahun@JTN86G3:~$ ls ~/.nce
ls: cannot access '/home/jtilahun/.nce': No such file or directory
Searching the repo, I found the one result here: https://github.com/a-scie/jump/blob/71d2a9d9f7f197cf185fd48426e46ee026fb4587/jump/src/context.rs#L241. Reading the surrounding code, it looks as if it's trying to set up a context of some sort. To get the base directory, it first checks `SCIE_BASE`, followed by a couple of other places. If it still can't find a directory, then it defaults to `~/.nce` for whatever reason.
So I wondered what would happen if I were to set `SCIE_BASE` to `"~/.nce"` on my own. Surprisingly, doing so results in different behavior. It creates a directory at the path `"~/.nce"` on my behalf and also appears to download some archive. After a few seconds, it finally errors out. Screen recording attached.
https://github.com/pantsbuild/scie-pants/assets/26139374/e727e0ac-c1e2-4e78-b246-e62e473b4a26
So I'm thinking that there's some funny business going on with the directory handling logic. I still haven't pinpointed exactly what it is, but something smells fishy.
I just ran into this issue on an Ubuntu laptop. The relevant bit in the strace log shows:
[pid 3062956] execve("/home/jafloyd/.local/bin/pants", ["/home/jafloyd/.local/bin/pants"], ["PANTS_BOOTSTRAP_VERSION=report"]) = 0
Error: Failed to expand home dir in path ~/.nce
My user account comes from Active Directory via `sssd` on the laptop, so it is not present in `/etc/passwd` and similar files. I also have some `sss_override`s configured so that my uid/gid/home_dir and other user settings are sane (not a uid in the billions, and a much more concise home directory).
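For anyone wanting to check whether their account is in the same situation, here's a rough diagnostic sketch in Python. `passwd_lookup` is a made-up helper, and the only assumption is the standard colon-separated `/etc/passwd` layout. It compares what the local file says with what the libc lookup behind Python's `pwd` module returns.

```python
import os
import pwd  # Unix-only


def passwd_lookup(uid, passwd_path="/etc/passwd"):
    """Return (listed_in_file, home_from_libc) for the given uid."""
    listed_in_file = False
    try:
        with open(passwd_path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                fields = line.rstrip("\n").split(":")
                # passwd format: name:password:uid:gid:gecos:home:shell
                if len(fields) >= 3 and fields[2] == str(uid):
                    listed_in_file = True
                    break
    except OSError:
        pass  # No readable passwd file; treat as "not listed".

    try:
        home_from_libc = pwd.getpwuid(uid).pw_dir
    except KeyError:
        home_from_libc = None  # libc could not resolve the uid at all

    return listed_in_file, home_from_libc


if __name__ == "__main__":
    listed, home = passwd_lookup(os.getuid())
    print("listed in /etc/passwd:", listed)
    print("home via getpwuid:    ", home)
```

An account like the one described here would be expected to show up as not listed in the file while still resolving through the libc lookup.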
I can reproduce the `SCIE_BOOT=update` error more simply by doing this (I'm using bash as my shell here):
$ unset HOME
$ PANTS_BOOTSTRAP_VERSION=report pants
Error: Failed to expand home dir in path ~/.nce
Isolates your Pants from the elements.
Please select from the following boot commands:
<default> (when SCIE_BOOT is not set in the environment) Detects the current Pants installation and launches it.
bootstrap-tools Introspection tools for the Pants bootstrap process.
update Update scie-pants.
You can select a boot command by setting the SCIE_BOOT environment variable.
So, @jtilahun, you tested your mini rust program with `HOME=""` and `HOME=" "`, but did you try running your mini program with `HOME` not set at all?
edit: Oh. I see you used `env -i` to try that. I get the same failure if I do that. I also get it for any other use of the scie-pants binary (I just manually downloaded/updated to 0.11.0).
$ env -i PANTS_BOOTSTRAP_VERSION=report ~/.local/bin/pants
Error: Failed to expand home dir in path ~/.nce
[snip]
$ env -i ~/.local/bin/pants version
Error: Failed to expand home dir in path ~/.nce
[snip]
Attempting to upgrade the `pants` launcher binary on my computer results in an installation error. Full output and log file can be found below. record-scie-pants-info.log