facebook / buck2

Build system, successor to Buck
https://buck2.build/
Apache License 2.0
3.53k stars 217 forks source link

sh_binary rule for windows is incorrectly trying to resolve symlinks during remote execution #673

Open valadez-abel opened 4 months ago

valadez-abel commented 4 months ago

On windows the sh_binary rule is incorrectly trying to resolve symlinks directly, causing paths to not be correct on the re environment that we have setup:

https://github.com/facebook/buck2/blob/2214ec241cefe357faf499e30b26629ce6728c8c/prelude/sh_binary.bzl#L92

As you can see, this ends up traversing to the parent on the cache, which doesn't translate to it being the parent directory of the other files:

Action failed: toolchains//:test_re (test)
Remote command returned non-zero exit code 1
stdout:

C:\worker\work\3\0\exec>setlocal EnableDelayedExpansion 

C:\worker\work\3\0\exec>set __RESOURCES_ROOT=.\resources 

C:\worker\work\3\0\exec>set __SRC=C:\worker\work\3\0\exec\buck-out\v2\gen\prelude\749e8add6e5f1364\python_bootstrap\tools\__win_python_wrapper__\win_python_wrapper.bat 

C:\worker\work\3\0\exec>for /F "tokens=2 delims=[]" %a in ('dir C:\worker\work\3\0\exec\buck-out\v2\gen\prelude\749e8add6e5f1364\python_bootstrap\tools\__win_python_wrapper__\win_python_wrapper.bat |C:\Windows\System32\find.exe "<SYMLINK>"') do set "__SRC=%a" 

C:\worker\work\3\0\exec>set "__SRC=\\?\C:\worker\cas\77\7786b38fdc6164bc83a39fa48e33d783a9560d386c7a765d5e09f777ff3d9c71" 

C:\worker\work\3\0\exec>for %a in ("\\?\C:\worker\cas\77\7786b38fdc6164bc83a39fa48e33d783a9560d386c7a765d5e09f777ff3d9c71") do set "__SCRIPT_DIR=%~dpa" 

**C:\worker\work\3\0\exec>set "__SCRIPT_DIR=\\?\C:\worker\cas\77\"** 

C:\worker\work\3\0\exec>set BUCK_SH_BINARY_VERSION_UNSTABLE=2 

C:\worker\work\3\0\exec>set BUCK_PROJECT_ROOT=\\?\C:\worker\cas\77\\.\resources 

C:\worker\work\3\0\exec>set BUCK_DEFAULT_RUNTIME_RESOURCES=\\?\C:\worker\cas\77\\.\resources 

C:\worker\work\3\0\exec>\\?\C:\worker\cas\77\\.\resources\win_python_wrapper.bat buck-out\v2\gen\prelude\749e8add6e5f1364\cxx\tools\__dep_file_processor__\__dep_file_processor__ buck-out\v2\gen-anon\toolchains\749e8add6e5f1364443bc5d1e95a38d6\__x-python__\out\x-python\python.exe buck-out\v2\gen\prelude\749e8add6e5f1364\cxx\tools\__dep_file_processor__\dep_file_processor.py 
stderr:
**The system cannot find the path specified.**

This specific step should be removed, or made configurable via an attribute on the rule. Happy to submit a fix for this

JakobDegen commented 3 months ago

@valadez-abel can you be a bit more precise about where exactly the bug in the line you linked is? Also, can you share what RE implementation you're using? From looking at your logs, it looks to me like the script that buck is writing out is getting materialized as a symlink into a cas directory on the RE side, but I'm not sure I'm understanding correctly yet

valadez-abel commented 3 months ago

The linked line ends up resolving: __SRC=C:\worker\work\3\0\exec\buck-out\v2\gen\prelude\749e8add6e5f1364\python_bootstrap\tools\__win_python_wrapper__\win_python_wrapper.bat into: __SRC=\\?\C:\worker\cas\77\7786b38fdc6164bc83a39fa48e33d783a9560d386c7a765d5e09f777ff3d9c71

Which on it's own is fine, but then:

C:\worker\work\3\0\exec>for %a in ("\\?\C:\worker\cas\77\7786b38fdc6164bc83a39fa48e33d783a9560d386c7a765d5e09f777ff3d9c71") do set "__SCRIPT_DIR=%~dpa"

attempts to get the parent directory of the resolved location: C:\worker\work\3\0\exec>set "__SCRIPT_DIR=\\?\C:\worker\cas\77\"

which ends up not being correct to find siblings of the original file.

The bug in this case ends up being the assumption that you can follow the links in the re-side to an underlying tree that matches the structure. The directory itself will contain symlinks to the cas for the files that are there, but resolving any said links to the underlying location and then attempt traversals will fail.

We're using EngFlow's RE implementation. @TheGrizzlyDev can provide more details on how exactly the local file-system and underlying cache are setup.

Note that symbolic links have only been a problem on the windows side.

TheGrizzlyDev commented 3 months ago

I am not exactly convinced this is an issue caused by the remote execution server. I think the script assumes that win_python_wrapper.bat is indeed a simlink of sort, but it seems to me that buck2 will consider source inputs as normal blobs in a Directory message when setting up the input root for an action, rather than as a symlink. As such the file in question is not a symlink pointing at the same location it points to locally and the action should behave as if it was a file. So either buck2 keeps treating it as a symlink all the way through and correctly sets up the input root to preverse that or this script in question should stop making this assumption when running remotely. Wdyt? I think making this step configurable like @valadez-abel suggested would be a good stop-gap solution that fixes builds, but I think in the longer run buck2 should try preserving the file system as much as possible on RE. Let me know if I am missing anything though!