Open valadez-abel opened 4 months ago
@valadez-abel can you be a bit more precise about where exactly the bug in the line you linked is? Also, can you share what RE implementation you're using? From looking at your logs, it looks to me like the script that buck is writing out is getting materialized as a symlink into a cas directory on the RE side, but I'm not sure I'm understanding correctly yet
The linked line ends up resolving: __SRC=C:\worker\work\3\0\exec\buck-out\v2\gen\prelude\749e8add6e5f1364\python_bootstrap\tools\__win_python_wrapper__\win_python_wrapper.bat
into:
__SRC=\\?\C:\worker\cas\77\7786b38fdc6164bc83a39fa48e33d783a9560d386c7a765d5e09f777ff3d9c71
Which on it's own is fine, but then:
C:\worker\work\3\0\exec>for %a in ("\\?\C:\worker\cas\77\7786b38fdc6164bc83a39fa48e33d783a9560d386c7a765d5e09f777ff3d9c71") do set "__SCRIPT_DIR=%~dpa"
attempts to get the parent directory of the resolved location:
C:\worker\work\3\0\exec>set "__SCRIPT_DIR=\\?\C:\worker\cas\77\"
which ends up not being correct to find siblings of the original file.
The bug in this case ends up being the assumption that you can follow the links in the re-side to an underlying tree that matches the structure. The directory itself will contain symlinks to the cas for the files that are there, but resolving any said links to the underlying location and then attempt traversals will fail.
We're using EngFlow's RE implementation. @TheGrizzlyDev can provide more details on how exactly the local file-system and underlying cache are setup.
Note that symbolic links have only been a problem on the windows side.
I am not exactly convinced this is an issue caused by the remote execution server. I think the script assumes that win_python_wrapper.bat
is indeed a simlink of sort, but it seems to me that buck2 will consider source inputs as normal blobs in a Directory message when setting up the input root for an action, rather than as a symlink. As such the file in question is not a symlink pointing at the same location it points to locally and the action should behave as if it was a file. So either buck2 keeps treating it as a symlink all the way through and correctly sets up the input root to preverse that or this script in question should stop making this assumption when running remotely. Wdyt? I think making this step configurable like @valadez-abel suggested would be a good stop-gap solution that fixes builds, but I think in the longer run buck2 should try preserving the file system as much as possible on RE. Let me know if I am missing anything though!
On windows the sh_binary rule is incorrectly trying to resolve symlinks directly, causing paths to not be correct on the re environment that we have setup:
https://github.com/facebook/buck2/blob/2214ec241cefe357faf499e30b26629ce6728c8c/prelude/sh_binary.bzl#L92
As you can see, this ends up traversing to the parent on the cache, which doesn't translate to it being the parent directory of the other files:
This specific step should be removed, or made configurable via an attribute on the rule. Happy to submit a fix for this