llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.6k stars 11.82k forks source link

lld-link fails if PDB path is an Alternate Data Stream #44627

Open 99a6217f-7cd1-48e2-b5c5-98c2bfb5991b opened 4 years ago

99a6217f-7cd1-48e2-b5c5-98c2bfb5991b commented 4 years ago
Bugzilla Link 45282
Version unspecified
OS other
CC @rnk

Extended Description

LLVM 9.0.0 on Windows 7 64-bit.

When the path for the PDB file is set to an Alternate Data Stream (either on the resulting binary itself or some unrelated file) then the linking fails with the following error:

Stream Error: An I/O error occurred on the file system.

This error does not occur if either Microsoft's link.exe is used or if the linking is done using clang-cl.exe (which seems to default to link.exe).

To reproduce this:

The following C++-code as test.cpp (on a NTFS or ReFS volume):

int main(int argc, char** argv) { return 1; }

Execute the following commands (note: the libpaths will likely vary for your system):

clang-cl.exe /c test.cpp lld-link.exe /OUT:test.exe /DEBUG:FULL /PDB:test.exe:pdb "/libpath:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.25.28610\lib\x64" "/libpath:C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x64" "/libpath:C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\ucrt\x64" .\test.obj

The second command will fail with the above mentioned error.

If I however replace lld-link.exe with link.exe (with the VS environment set up correctly) the linking will succeed. The linking will also succeed if I use this:

clang-cl.exe test.cpp /link /debug:full /pdb:test.exe:pdb

The error also does not depend on whether the PDB file is in the target binary or not. For example the path "/pdb:test.txt:pdb" will fail as well while it will succeed with both link.exe and clang-cl.exe even if test.txt does not exist yet.

rnk commented 4 years ago

Is the PDB file generated in a temporary location as well or at the final location?

May I ask for the rational why the binary is created in a temporary location? For example Visual Studio does not do this.

This is a pretty standard strategy for ensuring atomic file updates on Unix. Instead of rewriting the file in place, write the new output to a temp file next to the existing file, and atomically rename over the old file. Even if the process is killed or the computer is turned off, either the old file exists, or the new file exists.

Please also note my comment that using a completely unrelated file as target for the PDB (in a ADS) did not work either, so there is already a problem with using an ADS at all on Windows.

Yes. I suspect the APIs that we use to do the file rename just don't support ADSs, so it would apply to most LLVM tools.

I noticed that llvm-dis is not affected, probably because it doesn't use the ToolOutputFile class that handles the details of temporary file renaming.

We can leave this open, but by closing it, I only meant to indicate that we are unlikely to take action in the near term. I suspect we can't fix this defect without completely overhauling our output file writing strategy, and supporting ADSs is not high enough priority to motivate that. There are other reasons to stop using temporary files (they sometimes leak), so it is possible that we will for other reasons. At that point, ADS output paths might just start working.

I guess another way of recording that state would be to file a bug about temp file leakage, and mark this bug blocked on that bug. Personally, I haven't found the time to precisely document the conditions under which temp file leakage occurs, so that bug probably doesn't exist yet.

99a6217f-7cd1-48e2-b5c5-98c2bfb5991b commented 4 years ago

Is the PDB file generated in a temporary location as well or at the final location?

May I ask for the rational why the binary is created in a temporary location? For example Visual Studio does not do this.

Please also note my comment that using a completely unrelated file as target for the PDB (in a ADS) did not work either, so there is already a problem with using an ADS at all on Windows.

rnk commented 4 years ago

I had heard of alternate data streams from Mac, but I hadn't realized they were a feature of NTFS too until this issue.

Fundamentally LLD is trying to:

The call to TempFile::keep defined here returns an error: https://github.com/llvm/llvm-project/blob/d8981ce5b9f8caa567613b2bf5aa3095e0156130/llvm/lib/Support/Path.cpp#L1183

I think fixing it would be quite involved, and probably depends on changing our entire output file writing scheme. I don't think it's feasible.