Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Refactor all LLVM tools to use a common output file management library #33042

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR34070
Status NEW
Importance P enhancement
Reported by Reid Kleckner (rnk@google.com)
Reported on 2017-08-04 11:33:16 -0700
Last modified on 2017-08-14 18:00:29 -0700
Version trunk
Hardware PC Windows NT
CC ditaliano@apple.com, geek4civic@gmail.com, llvm-bugs@lists.llvm.org, rafael@espindo.la
Fixed by commit(s)
Attachments
Blocks
Blocked by PR34072
See also
LLVM tools generally produce output and write it to a file. However, they all
do it with varying levels of good behavior. Ideally, a tool has these
properties:

1. Atomicity: Regardless of how the tool exits (success, crash, SIGKILL) the
output file should either be complete and valid, or it does not exist.
2. No leaks: During normal operation (successful execution or crash, but not
SIGKILL), no temporary files should be left behind after the tool exits.

We need to beef up ToolOutputFile and FileOutputBuffer to have these
properties, and then refactor all LLVM tools to use them. Right now, neither
successfully deletes their temporary files on Windows because we keep the file
descriptors open.

Ideally, we should deprecate RemoveFileOnSignal and sink all this logic into
Support.
Quuxplusone commented 7 years ago
I think the ideal implementation strategy on Windows is to leverage
SetFileInformationByHandle to get the kernel to automatically delete the output
file if our handle closes:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365539(v=vs.85).aspx

When a process terminates, its handles are closed, and the deletion occurs.
With this strategy, we probably don't even need to do the file rename.

On Posix systems, we should continue doing the atomic rename.
Quuxplusone commented 7 years ago

LLVM has two classes for doing this today: tool_output_file and FileOutputBuffer. One is for clients that want a raw_fd_ostream, and one is for clients that want memory mapped I/O. We should factor out the common logic for managing the FD separately from the stream or mapped_file_region.

Quuxplusone commented 7 years ago

On Posix systems, we should continue doing the atomic rename.

And use O_TMPFILE when available :-)

Quuxplusone commented 7 years ago
Here's an example of how this affects tablegen when I interrupt my ninja builds:

FAILED:
lib/Target/Hexagon/CMakeFiles/LLVMHexagonCodeGen.dir/HexagonGenPredicate.cpp.obj
C:\PROGRA~2\MICROS~1.0\VC\bin\AMD64_~2\cl.exe  /nologo ...
C:\src\llvm-
project\build_x86_stage1\lib\Target\Hexagon\HexagonGenInstrInfo.inc(13669):
fatal error C1004: unexpected end-of-file found
[40/1700] Building CXX object
lib\Target\Hexagon\CMakeFiles\LLVMHexagonCodeGen.dir\HexagonGenMux.cpp.obj

This persists until I force tablegen to re-run by deleting bin/llvm-tblgen.exe.
If we had atomic output file writing for tablegen, I wouldn't have to do this.