sys::fs::status can fail on Windows with unexpected `permission_denied` error code

z2oh commented 7 months ago

This is another instance of the same root cause of https://github.com/llvm/llvm-project/issues/83046

On Windows, if a file is queried with GetFileAttributesW after it has been marked for deletion (but not yet deleted), the query will fail with the error code being set to ERROR_ACCESS_DENIED.

Apple's symbol index generation code writes index unit files out with temporary names, and then calls rename to place the file in the final location. Multiple clang processes may produce the same index unit file, and they race to write the file out to the final destination (overwriting is okay, because the unit file is the same). But before generating index information, the final destination file is first checked to see if it's up-to-date which would avoid doing work to generate duplicate information. This check is done by calling fs::status on the final destination file, and if this occurs while another clang process has just called to rename its temporary index file to the final destination file, there is a small window where the destination file is marked for deletion and fs::status returns an unexpected permission_denied error.

Ideally, I think the status function should detect this case on Windows and return file_not_found instead of permission_denied, but I initially had trouble finding a way to do this. I filed https://github.com/apple/llvm-project/pull/8577 to treat permission_denied as file_not_found in this specific case (which fixes the build failures), but this solution is not very satisfying in that it leaks Windows-specific idiosyncrasies.

However, a colleague pointed out that the underlying NTSTATUS code actually does disambiguate this case with 0xC0000056 STATUS_DELETE_PENDING (which is mapped to Win32 ERROR_ACCESS_DENIED). Querying this error code is not super straightforward (at least I couldn't find a straightforward way), but I was able to identify and predicate on this case in Windows-specific code with the following patch:

diff --git a/llvm/lib/Support/CMakeLists.txt b/llvm/lib/Support/CMakeLists.txt
index fbbb27a7c133..54d6d288e017 100644
--- a/llvm/lib/Support/CMakeLists.txt
+++ b/llvm/lib/Support/CMakeLists.txt
@@ -40,7 +40,7 @@ endif()
 if( MSVC OR MINGW )
   # libuuid required for FOLDERID_Profile usage in lib/Support/Windows/Path.inc.
   # advapi32 required for CryptAcquireContextW in lib/Support/Windows/Path.inc.
-  set(system_libs ${system_libs} psapi shell32 ole32 uuid advapi32)
+  set(system_libs ${system_libs} psapi shell32 ole32 uuid advapi32 ntdll)
 elseif( CMAKE_HOST_UNIX )
   if( HAVE_LIBRT )
     set(system_libs ${system_libs} rt)
diff --git a/llvm/lib/Support/Windows/Path.inc b/llvm/lib/Support/Windows/Path.inc
index 1b157fa52bad..10af4905e26c 100644
--- a/llvm/lib/Support/Windows/Path.inc
+++ b/llvm/lib/Support/Windows/Path.inc
@@ -24,10 +24,18 @@

 // These two headers must be included last, and make sure shlobj is required
 // after Windows.h to make sure it picks up our definition of _WIN32_WINNT
+#define WIN32_NO_STATUS
 #include "llvm/Support/Windows/WindowsSupport.h"
+#undef WIN32_NO_STATUS
+
+#include <winternl.h>
+#include <ntstatus.h>
 #include <shellapi.h>
 #include <shlobj.h>

+extern "C" NTSYSAPI NTSTATUS NTAPI RtlGetLastNtStatus();
+
 #undef max

 // MinGW doesn't define this.
@@ -769,8 +777,13 @@ std::error_code status(const Twine &path, file_status &result, bool Follow) {
     return ec;

   DWORD attr = ::GetFileAttributesW(path_utf16.begin());
-  if (attr == INVALID_FILE_ATTRIBUTES)
+  if (attr == INVALID_FILE_ATTRIBUTES) {
+    NTSTATUS last_nt_status = RtlGetLastNtStatus();
+    if (last_nt_status == STATUS_DELETE_PENDING)
+      return mapWindowsError(ERROR_FILE_NOT_FOUND);
     return getStatus(INVALID_HANDLE_VALUE, result);
+  }

   DWORD Flags = FILE_FLAG_BACKUP_SEMANTICS;
   // Handle reparse points.

I prefer this solution to the alternative because the additional complexity is nicely encapsulated, but it is not without caveats, namely having to link ntdll and a lack of confidence in compatibility w.r.t RtlGetLastNtStatus (I couldn't find any Microsoft documentation on this function).

Maybe I'm missing something and there's an easy way to inspect this error code? Is the cost of this solution as it stands too high?

cc @compnerd

compnerd commented 7 months ago

CC: @rnk @aganea

compnerd commented 7 months ago

I do find this to be a nicer solution as we are more precisely able to identify the scenario rather than guessing. I expect that ntdll.dll to be mapped in, so the additional linkage shouldn't be too expensive, but I know that @aganea did prefer to minimise the dependencies.

aganea commented 7 months ago

I do find this to be a nicer solution as we are more precisely able to identify the scenario rather than guessing. I expect that ntdll.dll to be mapped in, so the additional linkage shouldn't be too expensive, but I know that @aganea did prefer to minimise the dependencies.

Yes it is already mapped at runtime through advapi32. I think the cost for adding ntdll would be negligible.

My understanding is that RtlGetLastNtStatus() isn't exposed because it's internal detail that wraps NtCurrentTeb()->LastStatusValue. I think in most cases NtCurrentTeb()->LastStatusValue is in sync with LastErrorValue (which GetLastError() returns) but there's no guarantees.

One thing I dislike maybe with the above diff is that if status() returns ERROR_FILE_NOT_FOUND and in some other usage we create a file (with the same name) immediately afterwards, that creation might fail if the file wasn't deleted yet by the OS. The usual (silly) way of avoiding that is to loop until the GetFileAttributesW really tells you that the file is deleted. But that might take up to several seconds, it is all dependent on internal kernel threads/queues.

I suppose we could either return a new type file_type::being_deleted and handle the loop externally; or loop in status() if we detect STATUS_DELETE_PENDING and until GetLastError() returns ERROR_FILE_NOT_FOUND or something along those lines.

Also, the call to CreateFileW in status() could exhibit the same issue?

+@mstorsjo

z2oh commented 7 months ago

if status() returns ERROR_FILE_NOT_FOUND and in some other usage we create a file (with the same name) immediately afterwards, that creation might fail if the file wasn't deleted yet by the OS.

Isn't this true generally? In that there will always be a window between the return of status and the call to create file in which another process could have done something that would cause create file to fail? Though, I appreciate this is probably uncommon in practice and this patch would increase the likelihood of this case.

or loop in status() if we detect STATUS_DELETE_PENDING and until GetLastError() returns ERROR_FILE_NOT_FOUND or something along those lines.

This seems reasonable. I'd expect that window to be small, so the call is likely to succeed in a few iterations (or there's an actual error). I'll give this a try and make sure that assumption is true.

Also, the call to CreateFileW in status() could exhibit the same issue?

Hmm good point. I never encountered this in my testing.

rnk commented 7 months ago

I like the idea of using NtCurrentTeb()->LastStatusValue instead to get at the lower level NT error code. I think treating files pending deletion as "not found" is reasonable.

The only way I can see this going wrong is if a program checks for existence, gets file-not-found, and then proceeds to attempt to create a file at this location rather than attempting a different file name. However, such a program should recover from failing to create a unique file anyway after the file creation attempt, so that's not a big concern.

mstorsjo commented 7 months ago

I think I also would favour NtCurrentTeb()->LastStatusValue rather than explicitly linking against an undeclared function from ntdll.

I wonder if @cjacek have any wisdom to share here, about what's portable in practice and what's not, when it comes to constructs like these.

cjacek commented 7 months ago

Generally yes, I think that NtCurrentTeb()->LastStatusValue should be safe. I used Wine testing infrastructure to confirm that, see: https://testbot.winehq.org/JobDetails.pl?Key=145102. I modified existing Wine tests to check LastStatusValue (see patch.diff on that page) and it behaves as expected on all Windows versions from 7 to 11.

ntdll.dll is loaded to all Windows processes, that's where the startup code lives and that's what makes syscalls possible on Windows. Many of its functions are undocumented and claimed as "internal", but in practice so many applications depend on them one way or another that it's unrealistic for future Windows to change that fundamentally. The same applies to TEB layout.

RtlGetLastNtStatus is just a tiny helper that returns LastStatusValue from TEB. LastStatusValue is set whenever NT-style error status is converted to win32-style error using RtlNtStatusToDosError. Most of kernel32 functions are in fact relatively thin wrappers around ntdll functions and need to perform error code conversion on error. GetFileAttributesW does exactly that.

(Wine is based on black box testing, so the actual Windows code almost surely looks different, but externally observed behavior should match; compatibility varies between functions, but this one is extremely popular, so it's pretty well tested and it's relatively safe to assume that Windows is similar enough).

BTW, the delay in deletion comes from Windows semantic that the file is not removed as long as any process has any handle to the file still open, so it may be postponed indefinitely. (And when the last handle is closed, I'd expect deletion to be reflected in VFS immediately, without an additional time window when no handles are opened but file still being considered as in a process of removing, but I didn't test that to confirm).

z2oh commented 6 months ago

Thank you so much for the insight! I'll switch to using NtCurrentTeb()->LastStatusValue and file a PR.

z2oh commented 6 months ago

I hit a snag here: the TEB structure in the Windows SDK doesn't expose its members, unlike the one defined in Wine. Given this, perhaps its better to call RtlGetLastNtStatus? Although I appreciate the downsides here. I am surprised at how tricky it is to get at the underlying NTSTATUS.

aganea commented 6 months ago

I would assume the error occurs on this line? https://github.com/apple/llvm-project/blob/next/clang/lib/Index/IndexUnitWriter.cpp#L247

With the domain knowledge, the llvm::errc::permission_denied seems to make sense in this case, it simply means "someone else (another process) is taking care of that unit file, I (this process) don't have to worry about it". In that case it'd best maybe to return true?

  if (std::error_code EC = llvm::sys::fs::status(UnitPath.c_str(), UnitStat)) {
    if (EC == llvm::errc::permission_denied)
      return true;
    if (EC != llvm::errc::no_such_file_or_directory) {
      llvm::raw_string_ostream Err(Error);
      Err << "could not access path '" << UnitPath
          << "': " << EC.message();
      return std::nullopt;
    }
    return false;
  }

That error could mean other things, like "I don't have ACL permissions to access that file" but regardless there's nothing more that this process can do in that case.

Now like you suggest, we could check RtlGetLastNtStatus() and act on STATUS_DELETE_PENDING. But if we return llvm::errc::no_such_file_or_directory from status() in that case, the current process will do more work uselessly, since the other process is probably in currently moving out of the way the old unit file, to then move the temporary file into place. That is, https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/Windows/Path.inc#L519-L523. But maybe it's ok to occasionally do double the work, in two different processes.

It feels overall that there's a higher level synchronization mechanism missing in the index generation. The process that wants to write the unit should acquire a global mutex/named semaphore, to advertise other processes that it is writing that unit, and others should not attempt to write it.

Getting back to status(), from an API level, if we returned llvm::errc::no_such_file_or_directory could be a bit misleading. Like @cjacek pointed out above, it could take a while until the file is deleted. If we advertise "the file is not there", other API consumers might want to create that file, and that creation might fail if the file is still pending to delete. I find for those cases llvm::errc::permission_denied to better reflect the reality. One thing we could do however since we know about STATUS_DELETE_PENDING, is looping/blocking in status() for a bit, like rename_handle() does, in a hope that the status will change to something else. Which it probably will in 99% of the cases. But the caller will have to deal with llvm::errc::permission_denied anyway if we want 100% certainty. Or we could just return a new error code ("delete_pending") that better reflects the reality, and lets API users deal with this uncertainty.

z2oh commented 6 months ago

Yes, that is the line in question and your patch would work. This is similar to https://github.com/apple/llvm-project/pull/8577, although returning true is better than my approach to treat the errors the same way.

I'm hesitant about the second interpretation of the error:

That error could mean other things, like "I don't have ACL permissions to access that file" but regardless there's nothing more that this process can do in that case.

It's true that there's nothing else the process could do here, but this error condition should probably be fatal, unlike the ambiguous delete_pending case. In this particular care, there will be a failure elsewhere when trying to actually write out the index file if the process really doesn't have permissions, but it seems wrong to me to generally to conflate these two conditions (as the Windows API does).

I agree it's not great to mislead consumers by reporting that the file is deleted when it is not, especially when it may never be deleted due to a stuck process or similar. Looping in status to cover 99% of the cases seems reasonable from an API perspective. If the loop completes, I think it's fair for the caller to fail if the error is actually unexpected (like in this case). Returning a new error code (delete_pending) here is even better. As an implementation detail, looping like this seems undesirable, but perhaps it's the best way. As you pointed out, this strategy is already used elsewhere with success.

I will prepare a patch with the looping solution returning a new delete_pending code if the loop completes unsuccessfully. Still open to any and all feedback/discussion.

To summarize the potential solutions:

Solution	Notes
Do nothing; callers must handle `permission_denied`	This conflates the `pending_delete` case with bona fide permission issues. This behavior is specific to the Windows implementation of the `status` function.
If the last ntstatus is `STATUS_DELETE_PENDING`, return `no_such_file_or_directory`	This can mislead API consumers because the file is not actually deleted, and a subsequent write may fail. Write failures should be handled anyway, as even if the file is truly deleted, another process could still interfere in the window before writing. That said, a stuck process could be holding on to the pending-deletion file handle, which would mean the return of `status` does not at all accurately reflect the state of the file.
If the last ntstatus is `STATUS_DELETE_PENDING`, keep trying for a while to see if the file is successfully deleted. If not, return a new error code `delete_pending`.	This will catch most cases where there is a race to rename some destination file. If a process is stuck, the `delete_pending` case is returned and accurately reflects the state of the file. Checking this in a loop is a bit undesirable, but has been used with success elsewhere in the filesystem API.
If the last ntstatus is `STATUS_DELETE_PENDING`, keep trying for a while to see if the file is successfully deleted. If not, continue to return `permission_denied`.	Same as above, but doesn't require a new error code to be handled.
If the last ntstatus is `STATUS_DELETE_PENDING`, immediately return a new error code `delete_pending`.	This avoids the looping of the above two solutions without loss-of-information in the return. The caller can loop if they want, but I suspect in reality this error code would often not be handled and races would continue to be fatal.

As a note on TEB and RtlGetLastNtStatus(), the offset of LastStatusValue in the TEB structure has been stable since at least Windows NT 3.1 (https://www.geoffchappell.com/studies/windows/km/ntoskrnl/inc/api/pebteb/teb/index.htm) so that's unlikely to change. That said, Chromium and Gecko both use RtlGetLastNtStatus() to avoid having to rely on this offset, so I think that is a better approach: https://github.com/mozilla/gecko-dev/blob/f6e3b81aac49e602f06c204f9278da30993cdc8a/ipc/chromium/src/chrome/common/ipc_channel_win.cc#L522C1-L541C2

llvm / llvm-project

sys::fs::status can fail on Windows with unexpected `permission_denied` error code #89137