llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.31k stars 11.69k forks source link

ninja "check-all" target creates directory named "%SystemDrive%" in build directory #54936

Open ormris opened 2 years ago

ormris commented 2 years ago

As far back as the llvm 13 release, the check-all target creates a directory called "%SystemDrive%" in the build directory when running on Windows. The contents are as follows.

%SystemDrive%/:
ProgramData

%SystemDrive%/ProgramData:
Microsoft

%SystemDrive%/ProgramData/Microsoft:
Windows

%SystemDrive%/ProgramData/Microsoft/Windows:
Caches

%SystemDrive%/ProgramData/Microsoft/Windows/Caches:
cversions.2.db                                                     {81FA3446-700E-4C68-8CFC-3266E1376F80}.2.ver0x0000000000000001.db
{6AF0698E-D558-4F6E-9B3C-3716689AF493}.2.ver0x0000000000000001.db  {DDF571F2-BE98-426D-8288-1A9A39C3FDA2}.2.ver0x0000000000000001.db

I wasn't able to reproduce this issue by running check-llvm, check-clang, check-lld, or check-sanitizer. Only check-all reproduced this.

Thanks in advance for any help on this one.

RKSimon commented 2 years ago

CC @RKSimon

ormris commented 2 years ago

Bisected this issue to the following commit. Unfortunately, running the tests for DirectoryWatcher does not reproduce the bug. I still have to run check-all to reproduce this behavior.

commit 76f1baa7875acd88bdd4b431eed6e2d2decfc0fe
Author: Saleem Abdulrasool <compnerd@compnerd.org>
Date:   Fri Jun 11 19:05:42 2021 -0700

    Revert "Revert "DirectoryWatcher: add an implementation for Windows""

    This reverts commit 0ec1cf13f2a4e31aa2c5ccc665c5fbdcd3a94577.

    Restore the implementation with some minor tweaks:
    - Use std::unique_ptr for the path instead of std::vector
      * Stylistic improvement as the buffer is already heap allocated, this
        just makes it clearer.
    - Correct the notification buffer allocation size
      * Memory usage fix: we were allocating 4x the computed size
    - Correct the passing of the buffer size to RDC
      * Memory usage fix: we were reporting 1/4th of the size
    - Convert the operation event to auto-reset
      * Bug Fix: we never reset the event
    - Remove `FILE_NOTIFY_CHANGE_LAST_ACCESS` from RDC events
      * Memory usage fix: we never needed this notification
    - Fold events for the notification action
      * Stylistic improvement to be clear how the events map
    - Update comment
      * Stylistic improvement to be clear what the RAII controls
    - Fix the race condition that was uncovered previously
      * We would return from the construction before the watcher thread
        began execution.  The test would then proceed to begin execution,
        and we would miss the initial notifications.  We now ensure that the
        watcher thread is initialized before we return.  This ensures that
        we do not miss the initial notifications.

    Running the test on a SSD was able to uncover the access pattern.  This
    now seems to pass reliably where it was previously flaky locally.
pogo59 commented 2 years ago

cc @compnerd for any insights on the original patch

compnerd commented 2 years ago

Hmm, the discrepancy between check-all and DirectoryWatcher test implies it is either the unit tests or more likely something else like clang-tools-extra that might be the cause. That seems like the environment strings are not being expanded before being given to the watcher? My guess is that it's a client side issue rather than the implementation.

ormris commented 2 years ago

My guess is that it's a client side issue rather than the implementation.

I've been able to reproduce this issue both on my personal and work machines. The configurations of those two machines are very different (other than being on a recent version of Windows 10). This makes me think that, while this behavior does appear to be Windows specific, this is unlikely to be a configuration issue.

compnerd commented 2 years ago

I think you misunderstood what client meant there - clients of the API (e.g. clangd). It might be useful to narrow it down further from the full test suite as it may yield further insights into what is happening.

ormris commented 2 years ago

I think you misunderstood what client meant there - clients of the API (e.g. clangd).

Ah makes sense.

clients of the API (e.g. clangd)

Does clangd use DirectoryWatcher? I'm not able to find any mention of the DirectoryWatcher header files in the monorepo, other than the unittests for the class. As mentioned above, the unittests don't reproduce the issue. I'm a bit at a loss for other things to try here, any suggestions would be welcome.

compnerd commented 2 years ago

I thought it did, but I might be wrong. The usage is going to be through the IndexStore and c-index-test IIRC.

ormris commented 2 years ago

I'm not able to find any use of DirectoryWatcher or a class named IndexStore in the current tree, but I did run an experiment that @pogo59 suggested. I removed the DirectoryWatcher unittest from the tree and ran a clean check-all. After removing this unittest, I'm no longer able to reproduce the issue. This seems to suggest that the unittests are triggering this behavior somehow, just not when run independently.

compnerd commented 2 years ago

Perhaps we can do something more clever - a combination of process monitor + windbg can help narrow down the codepath that is triggering this? The process monitor will identify the process and windbg can help catch it at the right point possibly?

ormris commented 2 years ago

OK, that sounds reasonable. Since we can tie this to a single executable, I should be able to capture everything we need.

dgg5503 commented 1 month ago

Hi @ormris ,

I believe this bug was fixed as part of https://github.com/llvm/llvm-project/pull/90478.

I've confirmed locally that %SystemDrive% does not appear in the root of my Ninja build directory on Windows after running check-all (tested from https://github.com/llvm/llvm-project/commit/a1af1de4380f9c4fa3b5229e9f4a41af93955c38).