python / cpython

The Python programming language
https://www.python.org

os.scandir() Windows bug dir_entry.stat() not works on file during writing. #85278

Open 3b1a8a2b-901f-4460-984e-861ec2eb671a opened 4 years ago

3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago
BPO 41106
Nosy @pfmoore, @tjguk, @zware, @eryksun, @zooba
Files
  • s03_dir_entry.py
  • s03_dir_entry.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['3.9', '3.10', 'OS-windows', 'type-crash', 'docs']
    title = 'os.scandir() Windows bug dir_entry.stat() not works on file during writing.'
    updated_at =
    user = 'https://bugs.python.org/CezaryWagner'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'eryksun'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation', 'Windows']
    creation =
    creator = 'Cezary.Wagner'
    dependencies = []
    files = ['49259', '49260']
    hgrepos = []
    issue_num = 41106
    keywords = []
    message_count = 17.0
    messages = ['372264', '372266', '372269', '372273', '372281', '372283', '372354', '372355', '372359', '372389', '372393', '372404', '372430', '372432', '372434', '372748', '372763']
    nosy_count = 7.0
    nosy_names = ['paul.moore', 'tim.golden', 'docs@python', 'zach.ware', 'eryksun', 'steve.dower', 'Cezary.Wagner']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue41106'
    versions = ['Python 3.9', 'Python 3.10']
    ```

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    I have a problem detecting changes to a log file while it is being written under Windows (both a normal filesystem and a Windows share). Probably the Windows API calls are made in the wrong order, but I have no idea.

    A test program is attached so you can reproduce it. Try os.scandir() without os.stat() and with os.stat().

    The source code responsible is probably in https://github.com/python/cpython/blob/master/Modules/posixmodule.c, but I do not understand the CPython code.

    Here is the full description; many tests were done.

        # os.scandir() Windows bug: dir_entry.stat() does not work on a file during writing.
        # An example of such a file is an application log.
        # No problem with os.stat().

        # Calling os.stat() before os.scandir() -> dir_entry.stat() is a workaround.
        # Opening the file in another program during writing "fixes" dir_entry.stat().
        # Getting the properties of the open file during writing "fixes" dir_entry.stat().

        # Notice that I run os.scandir() separately each time, so dir_entry.stat() is not cached.

        # Steps to reproduce the missing modification update:
        # 1. Close all Explorer windows and other applications using PATH (it has an impact).
        # 2. Set PATH to the test folder; it can be a directory or a Windows share.
        # 3. Run the program with DO_STAT = False.
        #
        # Alternative steps (an external application forces a valid modification date):
        # 4. Running 'touch' or 'echo' on the file should "fix" the problem.
        #    'echo' will throw an error, but that does not matter.
        #
        # Alternative scenario (os.stat() forces a valid modification date; very slow):
        # 3. Run the program with DO_STAT = True. No problems.
        #
        # Error result:
        # The modification date from dir_entry.stat() is stalled (not changing after modification)
        # if neither os.stat() nor another Windows application reads the file.
        #
        # Expected result:
        # The modification date from dir_entry.stat() is updated by separate os.scandir() calls,
        # or cached if it is the same os.scandir() call.
        #
        # Notice that os.scandir() must be called before dir_entry.stat() to avoid caching,
        # as described in the documentation.
        # This is done, but it does not work on files during writing.
        #
        # Ask questions if you have any, since this bug is very hard to find.

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    Extra file for tests with:

    DO_STAT = False

    It shows no changes, although the file was written every second. If os.stat() is run between the os.scandir() calls, everything works.

        C:\root\Python38\python.exe C:/Users/Cezary.Wagner/PycharmProjects/dptr-monitoring-v2/sandbox/python/s13_dir_entry/s03_dir_entry.py
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 0.0009987354278564453
        2020-06-24 18:57:52.911980 1593017872.91198
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 1.0078418254852295
        2020-06-24 18:57:53.918823 1593017873.918823
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 2.0103507041931152
        2020-06-24 18:57:54.921332 1593017874.921332
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 3.023340940475464
        2020-06-24 18:57:55.934322 1593017875.934322
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 4.036783933639526
        2020-06-24 18:57:56.947765 1593017876.947765
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 5.049667835235596
        2020-06-24 18:57:57.960649 1593017877.960649
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 6.063947916030884
        2020-06-24 18:57:58.974929 1593017878.974929
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 7.0797247886657715
        2020-06-24 18:57:59.990706 1593017879.990706
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 8.091670751571655
        2020-06-24 18:58:01.002652 1593017881.002652
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 9.1053147315979
        2020-06-24 18:58:02.016296 1593017882.016296
        dir_entry.stat() T:\\test.txt 1593017872.9109812 since last change 10.120086908340454
        2020-06-24 18:58:03.031068 1593017883.031068

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    One more hint.

    In a newly started process, os.scandir() gives an invalid modification date for a file open for writing until an external tool (Explorer, touch, etc.) is called.

    So (the log is open for writing and a write is done between steps 1 and 2):

    1. Run program with os.scandir() -> dir_entry.stat().st_mtime = t1.
    2. Run program with os.scandir() -> dir_entry.stat().st_mtime = t1. Modification is stalled.

    Another scenario (the log is open for writing and a write is done between steps 1 and 3):

    1. Run program with os.scandir() -> dir_entry.stat().st_mtime = t1.
    2. Run touch on dir_entry.path.
    3. Run program with os.scandir() -> dir_entry.stat().st_mtime = t2. Modification works.

    zooba commented 4 years ago

    I'm going to have to spend more time analysing this (later), but it seems that Windows is deciding not to update the directory's data structures (containing the st_mtime retrieved by scandir) as long as the file is still open.

    I suspect the answer for your scenario is that you'll just have to use os.stat() to get the information from the file's entry, rather than the directory's entry. It's unlikely there's anything we can do at Python's level without sacrificing all the performance gains of scandir() for all other scenarios.
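A minimal sketch of that workaround, assuming a polling monitor that only needs modification times: os.scandir() is used just to enumerate names, and os.stat() is called on each path for the timestamp. The helper name `fresh_mtimes` is hypothetical and not part of the os module.

```python
import os


def fresh_mtimes(directory):
    """Map each regular file's name in `directory` to its current st_mtime.

    os.scandir() only lists the names; the timestamp comes from os.stat(),
    which queries the file itself rather than the directory entry.
    """
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_file():
                    # This is the slower but up-to-date call.
                    result[entry.name] = os.stat(entry.path).st_mtime
            except OSError:
                # The file vanished or is inaccessible; skip it.
                continue
    return result
```

This gives up the speed advantage of DirEntry.stat() for every file it touches, which is the trade-off described above.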

    eryksun commented 4 years ago

    In FSBO [1] section 6 "Time Stamps", note that the LastWriteTime value gets updated when an IRP_MJ_FLUSH_BUFFERS is processed. In the Windows API, this is a FlushFileBuffers [2] call. In the C runtime, it's a _commit [3] call, which is an os.fsync [4] call in Python. Calling the latter will update the directory entry for the file.

    For an example implementation in the FAT32 filesystem, see FatCommonFlushBuffers [5]. Note in the UserFileOpen case that it flushes any cached data via FatFlushFile and then updates the directory entry from the file control block (FCB) via FatUpdateDirentFromFcb, and finally it flushes the parent directory control blocks (DCBs) -- and possibly also the volume.

    Example with os.fsync:

        import os
        import time
        import datetime
    
        UPDATE_DIR = True
    
        FILEPATH = 'C:/Temp/test/spam.txt'
    
        def scan(filepath):
            dir_path, filename = os.path.split(filepath)
            with os.scandir(dir_path) as iter_dir:
                for entry in iter_dir:
                    if entry.name == filename:
                        return entry
    
        with open(FILEPATH, 'w') as f:
            while True:
                print('spam', file=f, flush=True)
                if UPDATE_DIR:
                    os.fsync(f.fileno())
                entry = scan(FILEPATH)
                stat_result = entry.stat()
                now = datetime.datetime.now()
                print(f'st_mtime: {stat_result.st_mtime:0.3f}, '
                      f'delta_t: {now.timestamp() - stat_result.st_mtime:0.3f}')
                time.sleep(1.0)

    [1] https://go.microsoft.com/fwlink/?LinkId=140636
    [2] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-flushfilebuffers
    [3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/commit?view=vs-2019
    [4] https://docs.python.org/3/library/os.html#os.fsync
    [5] https://github.com/microsoft/Windows-driver-samples/blob/9afd93066dfd9db12f66099cf9ec44b6fd734b2d/filesys/fastfat/flush.c#L145

    zooba commented 4 years ago

    Does it make the most sense for us to make .flush() also do an implicit .fsync() (when it's actually a file object)?

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    I ran some tests on Linux and everything works: changes are detected and os.scandir() works. On Windows it does not. Probably there is no unit test that checks whether os.scandir() works on files open for writing.

    f.flush() does not matter, since the file can be changed by an external application written in Python, Java, C#, C++, and so on; anyone can write logs on Windows. I will explain this in the next comment. I wrote this code only to show the problem.

    Result from Linux with DO_STAT = False, so only the os.scandir() calls are repeated. Modifications are detected correctly on Linux but not on Windows.

        [wagnecaz@nsdptrms01 ~]$ python3 s03_dir_entry.py
        dir_entry.stat() /home/wagnecaz/test.txt 1593085189.1000397 since last change 0.001112222671508789
        2020-06-25 13:39:49.101368 1593085189.101368
        dir_entry.stat() /home/wagnecaz/test.txt 1593085189.1000397 since last change 1.0028572082519531
        2020-06-25 13:39:50.103111 1593085190.103111
        dir_entry.stat() /home/wagnecaz/test.txt 1593085190.1020408 since last change 1.0026073455810547
        2020-06-25 13:39:51.104881 1593085191.104881
        dir_entry.stat() /home/wagnecaz/test.txt 1593085191.104042 since last change 1.0023958683013916
        2020-06-25 13:39:52.106793 1593085192.106793
        dir_entry.stat() /home/wagnecaz/test.txt 1593085192.106043 since last change 1.0023260116577148
        2020-06-25 13:39:53.108582 1593085193.108582
        dir_entry.stat() /home/wagnecaz/test.txt 1593085193.1080444 since last change 1.0021436214447021
        2020-06-25 13:39:54.110500 1593085194.1105
        dir_entry.stat() /home/wagnecaz/test.txt 1593085194.1100454 since last change 1.0013866424560547
        2020-06-25 13:39:55.111684 1593085195.111684
        dir_entry.stat() /home/wagnecaz/test.txt 1593085195.1110466 since last change 1.0022354125976562
        2020-06-25 13:39:56.113542 1593085196.113542
        dir_entry.stat() /home/wagnecaz/test.txt 1593085196.1130476 since last change 1.0021603107452393
        2020-06-25 13:39:57.115450 1593085197.11545
        dir_entry.stat() /home/wagnecaz/test.txt 1593085197.1140487 since last change 1.003014326095581
        2020-06-25 13:39:58.117287 1593085198.117287
        dir_entry.stat() /home/wagnecaz/test.txt 1593085198.11605 since last change 1.002938985824585
        2020-06-25 13:39:59.119224 1593085199.119224
        dir_entry.stat() /home/wagnecaz/test.txt 1593085199.118051 since last change 1.0027978420257568
        2020-06-25 13:40:00.121166 1593085200.121166

    The change is made every second and is detected on Linux; on Windows it is stalled.

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    Use case: detecting changes in open files is very important for log scanning, synchronization, and so on.

    I think that, first of all, a good unit test is needed to detect this problem. It is a rare edge case that was probably missed because it is hard to imagine that this would fail when the file is open; I think I would have missed it too.

    It should work like this.

    A first program writes a file under Windows and a second program (the unit test) runs os.scandir(). If repeated os.scandir() calls detect the changes, it is O.K. (same as on Linux).

    To make it simpler, the unit test can be a single program.

    1. Open test file in test directory.
    2. os.scandir() in test directory.
    3. Some writes to test file (f.write() with and without flush, ... - to be defined what is sufficient to test).
    4. os.scandir() in test directory - if a change is detected, it is O.K.
    5. f.close()
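The five steps above could be sketched roughly as follows. This is only an illustration, not an existing CPython test: the helper names `scandir_mtime` and `mtimes_around_write` and the `sync` parameter are hypothetical.

```python
import os
import tempfile


def scandir_mtime(directory, name):
    """Return st_mtime for `name` as reported by a fresh os.scandir() pass."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name == name:
                return entry.stat().st_mtime
    raise FileNotFoundError(name)


def mtimes_around_write(sync=False):
    """Return (before, after) scandir mtimes around a write to an open file."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "test.txt")
        with open(path, "w") as f:                         # step 1: open the file
            before = scandir_mtime(directory, "test.txt")  # step 2: first scan
            f.write("spam\n")                              # step 3: write
            f.flush()
            if sync:
                os.fsync(f.fileno())                       # optional commit
            after = scandir_mtime(directory, "test.txt")   # step 4: second scan
        # step 5: the file is closed when the with block exits
    return before, after
```

A real test would also need to wait longer than the filesystem's timestamp resolution between steps 2 and 3 before asserting `after > before`; that wait is left out here to keep the sketch short.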

    I do not know the Windows API at the moment, but I think we can detect whether the directory changed between scans, or whether a file is open. An open file is a rare edge case; in 90% of cases all files will be closed.

    So if all files are closed, the current os.scandir() is maybe good (note that I do not understand the implementation well enough to evaluate it correctly), and when one or more files are open, another implementation is needed that will detect the modification.

    If you think I missed something, please comment. You are welcome.

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    I read some comments: f.flush() or os.fsync() can be unrelated to the problem. The external application can be written in C# or whatever you want.

    Under Windows (not Linux), modification dates will be stalled in this sequence:

        os.scandir()
        dir_entry.stat()           # let it be dir_entry.path == 'test.txt'
        dir_entry.stat().st_mtime  # will be for example 1
        os.scandir()
        dir_entry.stat()           # let it be dir_entry.path == 'test.txt'
        dir_entry.stat().st_mtime  # will be STALLED for example 1

    Under Windows (not Linux), modification dates will be refreshed in this sequence:

        os.scandir()
        dir_entry.stat()           # let it be dir_entry.path == 'test.txt'
        dir_entry.stat().st_mtime  # will be for example 1
        os.stat('test.txt')        # this call does something, and the next call is not stalled
        os.scandir()
        dir_entry.stat()           # let it be dir_entry.path == 'test.txt'
        dir_entry.stat().st_mtime  # will be CHANGED for example 2
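One way to observe the difference between the two sequences is to compare, for each entry, the values reported by dir_entry.stat() with those from os.stat(). The sketch below assumes that a mismatch in modification time or size is enough to call an entry stale; `find_stale_entries` is a hypothetical helper name.

```python
import os


def find_stale_entries(directory):
    """Return names whose scandir() stat data disagrees with os.stat()."""
    stale = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                # Value associated with the directory listing.
                cached = entry.stat(follow_symlinks=False)
                # Value queried from the file itself.
                current = os.stat(entry.path, follow_symlinks=False)
            except OSError:
                # The file vanished between the two calls; ignore it.
                continue
            if (cached.st_mtime_ns, cached.st_size) != (
                current.st_mtime_ns,
                current.st_size,
            ):
                stale.append(entry.name)
    return stale
```

On Linux this would normally return an empty list, because dir_entry.stat() performs a real stat call there. A file written between the two calls would also show up as a mismatch, so the result is only a hint.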

    eryksun commented 4 years ago

    > Does it make the most sense for us to make .flush() also do an implicit .fsync() (when it's actually a file object)?

    Standard I/O in the Windows C runtime supports a "c" commit mode that causes fflush to call _commit() on the underlying fd [1]. Perhaps Python should support a similar "c" or "s" mode that makes a flush implicitly call fsync / _commit.
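Until something like that exists, the effect can be approximated in user code. The following is a rough sketch, assuming the writer is under your control; `CommitTextWriter` and `open_commit` are hypothetical names, not an existing or proposed CPython API.

```python
import io
import os


class CommitTextWriter(io.TextIOWrapper):
    """Text writer whose flush() also commits the file via os.fsync()."""

    def flush(self):
        super().flush()              # push Python-level buffers to the OS
        if not self.closed:
            os.fsync(self.fileno())  # ask the OS to commit (_commit on Windows)


def open_commit(path, mode="w", encoding="utf-8"):
    """Open `path` for text writing ('w' or 'a') with commit-on-flush."""
    buffered = open(path, mode + "b")  # buffered binary layer
    return CommitTextWriter(buffered, encoding=encoding)
```

With this, `print('spam', file=f, flush=True)` on a file from `open_commit()` also calls os.fsync(), at the cost of a commit on every explicit flush.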

    But you may not be in control of flushing the file if it's being written to by a third-party library or application. Calling os.[l]stat works around the problem, but only with NTFS. It doesn't help with FAT32 / exFAT.

    FAT filesystems update the last-write time when the file object is flushed or closed. It depends on the FO_FILE_MODIFIED flag in the file object or the CCB_FLAG_USER_SET_LAST_WRITE (from SetFileTime) in the file object's context control block (CCB). But opening, and even flushing, a file doesn't synchronize the context of other opens. Thus one can call os.stat (not even a scandir problem) repeatedly on a file and observe st_size changing while st_mtime remains constant:

        >>> filepath = 'C:/Mount/TestFat32/test/spam.txt'
        >>> f = open(filepath, 'w')
        >>> s = os.stat(filepath); s.st_size, s.st_mtime
        (0, 1593116028.0)
    
        >>> print('spam', file=f, flush=True)
        >>> s = os.stat(filepath); s.st_size, s.st_mtime
        (6, 1593116028.0)

    The last-write time gets updated by closing or flushing the kernel file object that was used to write to the file.

        >>> os.fsync(f.fileno())
        >>> s = os.stat(filepath); s.st_size, s.st_mtime
        (6, 1593116044.0)

    Another problem is stale entries for NTFS hard links, which can lead to getting a completely incorrect stat result via os.scandir -- wrong timestamps, wrong file size, and wrong file attributes.

    An NTFS file's MFT record contains its timestamps, size, and attributes in a $STANDARD_INFORMATION attribute. This reliable information is what os.[l]stat and os.fstat query. But it gets duplicated in per-link $FILE_NAME attributes that directories index. The duplicated info for a link gets synchronized to the standard info when the link is accessed, but other links to the file do not get updated, and their values may be completely wrong. For example (using the scan function from my previous post):

        >>> filepath1 = 'C:/Mount/TestNtfs/test/spam1.txt'
        >>> filepath2 = 'C:/Mount/TestNtfs/test/spam2.txt'
        >>> f = open(filepath1, 'w')
        >>> os.link(filepath1, filepath2)
        >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
        (0, 1593116055.7695396)
    
        >>> print('spam', file=f, flush=True)
        >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
        (0, 1593116055.7695396)
    
        >>> os.fsync(f.fileno())
        >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
        (0, 1593116055.7695396)
    
        >>> f.close()
        >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
        (0, 1593116055.7695396)

    As shown, flushing or closing the file object for the "spam1.txt" link is not reflected in the entry for the "spam2.txt" link. The directory entry for the link is only updated when the link is accessed:

        >>> f = open(filepath2)
        >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
        (6, 1593116062.2080283)

    [1] Linking commode.obj should enable commit-mode by default. But it's broken because __acrt_stdio_parse_mode is buggy. It initializes _stdio_mode to the global _commode value, but then it clobbers it when setting the required "r", "w", or "a" open mode.

    zooba commented 4 years ago

    Okay, so it sounds like there's a class of files where we can't rely on the FindFileData having the right values. But we get enough information to be able to just suppress the caching behaviour for those, right?

    Basically, my criteria for fixing this in the runtime is that we should not add any new system calls during iteration, and cannot switch to always bypassing the cache for DirEntry.stat().

    What this probably means is if we can detect a link from the FFD struct (which I think we can?) then we can cache the attributes we trust and send .stat() through the real call.

    What it also means is that the "file still in use by another app" scenario will probably have to manually use os.stat(). We can't detect it, and it's the same race condition as calling os.stat() shortly before the update flushes anyway.

    I won't accept having to make a second set of system calls on every file just in case one of them is being modified by another application. That's not the normal case, and the point of scandir is to improve performance in the normal enumeration cases.

    Updating the documentation to mention/emphasise that some DirEntry.stat() fields may not update immediately, and so using os.stat() for current data is required, may be helpful. Though I think that's already implied by the line that says "Call os.stat() to fetch up-to-date information."

    So if someone wants to improve the docs, or has a way to recognise links (with unreliable data in the directory listing) and not pre-fill the stat object, feel free to submit a PR. Otherwise, unfortunately, we're pretty much bound by Windows's own optimisations here.

    eryksun commented 4 years ago

    > What it also means is that the "file still in use by another app" scenario will probably have to manually use os.stat(). We can't detect it, and it's the same race condition as calling os.stat() shortly before the update flushes anyway.

    FAT filesystems require an fsync (FlushFileBuffers) or close on the in-use file in order to update the last-write time in both the directory entry and the file control block (i.e. FCB, which is shared by all opens). It seems the developers take the meaning of "last write" literally in terms of the last time that cached data was flushed to disk. Because the last-write time in the FCB is updated separately from the file size in the FCB, even an [l]stat on an in-use FAT file may see st_size change while st_mtime remains constant, as I showed in the previous post. No matter whether we query the directory or the FCB, the reported last-write time of a FAT file might be wrong from the standpoint of reasonable expectations.

    An fsync call is also useful with NTFS, but it only updates the directory entry of the opened link. It doesn't update other links to the file. On the other hand, with an NTFS file, calling os.[l]stat or os.fstat is sufficient to get updated stat information, regardless of the link that's accessed.

    > What this probably means is if we can detect a link from the FFD struct (which I think we can?) then we can cache the attributes we trust and send .stat() through the real call.

    It would be nice if we could detect the link count without an additional system call. But it's not in the duplicated information in the directory entry and wouldn't be reliable if it were. The link count is available via GetFileInformationByHandleEx: FileStandardInfo, but if you're calling CreateFileW to open the file, you may as well get the full stat result while you're at it.

    We're faced with the choice between either always calling the real lstat, or just documenting that files with hard links will have stale information if the file was updated using another link.

    zooba commented 4 years ago

    > We're faced with the choice between either always calling the real lstat, or just documenting that files with hard links will have stale information if the file was updated using another link.

    That's an easy choice: we document it.

    The os module comes with the assumption that platform-specific behaviour may vary, so this is really just a helpful note about a known variation on Windows. It is not a warning, just a note.

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    I think we can assume that NTFS is the priority since it is the most used option.

    I cannot discuss FAT32 or FAT since I am not an expert in this domain (nor in NTFS, for now). Still, I think the system must do some bookkeeping for open files to avoid conflicts, so it can be tracked, but how?

    A possible solution is some extra function or argument for Windows that marks the cache dirty between calls.

    It is a very dirty proposal and I need to think about whether it is good. Even the names used are ugly; I need to think more about it. My imagination tells me that it can be a good direction.

    dir_entry.stat(nt_force_cache_refresh=True) - it can be good for specific entries.
    os.scandir(nt_force_cache_refresh=True) - it is sometimes not needed for all entries.

    I am thinking: dir_entry.stat(nt_force_cache_refresh=True) should be faster than calling os.stat(dir_entry.path) in place of dir_entry.stat(), which does not work for open files. os.scandir(nt_force_cache_refresh=True) should be faster than dir_entry.stat(nt_force_cache_refresh=True), and dir_entry.stat() will then work for open files. It is simpler to understand that Windows is different if such an extra argument must be added at all.

    nt_force_cache_refresh can add to dir_entry some information that .stat() should not use the cache.

    Best of all would be not to need nt_force_cache_refresh for open files; maybe you will find a way to detect files open in an external application. I think the Windows API allows checking whether a file is open; as far as I remember, the Sysinternals tools can do this, so there is some API for it, I think.

    See this tool: https://docs.microsoft.com/en-us/sysinternals/downloads/handle - maybe there is source code for it or you can learn from it.

    Maybe you can check whether the file is open using this API before dir_entry.stat().

    I do not want to force any solution, just to share some rough ideas.

    zooba commented 4 years ago

    Those are all good ideas, but using os.stat(d) instead of d.stat() is shorter, more reliable, more compatible, and already works.

    There's no middle ground where DirEntry can be faster, because it's already using that middle ground. All the discussion between Eryk and myself was figuring out whether we can use the DirEntry/FindFileData information to tell whether the file needs an explicit stat() or not, and we can't.

    Most of the performance impact of stat() is just in opening the file (which scandir() does not do). As soon as we have to directly access the file, we may as well get all the information from it. We're already getting all the "cheap" information we can.

    3b1a8a2b-901f-4460-984e-861ec2eb671a commented 4 years ago

    As far as I know, os.stat() resets d.stat(). Maybe an option should be added to d.stat() to force an update: d.stat(nt_force_update=True).

    I am not sure whether os.path.getmtime() can reset d.stat().

    os.stat() is 2x slower than os.path.getmtime(), and os.path.getmtime() is 16x slower than d.stat(). The MAJOR PROBLEM is the PERFORMANCE of os.stat(): for directories with 1000 files it takes many seconds to read all the stats. Something is wrong here, I think, since Windows Explorer does it very fast.
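Figures like these depend heavily on the filesystem, caching, and the order in which the calls are made, so they are worth re-measuring on the directory in question. A rough timing harness might look like the sketch below; `time_stat_variants` is a hypothetical helper, and it assumes no file is deleted while it runs.

```python
import os
import time


def time_stat_variants(directory):
    """Time three ways of reading st_mtime for every regular file."""
    # Enumerate first so that only the stat calls are timed.
    with os.scandir(directory) as it:
        entries = [entry for entry in it if entry.is_file()]

    def timed(read_mtime):
        start = time.perf_counter()
        for entry in entries:
            read_mtime(entry)
        return time.perf_counter() - start

    # Elapsed seconds per variant; later variants may benefit from OS caching.
    return {
        "DirEntry.stat": timed(lambda e: e.stat().st_mtime),
        "os.path.getmtime": timed(lambda e: os.path.getmtime(e.path)),
        "os.stat": timed(lambda e: os.stat(e.path).st_mtime),
    }
```

Running it several times and in different orders gives a fairer picture than a single pass.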

    So I cannot use ONLY os.stat(), and it complicates the code, since I need to use os.stat() after d.stat() only if the file is OLDER THAN some limit; if I use os.stat() everywhere, most of the program's time will be spent in these calls.

    Do you know which code performs this reset of d.stat()?

    If no optimization is possible, then a DOCUMENTATION update is needed, because it is really hard to understand why this does not work under Windows. Some REMARKS would help me and others.

    I still believe that some optimization is possible for Windows.

    Maybe the stats could be forced to be read by os.scandir(force_scan_stat=True), so that all directory entries have cached stats before d.stat() is called. It could be faster, I think, since there are fewer calls from Python and there is probably a better Windows API for it, and the same for Linux.

    I will study the C code later to see whether it is possible, or write a snippet.

    eryksun commented 4 years ago

    > As far as I know os.stat() resets d.stat() maybe should be added some option to d.stat() to force update(). d.stat(nt_force_update=True).

    It depends on the filesystem. NTFS will update the directory entry as soon as the link is accessed by CreateFileW. But that's relatively expensive, and actually one of the more expensive steps in an os.stat call.

    > I am not sure if os.path.getmtime() can reset d.stat().

    genericpath.getmtime calls os.stat:

    https://github.com/python/cpython/blob/d0981e61a5869c48e0a70a512967558391272a93/Lib/genericpath.py#L53

    lexists, exists, getctime, getatime, getmtime, getsize, isdir, and isfile could be modified to call WinAPI GetFileAttributesExW [1], which is implemented via NtQueryFullAttributesFile [2], an optimized system call to get a file's network-open information. This can be significantly faster than the sequence of system calls that are required by os.stat. Note that this does not update the NTFS directory entry for the accessed link, unlike CreateFileW, but it does return updated information.

    The GetFileAttributesExW result would be used if the call succeeds and the file isn't a reparse point. Otherwise fall back on os.stat (win32_xstat_impl). If passed an fd, try GetFileInformationByHandleEx to get the FileBasicInfo and FileStandardInfo, or use a single system call via NTAPI NtQueryInformationFile: FileNetworkOpenInformation, which is the same info that GetFileAttributesExW returns.

    This could be implemented in C as nt._basic_stat(filename, follow_symlinks=True), where follow_symlinks means the expanded set of Windows name-surrogate reparse points. The C implementation would fall back on win32_xstat_impl. Note that a basic stat would not guarantee to return the following fields: st_ino, st_dev, and st_nlink.

    Alternatively, it could be implemented as a keyword-only basic=True option for os.stat, which would be ignored by POSIX. This way the high-level functions could continue to have a common implementation in genericpath.py.
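To illustrate the idea from user code, here is a rough ctypes sketch of a "basic" last-write-time query through GetFileAttributesExW, falling back on os.stat() for reparse points, failed calls, and non-Windows platforms. It is not the proposed nt._basic_stat; `basic_mtime` and `filetime_to_unix` are hypothetical names, and the structures are declared by hand rather than taken from CPython.

```python
import ctypes
import os
import sys

# Difference between the Windows epoch (1601) and the Unix epoch, in 100 ns units.
_EPOCH_DIFF_100NS = 116444736000000000
_GET_FILE_EX_INFO_STANDARD = 0
_FILE_ATTRIBUTE_REPARSE_POINT = 0x400


class _FILETIME(ctypes.Structure):
    _fields_ = [("low", ctypes.c_uint32), ("high", ctypes.c_uint32)]


class _WIN32_FILE_ATTRIBUTE_DATA(ctypes.Structure):
    _fields_ = [
        ("dwFileAttributes", ctypes.c_uint32),
        ("ftCreationTime", _FILETIME),
        ("ftLastAccessTime", _FILETIME),
        ("ftLastWriteTime", _FILETIME),
        ("nFileSizeHigh", ctypes.c_uint32),
        ("nFileSizeLow", ctypes.c_uint32),
    ]


def filetime_to_unix(low, high):
    """Convert a FILETIME given as two 32-bit halves to Unix seconds."""
    return (((high << 32) | low) - _EPOCH_DIFF_100NS) / 10_000_000


def basic_mtime(path):
    """Return the last-write time of `path` without a full stat where possible."""
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        get_attrs = kernel32.GetFileAttributesExW
        get_attrs.argtypes = [
            ctypes.c_wchar_p,
            ctypes.c_int,
            ctypes.POINTER(_WIN32_FILE_ATTRIBUTE_DATA),
        ]
        get_attrs.restype = ctypes.c_int
        data = _WIN32_FILE_ATTRIBUTE_DATA()
        ok = get_attrs(os.fsdecode(path), _GET_FILE_EX_INFO_STANDARD,
                       ctypes.byref(data))
        if ok and not data.dwFileAttributes & _FILE_ATTRIBUTE_REPARSE_POINT:
            ft = data.ftLastWriteTime
            return filetime_to_unix(ft.low, ft.high)
    # Non-Windows, failed call, or reparse point: use the full stat.
    return os.stat(path).st_mtime
```

Only the modification time is returned here; a fuller version would also map the attributes and size, and, as noted above, could not provide st_ino, st_dev, or st_nlink.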

    [1] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfileattributesexw
    [2] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-zwqueryfullattributesfile