DanielYang59 opened this issue 1 month ago
Can you test with the same Python versions please? Also, could you avoid putting a loop inside the statement to test, and rather use `number=10000` for instance?
Thanks for the quick response @picnixz, I just updated the test results with exactly Python 3.12.5 :)
> Also, could you avoid putting a loop inside the statement to test and rather use `number=10000` for instance?
That was deliberate (I wanted to rule out the import time of `os`). Is there any pitfall to having a loop inside the test statement?
> want to rule out the import time of `os`
You can rule out the import time by using `setup='import os'`.
> is there any pitfall for having a loop inside the test statement
Generally, no, but I'm not sure whether the garbage collector could do something in between, hence the question.
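The suggestion above (put the import in `setup`, which runs once and is not timed, and let `timeit` handle the repetition via `number` instead of a loop inside the timed statement) can be sketched like this. The temporary file is just a stand-in target, not the file from the original benchmark:

```python
import timeit

# setup runs once (untimed): import os and create a small target file.
setup = (
    "import os, tempfile\n"
    "f = tempfile.NamedTemporaryFile(delete=False)\n"
    "f.write(b'x' * 100)\n"
    "f.close()\n"
    "path = f.name"
)

# The timed statement is a single getsize call; timeit repeats it
# number=10000 times, so no loop is needed inside the statement.
t = timeit.timeit("os.path.getsize(path)", setup=setup, number=10000)
print(f"{t:.4f} s for 10,000 calls")
```

Keeping the timed statement to a single call also avoids the question of what else (e.g. the garbage collector) might run inside a hand-written loop.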
From my experience, Windows is generally slower when doing OS-related operations, so I'm not that shocked. Let's ask an expert on this topic: @zooba
I just observed that `os.path.getsize` simply calls `os.stat` and then reads the corresponding field. So the problem (if any) is the slowness of `os.stat`.
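That relationship is easy to confirm: in CPython, `genericpath.getsize` just returns `os.stat(filename).st_size`, so the two calls below always agree (the temporary file is only an example target):

```python
import os
import tempfile

# Create a 5-byte file; the context manager closes (and flushes) it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

# getsize is a thin wrapper around stat: both report st_size.
print(os.path.getsize(path), os.stat(path).st_size)  # 5 5
```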
> You can rule out the import time by using `setup='import os'`.
Thanks a lot for the input, both the script and the results have been updated.
> From my experience, Windows is generally slower when doing OS-related operations, so I'm not that shocked.
I thought Windows was just "slightly" slower, so I was pretty surprised to see such a big gap.
Also, calling `getsize` a million times is perhaps a rare use case. In my case I was just trying to create a test file of a specific size, and found the following code taking forever on my Windows machine:
```python
import os

line_number = 0
with open(file_path, "w", encoding="utf-8", newline="") as f:
    while os.path.getsize(file_path) < target_size:
        f.write(f"This is line number {line_number}\n")
        line_number += 1
        f.flush()  # flush so getsize sees the bytes written so far
```
In my case, it was much faster to estimate a total number of lines up front and avoid calling `getsize` after writing each line :)
I'd be interested in knowing whether it is really `os.stat` that is 30x slower on Windows or not. For your specific use case, create a `bytes` object of the number of bytes you want, fill it with whatever you want, and write it in binary mode; you should then have a file of the exact size.
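A sketch of that approach, with the filler line and target size as arbitrary assumptions: build the payload in memory, truncate it to the target length, and write it with a single binary write, so no per-line `getsize` calls are needed at all.

```python
import os
import tempfile

target_size = 1_000_000  # bytes; arbitrary example

# Repeat the filler line enough times to cover target_size,
# then slice to the exact byte count.
line = b"This is a line of filler text\n"
data = (line * (target_size // len(line) + 1))[:target_size]

# One binary write produces a file of exactly target_size bytes.
path = os.path.join(tempfile.gettempdir(), "exact_size.bin")
with open(path, "wb") as f:
    f.write(data)

print(os.path.getsize(path))  # 1000000
```

Checking `len(data)` in memory is free, and the file's size is known before it is even written.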
> For your specific use case, create a `bytes` object of the number of bytes you want
Yep, solid idea! I assume it would be much faster to check the size of the bytes object than the file :)
Please try running on a Dev Drive to compare. It's not quite free of the issues that make Windows have slower I/O than Linux, but it's significantly better than using your default OS drive.
I'd also be interested to know exactly which build of Windows you're running. One recent update includes a new API for getting file metadata that is implemented more like Linux (it doesn't require opening the file first, which Windows traditionally does). Python 3.12 should use the new API automatically, and some measurements have shown that it runs 3-4x faster than the old one.
But overall, the slow file system is an OS issue, probably not a Python issue. To see a Python issue, you'll need to do native profiling of Python itself and show that we're somehow going through significantly more of our own code on one OS than another. Simple timings of OS operations are not really comparable in that way.
Hi @zooba thanks a lot for the input and detailed explanation!
> Please try running on a Dev Drive to compare.
It's indeed much faster (17 sec vs 31 sec).
> I'd also be interested to know exactly which build of Windows you're running.
It should be Version 23H2, OS build 22631.4169.
> But overall, the slow file system is an OS issue, probably not a Python issue. To see a Python issue, you'll need to do native profiling of Python itself and show that we're somehow going through significantly more of our own code on one OS than another. Simple timings of OS operations are not really comparable in that way.
Fully understandable, thanks a lot for the input. But as an end user I don't quite know how to properly profile Python, so I opened this issue in case you can do something :)
> But as an end user I don't quite know how to properly profile Python, so I opened this issue in case you can do something :)
At least on Windows, the approach is to use Windows Performance Recorder to capture a trace and then Windows Performance Analyzer to attribute the CPU time to either one of Python's native modules (you won't get Python-specific information in there yet, but I'll be releasing a tool soon to help with that) or an OS module.
It's quite a specialized job, I'll be honest! But there are people out there who know how to do it, and may also have the time and interest to see what's up (not me, right now).
> It should be Version 23H2, OS build 22631.4169.
This doesn't have the new API in it, so you're getting the Dev Drive accelerated time, but not the improved stat calls. I believe Insider builds should have it already.
For reference, I just ran `python3.12 -m timeit -n 100 -s "import os" "sum(os.path.getsize(s) for s in os.scandir(r'C:\Windows\System32'))"` on a 22631 build and an unreleased 26100 build (both with the Store install of 3.12.6) and got 178 ms vs 57.8 ms. So the new API should provide a 2-3x speedup on this operation, and that should stack on top of the Dev Drive benefit (though I suspect part of the benefit comes from bypassing the same drivers that Dev Drives disable, so it may not be a straight (1.5-2x) x (2-3x) = (3-6x) calculation).
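A rough cross-platform equivalent of that command, pointed at the current directory instead of `C:\Windows\System32` (the directory and call count are placeholders):

```python
import timeit

# Sum file sizes for every entry in a directory; DirEntry objects are
# path-like, so os.path.getsize accepts them directly.
stmt = "sum(os.path.getsize(e) for e in os.scandir('.'))"
t = timeit.timeit(stmt, setup="import os", number=100)
print(f"{t * 1000:.1f} ms for 100 scans")
```

Note that on Windows, `e.stat().st_size` would reuse the metadata that `os.scandir` already fetched during directory listing, avoiding a second system call per file, so `getsize` per entry is close to a worst case for this comparison.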
Bug report
Bug description:
Summary
I noticed `os.path.getsize` runs much slower (38x) on Windows 11 than on Ubuntu 22.04 (WSL2) and macOS Sonoma 14.6.1. Windows 11 and Ubuntu 22.04 WSL2 were running on the same physical machine and the same SSD, both tested while idle.

Test code
Test results
On Windows 11 (Version: 23H2, OS build: 22631.4169):
Windows 11 (dev drive):
On Ubuntu 22.04 WSL2:
On MacOS 14.6:
CPython versions tested on:
3.12
Operating systems tested on:
Windows