msys2 / msys2-pacman

A friendly fork of https://gitlab.archlinux.org/pacman/pacman
GNU General Public License v2.0

"checking available disk space" is slow #32

Open lazka opened 7 months ago

lazka commented 7 months ago

See https://github.com/msys2/MSYS2-packages/issues/4176

We should look into why it is potentially slow.

lazka commented 7 months ago

Here is the code: https://github.com/msys2/msys2-pacman/blob/4cfaf53950c1e2bbef7262e2e9b608f4f5a280d5/lib/libalpm/diskspace.c#L421

mati865 commented 4 months ago

I've tested it with MSYS2 installed on an ST1000DX002 drive and reproduced the problem (D:/msys64 was added to the MS Defender exclusion list). "checking available disk space" took a long time and I could hear the disk head doing a lot of work (100% disk usage in Task Manager). In Performance Monitor, System was attributed most of the disk usage.

I did the same experiment with MSYS2 extracted to a Windows 11 Dev Drive created as a VHDX on the same drive. This time "checking available disk space" was so fast I almost missed the moment it started and finished, despite not adding the MSYS2 directory to the MS Defender exclusion list. So it's either the combination of Cygwin and NTFS, or MS Defender.

So I ran more tests and here are the results:

real 0m34.083s user 0m2.984s sys 0m12.843s

MSYS2 located on Dev Drive VHDX on NTFS partition on HDD (with MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 0m32.008s user 0m3.451s sys 0m13.723s

MSYS2 located on NTFS partition on HDD (without MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 1m26.384s user 0m3.749s sys 0m43.123s

MSYS2 located on NTFS partition on HDD (with MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 1m45.756s user 0m3.920s sys 0m42.916s # not a mistake, ran the test again (see below); checking available disk space was slow

real 1m30.206s user 0m3.968s sys 0m43.593s # no idea what is going on; checking available disk space wasn't slow but not super fast either

- 980 PRO 2TB (about 60% filled):

MSYS2 located on Dev Drive VHDX on NTFS partition on NVMe (without MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 0m27.353s user 0m3.046s sys 0m13.015s

MSYS2 located on Dev Drive VHDX on NTFS partition on NVMe (with MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 0m26.005s user 0m3.015s sys 0m13.484s

MSYS2 located on NTFS partition on NVMe (without MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 0m47,567s user 0m3,295s sys 0m22,402s

MSYS2 located on NTFS partition on NVMe (with MS Defender exclusion)

$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real 0m43,201s user 0m3,625s sys 0m24,155s



Before each install I made sure the archives were cached with `pacman -Sw mingw-w64-ucrt-x86_64-toolchain --noconfirm` and that `mingw-w64-ucrt-x86_64-toolchain` was not installed.

I think my words "NTFS is the worst main FS used by modern OS" have been proved (at least in this case). I also find it surprising how little performance was gained on Dev Drive with NVMe vs HDD; maybe buffering in RAM played a significant role?
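
To put numbers on that comparison, here is a small Python sketch that converts the `real` times above into seconds and computes slowdown ratios. It only uses the runs labelled "with MS Defender exclusion", and for the HDD/NTFS case it takes the first of the two runs. The names `REAL_TIMES`, `parse_time` and `ratio` are made up for this illustration.

```python
import re

# "real" times copied from the runs above (with MS Defender exclusion).
# HDD/NTFS uses the first of the two recorded runs.
REAL_TIMES = {
    ("HDD", "Dev Drive"): "0m32.008s",
    ("HDD", "NTFS"): "1m45.756s",
    ("NVMe", "Dev Drive"): "0m26.005s",
    ("NVMe", "NTFS"): "0m43,201s",  # comma decimal separator, as printed
}


def parse_time(text):
    """Convert bash `time` output like '1m26.384s' or '0m47,567s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


def ratio(slow, fast):
    """How many times longer the `slow` configuration took than `fast`."""
    return parse_time(REAL_TIMES[slow]) / parse_time(REAL_TIMES[fast])


if __name__ == "__main__":
    pairs = [
        (("HDD", "NTFS"), ("HDD", "Dev Drive")),
        (("NVMe", "NTFS"), ("NVMe", "Dev Drive")),
        (("HDD", "Dev Drive"), ("NVMe", "Dev Drive")),
    ]
    for slow, fast in pairs:
        print(f"{slow} vs {fast}: {ratio(slow, fast):.2f}x")
```

These are single runs, so the ratios are only a rough indication; the two HDD/NTFS runs already differ by about 15 seconds.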

jeremyd2019 commented 7 hours ago

Did we ever attempt to prove why it's so slow? My understanding was that it was the Cygwin stat call, but from looking at the code, the only stat is https://github.com/msys2/msys2-pacman/blob/2eabe53cc265d1a8c86e6621b40ca7250747d7bb/lib/libalpm/diskspace.c#L251 and I think that should only be invoked when a package is already installed. Otherwise, it just adds up the sizes from the packages' file lists (grouped by mount point) and uses statvfs to see if there's enough free space. That should be quick.
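
For readers who have not looked at the code, the logic described above can be sketched roughly as follows. This is a simplified Python illustration of the idea, not libalpm's actual implementation: all function names are hypothetical, sizes are in bytes rather than filesystem blocks, and the case of an already-installed package (where the stat call comes in) is left out. `os.statvfs` is only available on POSIX-like platforms.

```python
import os
from collections import defaultdict


def mount_point_for(path, mount_points):
    """Return the longest mount point that contains `path`, or None."""
    best = None
    for mp in mount_points:
        prefix = mp.rstrip("/") + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best):
                best = mp
    return best


def needed_bytes_per_mount(files, mount_points):
    """Sum file sizes per mount point.

    `files` is an iterable of (absolute_path, size_in_bytes) pairs, standing
    in for the file lists of the packages to be installed.
    """
    needed = defaultdict(int)
    for path, size in files:
        mp = mount_point_for(path, mount_points)
        if mp is None:
            raise ValueError(f"no mount point found for {path}")
        needed[mp] += size
    return dict(needed)


def free_bytes_statvfs(mount_point):
    """Free space available to unprivileged users, via statvfs (POSIX only)."""
    st = os.statvfs(mount_point)
    return st.f_bavail * st.f_frsize


def mounts_lacking_space(files, mount_points, free_bytes=free_bytes_statvfs):
    """Return the mount points whose free space is below what is needed."""
    needed = needed_bytes_per_mount(files, mount_points)
    return sorted(mp for mp, n in needed.items() if n > free_bytes(mp))
```

In this sketch the filesystem is queried once per mount point rather than once per file, which is why a check of this shape would be expected to finish quickly. The `free_bytes` parameter exists only so the logic can be exercised without touching a real filesystem, for example `mounts_lacking_space(files, ["/", "/opt"], free_bytes=lambda mp: 120)`.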