haskell / filepath

Haskell FilePath core library
BSD 3-Clause "New" or "Revised" License
66 stars 33 forks source link

Performance improvements, up to 20x faster #183

Closed Bodigrim closed 1 year ago

Bodigrim commented 1 year ago

In one of my projects I've been trying to migrate from home-grown low-level helpers to filepath API (e. g., takeExtension fn == ".foo" instead of ".foo" `B.isSuffixOf` fn) but failed miserably: performance penalties were too harsh, especially on Windows.

I fixed some low-hanging fruits, and I think results are quite good: many functions are twice faster now, and System.OsPath on Windows is often 10x-20x faster. Here is a detailed report:

filepath (string)
  splitExtension (posix):
    2.01 μs ±  62 ns, 40% less than baseline
  splitExtension (windows):
    2.02 μs ±  51 ns, 76% less than baseline
  takeExtension (posix):
    643  ns ±  13 ns, 10% less than baseline
  takeExtension (windows):
    836  ns ±  26 ns, 84% less than baseline
  replaceExtension (posix):
    778  ns ±  31 ns, 27% less than baseline
  replaceExtension (windows):
    795  ns ±  19 ns, 25% less than baseline
  dropExtension (posix):
    2.01 μs ±  55 ns, 39% less than baseline
  dropExtension (windows):
    2.01 μs ±  61 ns, 75% less than baseline
  addExtension (posix):
    887  ns ±  26 ns,  5% less than baseline
  addExtension (windows):
    935  ns ±  30 ns,       same as baseline
  hasExtension (posix):
    638  ns ±  13 ns,       same as baseline
  hasExtension (windows):
    1.46 μs ±  27 ns, 76% less than baseline
  splitExtensions (posix):
    2.05 μs ±  58 ns, 38% less than baseline
  splitExtensions (windows):
    5.25 μs ± 113 ns, 46% less than baseline
  dropExtensions (posix):
    2.03 μs ±  57 ns, 38% less than baseline
  dropExtensions (windows):
    3.92 μs ± 105 ns, 53% less than baseline
  takeExtensions (posix):
    683  ns ±  27 ns,       same as baseline
  takeExtensions (windows):
    3.85 μs ± 129 ns, 37% less than baseline
  replaceExtensions (posix):
    661  ns ±  16 ns, 35% less than baseline
  replaceExtensions (windows):
    763  ns ±  20 ns, 26% less than baseline
  isExtensionOf (posix):
    705  ns ±  15 ns,       same as baseline
  isExtensionOf (windows):
    3.89 μs ± 111 ns, 37% less than baseline
  stripExtension (posix):
    593  ns ±  14 ns,       same as baseline
  stripExtension (windows):
    582  ns ±  13 ns,  3% less than baseline
  splitFileName (posix):
    1.26 μs ±  27 ns, 52% less than baseline
  splitFileName (windows):
    1.58 μs ±  53 ns, 74% less than baseline
  takeFileName (posix):
    689  ns ±  18 ns,       same as baseline
  takeFileName (windows):
    1.43 μs ±  55 ns, 76% less than baseline
  replaceFileName (posix):
    513  ns ±  13 ns, 43% less than baseline
  replaceFileName (windows):
    757  ns ±  14 ns, 10% less than baseline
  dropFileName (posix):
    1.29 μs ±  26 ns, 30% less than baseline
  dropFileName (windows):
    885  ns ±  29 ns, 59% less than baseline
  takeBaseName (posix):
    695  ns ±  13 ns, 18% less than baseline
  takeBaseName (windows):
    2.97 μs ± 103 ns, 78% less than baseline
  replaceBaseName (posix):
    437  ns ±  15 ns, 59% less than baseline
  replaceBaseName (windows):
    794  ns ±  27 ns,  9% less than baseline
  takeDirectory (posix):
    2.71 μs ±  52 ns, 22% less than baseline
  takeDirectory (windows):
    1.10 μs ±  39 ns, 49% less than baseline
  replaceDirectory (posix):
    1.91 μs ±  65 ns,  7% less than baseline
  replaceDirectory (windows):
    1.93 μs ±  61 ns,  9% less than baseline
  combine (posix):
    654  ns ±  16 ns, 28% less than baseline
  combine (windows):
    924  ns ±  26 ns,       same as baseline
  splitPath (posix):
    6.99 μs ± 215 ns,       same as baseline
  splitPath (windows):
    6.27 μs ± 205 ns,       same as baseline
  joinPath (posix):
    2.16 μs ±  42 ns,       same as baseline
  joinPath (windows):
    933  ns ±  28 ns,       same as baseline
  splitDirectories (posix):
    9.32 μs ± 235 ns,  5% less than baseline
  splitDirectories (windows):
    7.56 μs ± 206 ns,       same as baseline
  splitDrive (posix):
    785  ns ±  26 ns, 14% less than baseline
  splitDrive (windows):
    576  ns ±  18 ns, 10% less than baseline
  joinDrive (posix):
    722  ns ±  19 ns, 21% less than baseline
  joinDrive (windows):
    926  ns ±  26 ns,       same as baseline
  takeDrive (posix):
    22.9 ns ± 820 ps,       same as baseline
  takeDrive (windows):
    10.7 ns ± 236 ps,  3% less than baseline
  hasDrive (posix):
    9.78 ns ± 202 ps,       same as baseline
  hasDrive (windows):
    8.35 ns ± 298 ps,       same as baseline
  dropDrive (posix):
    418  ns ±  16 ns, 49% less than baseline
  dropDrive (windows):
    520  ns ± 9.0 ns, 36% less than baseline
  isDrive (posix):
    20.3 ns ± 442 ps,       same as baseline
  isDrive (windows):
    8.60 ns ± 248 ps,       same as baseline
  hasTrailingPathSeparator (posix):
    309  ns ± 7.3 ns,       same as baseline
  hasTrailingPathSeparator (windows):
    316  ns ± 6.7 ns,       same as baseline
  addTrailingPathSeparator (posix):
    1.89 μs ±  70 ns,       same as baseline
  addTrailingPathSeparator (windows):
    1.90 μs ±  67 ns,       same as baseline
  dropTrailingPathSeparator (posix):
    1.90 μs ±  69 ns,       same as baseline
  dropTrailingPathSeparator (windows):
    1.91 μs ±  67 ns,       same as baseline
  normalise (posix):
    13.7 μs ± 499 ns,       same as baseline
  normalise (windows):
    8.77 μs ± 229 ns,       same as baseline
  equalFilePath (posix):
    13.6 μs ± 412 ns,       same as baseline
  equalFilePath (windows):
    8.76 μs ± 214 ns,       same as baseline
  makeRelative (posix):
    14.5 μs ± 417 ns,       same as baseline
  makeRelative (windows):
    24.5 μs ± 864 ns,       same as baseline
  isRelative (posix):
    21.1 ns ± 454 ps,       same as baseline
  isRelative (windows):
    8.42 ns ± 242 ps,       same as baseline
  isAbsolute (posix):
    21.2 ns ± 800 ps,       same as baseline
  isAbsolute (windows):
    8.65 ns ± 260 ps,       same as baseline
  isValid (posix):
    876  ns ±  30 ns,       same as baseline
  isValid (windows):
    877  ns ±  27 ns,       same as baseline
  makeValid (posix):
    2.47 μs ±  87 ns,       same as baseline
  makeValid (windows):
    2.50 μs ±  58 ns,       same as baseline
  splitSearchPath (posix):
    5.39 μs ± 210 ns,       same as baseline
  splitSearchPath (windows):
    4.80 μs ± 112 ns,       same as baseline
filepath (AFPP)
  splitExtension (posix):
    108  ns ± 2.9 ns, 49% less than baseline
  splitExtension (windows):
    364  ns ±  14 ns, 89% less than baseline
  takeExtension (posix):
    62.0 ns ± 1.8 ns, 37% less than baseline
  takeExtension (windows):
    292  ns ± 7.7 ns, 90% less than baseline
  replaceExtension (posix):
    65.9 ns ± 2.5 ns, 28% less than baseline
  replaceExtension (windows):
    198  ns ± 7.1 ns, 28% less than baseline
  dropExtension (posix):
    81.7 ns ± 1.6 ns, 55% less than baseline
  dropExtension (windows):
    341  ns ±  13 ns, 89% less than baseline
  addExtension (posix):
    93.1 ns ± 2.6 ns,  3% less than baseline
  addExtension (windows):
    167  ns ± 3.4 ns,       same as baseline
  hasExtension (posix):
    59.2 ns ± 590 ps, 54% less than baseline
  hasExtension (windows):
    289  ns ± 6.7 ns, 91% less than baseline
  splitExtensions (posix):
    122  ns ± 3.3 ns, 40% less than baseline
  splitExtensions (windows):
    403  ns ±  13 ns, 87% less than baseline
  dropExtensions (posix):
    97.2 ns ± 3.7 ns, 44% less than baseline
  dropExtensions (windows):
    381  ns ±  14 ns, 88% less than baseline
  takeExtensions (posix):
    47.4 ns ± 1.7 ns, 43% less than baseline
  takeExtensions (windows):
    321  ns ± 7.0 ns, 89% less than baseline
  replaceExtensions (posix):
    68.4 ns ± 2.5 ns,       same as baseline
  replaceExtensions (windows):
    228  ns ± 7.0 ns, 11% less than baseline
  isExtensionOf (posix):
    53.2 ns ± 1.9 ns, 39% less than baseline
  isExtensionOf (windows):
    338  ns ± 7.9 ns, 89% less than baseline
  stripExtension (posix):
    14.4 ns ± 366 ps,       same as baseline
  stripExtension (windows):
    24.0 ns ± 656 ps,       same as baseline
  splitFileName (posix):
    63.6 ns ± 2.1 ns, 54% less than baseline
  splitFileName (windows):
    302  ns ± 6.7 ns, 90% less than baseline
  takeFileName (posix):
    66.6 ns ± 2.4 ns, 48% less than baseline
  takeFileName (windows):
    288  ns ± 6.5 ns, 91% less than baseline
  replaceFileName (posix):
    7.02 ns ± 246 ps,       same as baseline
  replaceFileName (windows):
    155  ns ± 3.6 ns, 94% less than baseline
  dropFileName (posix):
    51.5 ns ± 784 ps, 55% less than baseline
  dropFileName (windows):
    288  ns ± 7.3 ns, 91% less than baseline
  takeBaseName (posix):
    84.8 ns ± 1.7 ns, 57% less than baseline
  takeBaseName (windows):
    377  ns ± 7.1 ns, 89% less than baseline
  replaceBaseName (posix):
    27.6 ns ± 1.0 ns, 64% less than baseline
  replaceBaseName (windows):
    107  ns ± 4.0 ns, 48% less than baseline
  takeDirectory (posix):
    117  ns ± 3.5 ns, 39% less than baseline
  takeDirectory (windows):
    552  ns ±  13 ns, 91% less than baseline
  replaceDirectory (posix):
    1.11 μs ±  17 ns,       same as baseline
  replaceDirectory (windows):
    1.33 μs ±  29 ns,       same as baseline
  combine (posix):
    7.09 ns ± 204 ps,       same as baseline
  combine (windows):
    155  ns ± 3.6 ns, 94% less than baseline
  splitPath (posix):
    2.20 μs ±  52 ns,       same as baseline
  splitPath (windows):
    5.99 μs ± 219 ns, 31% less than baseline
  joinPath (posix):
    834  ns ±  27 ns,       same as baseline
  joinPath (windows):
    30.1 μs ± 1.0 μs, 27% less than baseline
  splitDirectories (posix):
    3.32 μs ± 105 ns,       same as baseline
  splitDirectories (windows):
    10.0 μs ± 209 ns, 25% less than baseline
  splitDrive (posix):
    67.1 ns ± 1.8 ns,       same as baseline
  splitDrive (windows):
    226  ns ± 7.0 ns, 92% less than baseline
  joinDrive (posix):
    37.0 ns ± 902 ps,       same as baseline
  joinDrive (windows):
    73.4 ns ± 1.8 ns,       same as baseline
  takeDrive (posix):
    21.5 ns ± 454 ps,       same as baseline
  takeDrive (windows):
    143  ns ± 3.9 ns, 95% less than baseline
  hasDrive (posix):
    21.8 ns ± 864 ps,       same as baseline
  hasDrive (windows):
    143  ns ± 3.6 ns, 95% less than baseline
  dropDrive (posix):
    43.0 ns ± 994 ps,       same as baseline
  dropDrive (windows):
    184  ns ± 6.7 ns, 93% less than baseline
  isDrive (posix):
    41.1 ns ± 1.1 ns,       same as baseline
  isDrive (windows):
    186  ns ± 6.9 ns, 93% less than baseline
  hasTrailingPathSeparator (posix):
    5.45 ns ± 212 ps,       same as baseline
  hasTrailingPathSeparator (windows):
    6.34 ns ± 204 ps,       same as baseline
  addTrailingPathSeparator (posix):
    36.7 ns ± 1.0 ns,       same as baseline
  addTrailingPathSeparator (windows):
    101  ns ± 668 ps,       same as baseline
  dropTrailingPathSeparator (posix):
    49.1 ns ± 1.7 ns,       same as baseline
  dropTrailingPathSeparator (windows):
    98.6 ns ± 3.2 ns,       same as baseline
  normalise (posix):
    5.50 μs ± 211 ns,       same as baseline
  normalise (windows):
    39.2 μs ± 538 ns, 27% less than baseline
  equalFilePath (posix):
    5.96 μs ± 208 ns,       same as baseline
  equalFilePath (windows):
    51.1 μs ± 1.7 μs, 21% less than baseline
  makeRelative (posix):
    6.03 μs ± 223 ns,       same as baseline
  makeRelative (windows):
    51.6 μs ± 1.6 μs, 24% less than baseline
  isRelative (posix):
    22.2 ns ± 834 ps, 16% less than baseline
  isRelative (windows):
    239  ns ± 6.5 ns, 92% less than baseline
  isAbsolute (posix):
    22.3 ns ± 800 ps, 15% less than baseline
  isAbsolute (windows):
    241  ns ± 6.5 ns, 92% less than baseline
  isValid (posix):
    15.1 ns ± 496 ps,       same as baseline
  isValid (windows):
    33.0 μs ± 819 ns,       same as baseline
  makeValid (posix):
    2.48 μs ±  69 ns,       same as baseline
  makeValid (windows):
    61.9 μs ± 1.9 μs, 19% less than baseline
  splitSearchPath (posix):
    1.59 μs ±  34 ns,       same as baseline
  splitSearchPath (windows):
    4.59 μs ± 103 ns,       same as baseline
hasufell commented 1 year ago

Excellent.

hasufell commented 1 year ago

What I wonder is whether some of this can be backported to the string-only filepath or whether the new API introduced performance regressions (a few functions were slightly rewritten).

hasufell commented 1 year ago

I get compilation error on 9.2.6:

/x86_64-linux/ghc-9.2.6/filepath-1.4.100.0/t/abstract-filepath/build/abstract-filepath/abstract-filepath-tmp/OsPathSpec.dyn_o )

tests/abstract-filepath/OsPathSpec.hs:56:64: error: [-Wincomplete-uni-patterns, -Werror=incomplete-uni-patterns]
    Pattern match(es) are non-exhaustive
    In a lambda abstraction:
        Patterns of type ‘Either EncodingException String’ not matched:
            Left _
   |
56 |     property $ \(padEven -> bs) -> (Posix.encodeWith ucs2le . (\(Right r) -> r) . Posix.decodeWith ucs2le . OS.PS . toShort) bs === Right (OS.PS . toShort $ bs))
   |                                                                ^^^^^^^^^^^^^^^

tests/abstract-filepath/OsPathSpec.hs:58:66: error: [-Wincomplete-uni-patterns, -Werror=incomplete-uni-patterns]
    Pattern match(es) are non-exhaustive
    In a lambda abstraction:
        Patterns of type ‘Either EncodingException String’ not matched:
            Left _
   |
58 |     property $ \(padEven -> bs) -> (Windows.encodeWith ucs2le . (\(Right r) -> r) . Windows.decodeWith ucs2le . OS.WS . toShort) bs
   |                                                                  ^^^^^^^^^^^^^^^
Error: cabal: Failed to build test:abstract-filepath from filepath-1.4.100.0.
Bodigrim commented 1 year ago

What I wonder is whether some of this can be backported to the string-only filepath or whether the new API introduced performance regressions (a few functions were slightly rewritten).

The changes improve performance both for the legacy API and for the new one.

I get compilation error on 9.2.6:

It's a warning for me, and present in the master branch as well.

hasufell commented 1 year ago

If you rebase against master and then push to this repo instead of your fork, we can benefit from the fixed CI.

Bodigrim commented 1 year ago

@hasufell any more comments / suggestions?

hasufell commented 1 year ago

https://hackage.haskell.org/package/filepath-1.4.100.2

hasufell commented 1 year ago

@Bodigrim we have a regression:

With this patch (1.4.100.2):

> W.splitFileName  "\\\\?\\A:\\fred"
("\\\\?\\A:\\fred", "")

With 1.4.100.0:

> W.splitFileName  "\\\\?\\A:\\fred"
("\\\\?\\A:\\", "fred")

This was by chance found by a property test (they are rather unreliable): https://github.com/haskell/filepath/actions/runs/4263893204/jobs/7421218609#step:5:1112

However, W.splitDrive match.

sternenseemann commented 6 months ago

@hasufell That regression was fixed in https://github.com/haskell/filepath/commit/6a1c98dc5eb8021107d59903bbbb60137a6f9dbd, correct?

I noticed at least one other change in behavior which I discovered by chance in the filepath-bytestring equivalency test suite. It appears e.g. when comparing filepath-1.4.100.4 (GHC 9.8.1) to filepath-1.4.2.2 (GHC 9.4.8):

$ ghci-9.4.8
GHCi, version 9.4.8: https://www.haskell.org/ghc/  :? for help
ghci> import qualified System.FilePath.Windows as W
ghci> W.takeDirectory "//?/a:"
"//?/a:"
$ ghci-9.8.1
GHCi, version 9.8.1: https://www.haskell.org/ghc/  :? for help
ghci> W.takeDirectory "//?/a:"
"//?/"

Can you comment whether this is also to be considered a regression?

hasufell commented 6 months ago

//?/ rarely behaves correctly, but yes, it's a regression.