Closed jamesdbrock closed 2 years ago
digit 10000 |
---|
runParser many digit 10000
mean = 6.62 ms stddev = 7.42 ms min = 4.53 ms max = 53.91 ms |
runParser Array.many digit 10000
mean = 65.46 ms stddev = 6.48 ms min = 59.97 ms max = 83.47 ms |
runParser manyArray digit 10000
mean = 5.57 ms stddev = 1.85 ms min = 4.97 ms max = 17.70 ms |
StringParser manyRec CodePoints.anyDigit 10000
mean = 6.96 ms stddev = 4.69 ms min = 4.57 ms max = 21.94 ms |
StringParser manyRec CodeUnits.anyDigit 10000
mean = 5.05 ms stddev = 389.23 μs min = 4.52 ms max = 8.14 ms |
Regex.match \d* 10000
mean = 369.43 μs stddev = 140.75 μs min = 329.89 μs max = 1.77 ms |
many anyChar 10000 |
runParser many anyChar 10000
mean = 6.73 ms stddev = 558.41 μs min = 6.15 ms max = 10.17 ms |
runParser Array.many anyChar 10000
mean = 67.00 ms stddev = 7.13 ms min = 59.64 ms max = 89.04 ms |
runParser manyArray anyChar 10000
mean = 5.78 ms stddev = 1.41 ms min = 5.35 ms max = 14.63 ms |
skipMany anyChar 10000 |
runParser skipMany anyChar 10000
mean = 3.53 ms stddev = 468.27 μs min = 3.33 ms max = 6.28 ms |
Maybe a Combinators.Array
module would be good. We should probably have Array
variations of all of the List
combinators.
In my experience, using unsafe ST is not necessary, but regardless I wouldn’t recommend it. I originally did that in the language-cst parser, but it was “safe” because the internals of the parser were not exported. I switched to building a reversed List, then using Array.fromUnfoldable and Array.reverse. Both of those array operations are extremely cheap. Array.fromUnfoldable specifically builds an array with mutation and does not need to copy the final result. Array.reverse is much cheaper than List.reverse.
Interesting, thanks @natefaubion
To clarify, I did not notice any difference in performance when I tested the change in the CST parser. ST still has to allocate effect closures and binds at each step, which will be equivalent to allocating Cons cells.
Here is the benchmark for a manyFFI
function which completely cheats. This is a reference for a practical lower bound for the runtime for a many
combinator which exploits the fast V8 array push
.
manyFFI :: forall s m a. ParserT s m a -> ParserT s m (Array a)
manyFFI p = do
let result = [] :: Array a
flip tailRecM unit $ \_ -> alt
do
x <- try p
pure $ runFn2 unsafePush x result
pure (Loop unit)
do
pure (Done unit)
pure result
foreign import unsafePush :: forall a. Fn2 a (Array a) Unit
export const unsafePush = (x, xs) => { xs.push(x);}
many anyChar 10000 |
---|
runParser many anyChar 10000
mean = 6.58 ms stddev = 524.99 μs min = 6.09 ms max = 9.90 ms |
runParser Array.many anyChar 10000
mean = 64.95 ms stddev = 5.84 ms min = 60.42 ms max = 83.06 ms |
runParser manyArray anyChar 10000
mean = 5.99 ms stddev = 914.83 μs min = 5.33 ms max = 10.48 ms |
runParser manyFFI anyChar 10000
mean = 5.27 ms stddev = 538.36 μs min = 4.90 ms max = 8.01 ms |
Here is manyArray
implemented with Array.fromFoldable
and Array.reverse
as @natefaubion suggested. Really good speed. Maybe better than the unsafeLiftST
version.
manyArray :: forall s m a. ParserT s m a -> ParserT s m (Array a)
manyArray p = do
rlist <- flip tailRecM Nil $ \xs -> alt
do
x <- try p
pure (Loop (x:xs))
do
pure (Done xs)
pure $ Array.reverse $ Array.fromFoldable rlist
digit 10000 |
---|
many anyChar 10000 |
runParser many anyChar 10000
mean = 6.65 ms stddev = 574.12 μs min = 6.10 ms max = 10.26 ms |
runParser Array.many anyChar 10000
mean = 65.44 ms stddev = 7.05 ms min = 59.04 ms max = 84.52 ms |
runParser manyArray anyChar 10000
mean = 5.72 ms stddev = 1.75 ms min = 4.67 ms max = 15.57 ms |
runParser manyFFI anyChar 10000
mean = 5.33 ms stddev = 665.11 μs min = 4.89 ms max = 8.93 ms |
New module Parsing.Combinators.Array for fast combinator variations which return
Array
.See benchmarks in comment below.
Checklist: