Closed Harmos274 closed 2 years ago
I'm aware, haven't gotten around to debugging it yet!
Nice, thank you! Can't wait 😄
I was trying to read up on some potential performance bottlenecks. The docs mention the `tree_sitter_my_language_external_scanner_serialize` function, and particularly the state it needs to serialize.
This function: https://github.com/tree-sitter/tree-sitter-haskell/blob/master/src/scanner.cc#L1645-L1654
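For context, a minimal sketch (not the tree-sitter-haskell code) of what that serialize/deserialize pair has to do, for a hypothetical scanner whose only state is a stack of indentation levels. The function names follow the external scanner API; `Scanner` and `indents` are made up for the illustration:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical scanner state: just a stack of indentation levels.
struct Scanner {
  std::vector<uint16_t> indents;
};

extern "C" {

// Copy the scanner state into tree-sitter's buffer (limited to
// TREE_SITTER_SERIALIZATION_BUFFER_SIZE bytes) and return how many bytes were written.
unsigned tree_sitter_my_language_external_scanner_serialize(void *payload, char *buffer) {
  auto *scanner = static_cast<Scanner *>(payload);
  unsigned size = scanner->indents.size() * sizeof(uint16_t);
  std::memcpy(buffer, scanner->indents.data(), size);
  return size;
}

// Restore the state from a buffer previously produced by serialize.
void tree_sitter_my_language_external_scanner_deserialize(void *payload, const char *buffer, unsigned length) {
  auto *scanner = static_cast<Scanner *>(payload);
  scanner->indents.resize(length / sizeof(uint16_t));
  std::memcpy(scanner->indents.data(), buffer, length);
}

}
```

Since this pair runs every time the parser leaves and re-enters the external scanner, a large or expensive-to-copy state would show up as overhead, which is presumably why the docs single it out.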
I was also trying to look into ways to benchmark tree-sitter, but couldn't find anything. If you have any tips on how to do it (or resources), I would love to aggregate some data to try and identify cost centers.
yeah I think the state couldn't really be any simpler :slightly_frowning_face: my suspicion is that the external scanner shouldn't be invoked on every token, but changing that might not be possible.
I also haven't managed to find any useful debugging tools.
Hmmmm, damn. Yeah, I assumed the state would be simple. It's hard for me to imagine that something like typescript (which has good performance) would have simpler state than haskell.
Also I found these:
Perhaps the haskell grammar is having a harder time taking advantage of caching? Although that seems really unlikely to me.
Also, if it's helpful, the performance degradation only happens once I enter insert mode.
> Also, if it's helpful, the performance degradation only happens once I enter insert mode.
I'm painfully aware :sweat_smile:
FWIW a simple reproducer is just a few hundred lines of `a = a`. At about 400 lines it starts to become noticeable for me.
I generated the s-expression for a ~300 line Haskell file. It seems large, but I don't have a frame of reference. You can preview the file here. Also, if you generate the graph locally (you need graphviz), the tree looks unbalanced and very deep; I couldn't upload the file because it was 200 MB, which is above GitHub's limit. The command to generate the debug graph is `tree-sitter parse --debug-graph profile.hs`.
thanks for looking into this!
> FWIW a simple reproducer is just a few hundred lines of `a = a`. At about 400 lines it starts to become noticeable for me.
Interestingly, for a program like this the size of the state grows linearly with the number of expressions: an `a = a` expression generates a 4-line s-expression, so 400 of these create a 1600-line graph, which is about as performant as you can expect without doing something very clever. See the full results here. Perhaps the nvim-treesitter client (the client I am using) needs to query/update the state less frequently? I imagine you could delay updating/reading from state until the user has exited insert mode? This is just a hypothesis.
It must be something other than that; there are other grammars which produce 1600-line s-expressions with no impact on editor performance.
I wish I was more familiar with how tree-sitter parsers work (maybe I will investigate that next), but perhaps there is something specific to Haskell that triggers re-computation of massive parts of the state space? A simple hypothesis would be that, based on the current grammar, tree-sitter can't localize the changes to the expression where they occur and therefore has to recompute the whole graph? Or maybe the grammar is bigger than other languages', so the parser has to do more conditional checks?
the hypothesis with the recomputation on change sounds very plausible
My current working hypothesis leads me to believe that the issue is with `nvim-treesitter`, then. I think the changes that are required to make this plugin usable are:
I'm surprised that the third point isn't mentioned – isn't that the biggest selling point of TS?
Yeah, I thought so!
Also seeing lots of lag in the `helix` editor.
The helix editor is written in Rust, and has tree-sitter support built in, so there's very little overhead.
It also seems to be running tree-sitter synchronously with every inserted character, the result of which is that I can type faster than the editor updates.
I've uploaded a flamegraph of running `helix` and typing into a Haskell buffer for a few seconds. Maybe that'll provide some insight? It seems to be spending rather a lot of time in `tree_sitter_haskell_external_scanner_scan`.
that is helpful, thanks!
since the graph showed that a lot of time is spent in `logic::symop`, I tried replacing the parser combinators in there with a `switch`, since it was doing 10 checks on a single character in a row. alas, didn't change much.
one thing I noticed in nvim is that when holding a key in insert mode, the lag increases with the number of characters inserted. wonder if it would be possible to abort the current tree edit if it hasn't completed by the time the next character arrives, or at least batch characters for the next edit
edit: looks like that only happens when the popupmenu is visible
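On the abort/batch idea: the tree-sitter runtime API does expose a parse timeout and a cancellation flag (`ts_parser_set_timeout_micros` and `ts_parser_set_cancellation_flag` in `tree_sitter/api.h`), so a client could in principle give up on a slow re-parse and retry with the accumulated edits on the next keystroke. Whether nvim-treesitter uses or exposes this is a separate question; a rough sketch of the idea, with made-up function and variable names:

```cpp
#include <tree_sitter/api.h>

// Rough sketch: bound the time spent re-parsing per keystroke. Assumes the
// parser already has the Haskell language set and old_tree is the previous parse.
TSTree *reparse_with_budget(TSParser *parser, TSTree *old_tree,
                            const char *source, uint32_t length) {
  // Give up after 5ms; ts_parser_parse_string returns NULL when the timeout fires.
  ts_parser_set_timeout_micros(parser, 5000);
  TSTree *new_tree = ts_parser_parse_string(parser, old_tree, source, length);
  // On timeout the client keeps using old_tree (stale highlights) and tries
  // again on the next keystroke, which effectively batches the edits.
  return new_tree ? new_tree : old_tree;
}
```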
at some point I'll have to accept that writing an ad-hoc functional parser in c++ might not have been a suitable choice for the scanner
I added timing output to the example-project parsing script, and that change in `symop` did in fact have an impact on performance: about 5-10% faster on `semantic` (45 vs 41 seconds)
Is there a reason to have an external scanner? Are external scanners faster?
no, the grammar doesn't handle some things like indentation well
but feel free to try to move parts of the scanner to the grammar :smile:
lol, yeah, I am terrible at C++, but am trying to think of any small contributions I could make that might improve performance.
In `symop`, if I'm interpreting the graph correctly, half the time is spent creating a closure for the scanner (`operator+`), then half the time is spent running the scanner. The creation step involves a lot of calls to `malloc`.
In the past, I've had excellent results using `re2c` to generate really fast C scanners, but creating a Haskell scanner sounds like a big project...
@414owen can you create a new graph with current HEAD?
and could you figure out a way to create that graph for a run of `script/parse-example` and teach me? :smile:
> In `symop`, if I'm interpreting the graph correctly, half the time is spent creating a closure for the scanner (`operator+`), then half the time is spent running the scanner. The creation step involves a lot of calls to `malloc`.
so we'll have to figure out how to inline those functions properly
Sure, I'll create one now. Re: mallocs/closures. Is it possible to create these closures in advance, so they're not happening on every keypress?
```cpp
Parser operator+(Parser fa, Parser fb) {
  return [=](State & state) {
    auto res = fa(state);
    return res.finished ? res : fb(state);
  };
}
```
if someone who's good at C++ can weigh in on the performance characteristics of this…
funny, using `const Parser &` worsens the runtime by 25% :sweat_smile:
and using `[&]` causes it to crash. there's got to be some combination of qualifiers that prevents allocation
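For what it's worth, a sketch of one way to sidestep the allocation entirely: keep the concrete closure types instead of erasing them into `std::function`, so combining two parsers just builds a small object on the stack and the calls can be inlined. It's not a drop-in patch, because `Parser` stops being a single type and recursive parsers would still need some erasure somewhere; `alt` is a made-up name and `State` is the scanner's existing state type, mirroring the `operator+` above:

```cpp
#include <utility>

// Sketch only: alternation without std::function. FA and FB keep their
// concrete closure types, so there is no type erasure and no heap allocation
// when two parsers are combined.
template <typename FA, typename FB>
auto alt(FA fa, FB fb) {
  return [fa = std::move(fa), fb = std::move(fb)](State &state) {
    auto res = fa(state);
    return res.finished ? res : fb(state);
  };
}
```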
~~Okay, here's the flamegraph for `parse-example`.~~
These are the instructions I followed to create it: https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
For the previous one I just used `cargo-flamegraph`.
On NixOS everything you need is in the `linuxPackages.perf` and `flamegraph` packages.
edit: flamegraph didn't come out right...
thanks!
Um, I don't seem to have anything in my `examples` folder, so `parse-example` wasn't doing much. Is there an example usage?
I think it's best to flamegraph a binary, rather than a shell script, too.
yeah, you'll first have to run `script/parse-examples`, which clones those repos. I'm currently writing a separate script for creating a flamegraph that runs `perf` on `tree-sitter parse` directly
Exactly how pure are these parsers? For, e.g., `Parser symop(Symbolic type);`, could its `Parser` result be memoized for each `Symbolic` type? Or even just created once for each `Symbolic` type at the start of the program? No idea whether that makes sense in C++!
`Parser` is just an alias for `function<Result(State&)>`
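To make the memoization question concrete, something along these lines; purely a sketch, where `symop_impl` is a made-up stand-in for the existing construction logic, and it assumes `Symbolic` is usable as a map key (e.g. an enum) and that the built parsers hold no per-parse mutable state:

```cpp
#include <map>

// Sketch only: build the Parser for each Symbolic value once and reuse it,
// instead of reconstructing the closure on every scan.
Parser symop(Symbolic type) {
  static std::map<Symbolic, Parser> cache;
  auto it = cache.find(type);
  if (it == cache.end())
    it = cache.emplace(type, symop_impl(type)).first;
  return it->second;
}
```

Note that returning the cached `Parser` by value still copies the `std::function`, so this only removes the construction cost, not the copying.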
interesting
when I run flamegraph I mostly get `[unknown]` entries. any idea how to instruct tree-sitter to compile the parser with debug symbols?
my impression so far is that since `std::function` is an object that stores all of its closure's captured variables, and most of those variables are again functions, and all of those functions are stack-allocated in other parser objects, there's just a lot of copying and allocations going on, especially when, as you noted, the parsers have value parameters like `Symbolic::type` and the current indent. `std::function` is probably not all that suited for functional programming
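A standalone illustration of that point (not scanner code): most `std::function` implementations only have a small inline buffer, so any callable whose captures don't fit is heap-allocated when the `std::function` is constructed, and copying the `std::function` copies those captures again:

```cpp
#include <cstdio>
#include <functional>

// A deliberately large capture; the exact small-buffer threshold is
// implementation-defined, but 64 bytes exceeds it on common standard libraries.
struct BigCapture {
  char data[64] = {0};
};

int main() {
  BigCapture big;
  auto lambda = [big] { return static_cast<int>(big.data[0]); };
  std::function<int()> erased = lambda;  // capture copied, most likely onto the heap
  std::function<int()> copy = erased;    // copied (and likely allocated) again
  std::printf("%d\n", copy());
  return 0;
}
```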
> any idea how to instruct tree-sitter to compile the parser with debug symbols?
Doesn't tree-sitter just generate `.c` files, which you can then compile with `cc -g`?
This is how `helix` is compiling tree-sitter grammars: https://github.com/helix-editor/helix/blob/a4641a8613bcbe4ad01d28d3d2a6f4509fef96a9/helix-syntax/build.rs#L91-L100
pretty similar to the command line that `tree-sitter parse` uses :disappointed:
Here's a flamegraph for my memoized branch. It really didn't help that much.
we really need a C++ pro to assess what the right way to use `std::function` is
on Stack Overflow people say those should be optimized away, but maybe not in the way I've used them
Yeah, I stepped through a scan cycle in gdb, and pretty much everything seems to be in `std::function`...
> at some point I'll have to accept that writing an ad-hoc functional parser in c++ might not have been a suitable choice for the scanner
I think this might be right on the money, unfortunately.
I have asked my friend @avery-laird to take a look; he has lots of experience with C++. @tek What compiler are you using to compile the project? I imagine `gcc`.
Good morning,
I like tree-sitter-haskell very much, but it seems to slow down considerably once a file passes a certain number of characters. I don't actually know whether the cause is the file's pattern complexity or something else, but this is very penalizing...
Here's an example of a slow file if you want to reproduce it:
Configuration:
Thank you for your help!