Open arbimo opened 2 years ago
I played a bit with the PDDLReader and I managed to get the time to ~1.3s on my machine (from ~6s) with the attached patch. I think there is still some opportunity for optimization of the grammar, but for very large files pyparsing performance is not on par with other parsers...
@arbimo Do you think we can close this, or is the performance still not good enough?
After the PR, the runtimes are now mostly acceptable for the problems we have in tests. Those remain small compared to what can be found in many IPC benchmarks (not to mention the so-called "hard-to-ground" instances).
So I suspect that it will show up again, sooner rather than later, if people start exploiting the UP. At that point it would be better to already have an open issue. We could also close this one and create a more targeted one with some pointers on how to improve the parser.
It seems that the PDDL parser has significant performance problems when parsing non-trivial problems. Below is the runtime in seconds to parse some HDDL problems. Runtimes are often above one second and even go up to 23 seconds (!) for instance [23]. This instance in particular has a fairly big (though not completely unusual) initial state expression.
A (very limited) analysis shows that the vast majority of the time is spent in the initial parsing (inside pyparsing itself). Below is the cProfile output for parsing a single domain, sorted by cumulative time.
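For reference, a profile of this kind can be collected with the standard `cProfile` and `pstats` modules. The sketch below is illustrative only: `parse_stub` is a hypothetical stand-in for the real reader call (for example the `PDDLReader` parsing a domain file), and the input string is made up.

```python
import cProfile
import io
import pstats


def parse_stub(text: str) -> list:
    """Hypothetical stand-in for the real parser call: splits a
    parenthesised expression into tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def profile_call(func, *args, top: int = 10) -> str:
    """Run func(*args) under cProfile and return a text report of the
    `top` entries, sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up input; a real run would pass the actual parse function and file.
    print(profile_call(parse_stub, "(at truck1 depot) " * 10_000))
```

Sorting by cumulative time puts the top-level entry points first, which makes it easy to see whether the time is spent inside the parsing library or in the code that builds the problem afterwards.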
I did not see an obvious fix for this: the expression parsing is actually the one defined by pyparsing (`nestedExpr`). I tried the apparently common trick of `enablePackrat`, which actually increases the parse time to 2 minutes for problem [23].
To reproduce, you can replace the `test_hddl_parsing` method by this one. The commented lines would in addition give you a profile of each parse.