Closed dcerys closed 7 years ago
If debug logging is enabled, I see a significant number of "CHECK" statements, which is generated by tpn/remove-superfluous
.
Work-in-progress update https://github.com/tmarble/pamela/commit/5bdd2b398d35798de4bade27c41b01e7348b48d4
Can confirm that there is no need to comment out the dbg-println
statements: the macro indeed does not eval the form when the selected debug threshold is not set.''
As simple test can be run with
(reset-gensym-generator)
(time (pamela "-v" "-v" "-l" "trace" "-i" "test/pamela/issue-115.pamela" "-o" "issue-115" "-f" "edn" "-t" "(test.main)" "htn"))
...
"Elapsed time: 6152.743897 msecs"
By adding criterium to the :test
scope it's possible to get much more statistically valid information:
pamela.cli> (require 'criterium.core)
pamela.cli> pamela.utils/*dbg-println-level*
1
pamela.cli> (reset-gensym-generator)
0
pamela.cli> (criterium.core/with-progress-reporting (criterium.core/bench (pamela "-v" "-v" "-l" "trace" "-i" "test/pamela/issue-115.pamela" "-o" "issue-115" "-f" "edn" "-t" "(test.main)" "htn") :samples 5))
...
Evaluation count : 5 in 5 samples of 1 calls.
Execution time mean : 1.012452 sec
Execution time std-deviation : 10.573710 ms
Execution time lower quantile : 999.635369 ms ( 2.5%)
Execution time upper quantile : 1.023445 sec (97.5%)
Overhead used : 1.982120 ns
Using 3 parallel statements debug level 4 and below runs for about 1 second. Level 5 runs for 3 seconds. Level 6 runs for about 22 seconds. Thus using the default debug level 3 should not incur excess overhead.
After removing the exponential complexity the performance results are evident:
trace 6, parallel 2
"Elapsed time: 3326.549989 msecs"
"Elapsed time: 3743.666514 msecs"
"Elapsed time: 3289.047951 msecs"
trace 6, parallel 3
"Elapsed time: 3315.804464 msecs"
"Elapsed time: 3864.005608 msecs"
"Elapsed time: 3393.669642 msecs"
trace 6, parallel 4
"Elapsed time: 3684.990509 msecs"
"Elapsed time: 4090.807653 msecs"
"Elapsed time: 3546.242385 msecs"
trace 6, parallel 5
"Elapsed time: 3924.724666 msecs"
"Elapsed time: 3925.048613 msecs"
"Elapsed time: 3949.488314 msecs"
trace 6, parallel 6
"Elapsed time: 4023.541322 msecs"
"Elapsed time: 4127.849754 msecs"
"Elapsed time: 4133.518941 msecs"
trace 6, parallel 7
"Elapsed time: 4191.771061 msecs"
"Elapsed time: 4162.683773 msecs"
"Elapsed time: 4399.626638 msecs"
trace 6, parallel 8
"Elapsed time: 4035.176058 msecs"
"Elapsed time: 4557.13223 msecs"
"Elapsed time: 4159.997504 msecs"
trace 6, parallel 9
"Elapsed time: 4159.997504 msecs"
"Elapsed time: 5000.702443 msecs"
"Elapsed time: 4273.945647 msecs"
Generating a HTN and TPN with a lengthy
parallel
statement takes a very long time. To make matters worse, the execution time seems to be proportional to the length ofparallel
statement. In the following test case, you can comment out some of the calls to(something-simple)
, and find that the execution time of the HTN/TPN generation is:(something-simple)
: 20 seconds(something-simple)
: 39 seconds(something-simple)
: 86 seconds(something-simple)
: 195 secondsNote that changing the
parallel
to asequence
changes the generation time to be roughly independent of the number of statements.Note that using
perform-15-minutes-of-activities
orperform-activities
as the root task results in a non-slow HTN/TPN generation, so the problem seems to be the combination of aparallel
called by one (or more)sequence
's.