Open rokicki opened 2 months ago
Here's some more information. First, if we fix tidyint on the example above, here are some runtimes:
tidyint seconds
1073741824 196.09
536870912 216.89
268435456 201.30
134217728 200.10
67108864 197.84
33554432 192.68
16777216 191.55
8388608 193.37
4194304 190.36
2097152 202.54
1048576 224.33
524288 271.45
262144 352.67
131072 529.34
65536 1031.40
32768 1619.12
16384 3077.74
8192 9003.65
For the default of 100, I would expect a runtime of greater than four days.
Next, if we use my tuning trick, which you can see as code here:
https://github.com/gap-packages/kbmag/commit/b2661cf8bb707db4bc8a07e7c6ae0b03f5c4cd9a
The different values of the divisor give runtimes as follows:
div seconds
1 254.07
2 256.69
3 278.61
4 296.12
5 316.09
6 330.75
7 342.76
8 371.53
9 414.66
10 379.60
11 413.01
12 425.20
13 452.45
14 463.77
15 467.58
16 501.13
20 529.56
30 754.84
40 904.07
50 1087.57
60 1295.49
70 1412.14
I believe 10 is a reasonable compromise.
The following input file makes kbmag run seemingly forever:
This is because the number of equations gets into the millions, and tidy is called every 100 new equations, and each call to tidy scans each of the old equations. So essentially kbmag is quadratic in the number of equations.
If you give the -t 1000000 option, the total runtime is under 6 minutes.
The Monoid Automata Factory automata program does not have this defect. (But MAF takes nearly six hours to fully process this file.)
I suggest it is meaningless to have tidyint be less than about 10% of the number of existing equations, since the reduction in equations is usually relatively small, and running tidy so frequently just kills the runtime. I suggest every time we run tidy, if the current number of equations is > 10 * tidyint, we go ahead and set tidyint to the number of equations divided by 10.
This will eliminate the quadratic runtime, not affect the runtime of quick runs at all, but keep tidyint reasonably constrained.
(In general, relying on user setting of tunables to get reasonable algorithmic complexity is bad practice; the code should take reasonable steps to keep its runtime as intended. Most users will not know how to set tunables.)
(There are other problems in build_wd_fsa, but we will get to those in good time.)