Closed vhavlena closed 3 months ago
No incorrect models on kepler
. I haven't checked on woorpje
as this benchmarks involves generation of models using stabilization-based procedure (and potentially other procedures). I get back to it later.
Seems there is no degradation with off models.
# of formulae: 103405
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------- ------ ---- --------- ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-61c998e-bdcb62d 102394 1011 11671.69 0.11 0.02 1.58 62912 39482 537 408 66 0
z3-noodler-9dacadd-bdcb62d 102393 1012 11618.30 0.11 0.02 1.53 62911 39482 537 409 66 0
cvc5-1.1.2 99774 3631 76479.19 0.77 0.00 6.77 60870 38904 2 3620 9 0
z3-4.13.0 97694 5711 132044.66 1.35 0.01 8.11 58876 38818 210 4968 463 70
--------------------------------------------------
Nice, can you also give the table for kepler with model generation on? To see the impact.
Also, are you handling the dummy symbol (minterm representing all symbols not used)? Because it should take some random not used value for models, I had some problems with it in the main decision procedure.
Sure
Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool ✅ ❌ time avg med std sat unsat unknown TO MO+ERR other
-------------------------------- ---- ---- ------ ----- ----- ----- ----- ------- --------- ---- -------- -------
z3-noodler-model-61c998e-bdcb62d 583 4 53.73 0.09 0.02 0.18 283 300 0 3 0 1
z3-noodler-9dacadd-bdcb62d 584 3 15.64 0.03 0.01 0.07 284 300 0 3 0 0
cvc5-1.1.2 347 240 0.54 0.00 0.00 0.00 50 297 0 240 0 0
z3-4.13.0 278 309 24.64 0.09 0.01 0.87 135 143 0 285 24 0
--------------------------------------------------
The only unknown is caused by calling a different procedure where a model generation is not available yet.
It is slower because if a model is not required you can do quite aggressive heuristics. Nielsen transformation handles basically equations only. You work only with symbols occurring in the system and the dummy symbol is not used at all.
The second potential issue in this benchmark is that the models are quite large (32k characters for instance for a single variable). There is a space for improvement, but I don't think it is necessary at the moment.
The second potential issue in this benchmark is that the models are quite large (32k characters for instance for a single variable). There is a space for improvement, but I don't think it is necessary at the moment.
You mean that the resulting string is 32k character long? What is the benchmark?
The second potential issue in this benchmark is that the models are quite large (32k characters for instance for a single variable). There is a space for improvement, but I don't think it is necessary at the moment.
You mean that the resulting string is 32k character long? What is the benchmark?
Well, if you have constraints of the form |x| >= 32000
....
Can I merge it?
Can I merge it?
Yes
This PR introduces model generation for Nielsen-based procedure.