cvc5 / cvc5-projects

1 stars 0 forks source link

The benchmarks about performance regressions #278

Open ConfZ opened 3 years ago

ConfZ commented 3 years ago

benchmarks.zip Statistics.csv

Hi. These days, I collected the benchmarks we found that would cause performance regressions. When running these cases, the new version(CVC4-1.9) is slower than the older one(CVC4-1.8, CVC4-1.7). We record all the runtime and the commits caused these issue to the CSV file. I will update this topic if I have new questions. CVC4-1.8 updated some new operators, like "str.in.re" to "str.in_re", so the Zip file includes two folders. The "reg" is for CVC4-1.7 and the "reg-1.8" is for CVC4-1.8 and CVC4-1.9.

Here I have some questions:

  1. Are these issues related to the random seed for string solver?
  2. How to set the random seed for string solver in CVC4?
  3. Most of these benchmarks can trigger the performance regression steadily. Could you please give me some reasons why these issues occur? It may help to make a deeper research.
  4. Are these issues hard to solve?

I will appreciate for your helpful answers.

ajreynol commented 3 years ago

Thanks for the benchmarks. I will keep them in mind for testing.

In general, strings + length constraints is an undecidable problem. Hence, cvc5 uses heuristics by default when dealing with such problems. It is somewhat expected that certain problems may become slower (or even time out) when a new version is released. In general, we aim to improve performance overall, but for individual benchmarks, this is not guaranteed.

Answering questions: 1 - they may be impacted by the random seed, but are more likely caused by other changes to the heuristics in the solver, 2 - you can set the random seed of the underlying SAT solver via --random-seed; you can also set another random seed --seed which is used within the reset of cvc5. 3 - as mentioned, the issues occur when the heuristics used by the solver changes. For instance, it may prioritize one kind of inference above another, or use a different ordering when considering variables. These can have a large impact on performance, and can even impact whether cvc5 will solve the problem at all 4 - yes, for any undecidable logic, it is hard to come up with the right heuristics

FYI, you can try other options for solving these benchmarks, one particularly useful option for strings is --strings-fmf. I typically try this option when cvc5 times out on a string benchmark by default.

ConfZ commented 3 years ago

Thanks for the benchmarks. I will keep them in mind for testing.

In general, strings + length constraints is an undecidable problem. Hence, cvc5 uses heuristics by default when dealing with such problems. It is somewhat expected that certain problems may become slower (or even time out) when a new version is released. In general, we aim to improve performance overall, but for individual benchmarks, this is not guaranteed.

Answering questions: 1 - they may be impacted by the random seed, but are more likely caused by other changes to the heuristics in the solver, 2 - you can set the random seed of the underlying SAT solver via --random-seed; you can also set another random seed --seed which is used within the reset of cvc5. 3 - as mentioned, the issues occur when the heuristics used by the solver changes. For instance, it may prioritize one kind of inference above another, or use a different ordering when considering variables. These can have a large impact on performance, and can even impact whether cvc5 will solve the problem at all 4 - yes, for any undecidable logic, it is hard to come up with the right heuristics

FYI, you can try other options for solving these benchmarks, one particularly useful option for strings is --strings-fmf. I typically try this option when cvc5 times out on a string benchmark by default.

Thank you very much for your reply. These details are very helpful!!