Extensive benchmark for reasoning about regular properties

This repository contains benchmark automata that can be used to evaluate libraries accepting non-deterministic finite automata in .mata format.

Usage

Similarly, to use .mata files you will need either some format converter or parser of either of the formats.

File formats

.mata is our own format (see AUTOMATAFORMAT.md) for various types of automata (AFA, NFA, DFA ,etc.)

Benchmark description

b-smt contains 330 string constraints from the Norn and SyGuS-qgen, collected in SMT-LIB benchmark, that fall in BRE. This includes following directories:
- bool_comb/ere/QF_SLIA_Norn
- bool_comb/ere/QF_S_sygus_qgen
b-regexcontains 500 problems, obtained analogously as in [MFPS'17] and [APLAS'21], from RegExLib. This includes following directories:
- email_filter
b-hand-made has 56 difficult handwritten problems from [PLDI'21] containing membership in regular expressions extended with intersection and complement. They encode (1) date and password problems, (2) problems where Boolean operations interact with concatenation and iteration, and (3) problems with exponential determinization. This includes following directories:
- bool_comb/ere/boolean_and_loops
- bool_comb/ere/date
- bool_comb/ere/det_blowup
- bool_comb/ere/password
b-armc-incl contains 171 language inclusion problems from runs of abstract regular model checking tools (verification of the bakery algorithm, bubble sort, and a producer- consumer system) of [CIAA'08]. This includes following directories:
- automata-inclusion
b-param has 8 parametric problems. Four are from [TACAS'13]. Another four are from [arXiv'17]. This includes following directories:
- bool_comb/cox
- bool_comb/intersect
z3-noodler contains automata obtained from the z3-noodler string solver together with the intended automata operations that are applied within the tool. The automata were generated during solving constraints from the SMT Competition and includes following directories:
- automatark: (the underlying SMT benchmark QF_S/20230329-automatark-lu)
presburger contains a subset of the NFAs from the benchmark for evaluation of deciding LIA using an automata-based decision procedure originally descibed in <a href="https://link.springer.com/chapter/10.1007/3-540-61064-2_27">CAAP'96 extended to integers. The nondeterminism in these automata is typically a consequence of an existential quantiﬁer ∃x in the input formula; this is usually expressed as a so called projection that corresponds to removing the tracks in the encoding corresponding to variable from the alphabet. Furthermore, the benchmark includes only NFAs that need to be determinized in order to compute the complement of their language (this corresponds to a negation in the input formula). The used LIA formulae can be found in the LIA SMT-COMP category, under the UltimateAutomizer 1 (generated by the tool UltimateAutomizer, see, e.g. <a href="https://link.springer.com/chapter/10.1007/978-3-642-39799-8_2">CAV'13) and tptp 2 (from the The TPTP Problem Library, see, e.g. <a href="https://link.springer.com/article/10.1007/s10817-017-9407-7">Journal of Automated Reasoning 2017 ) folders. This includes following directories:
- presburger/complement

Contributing

Our repositories are open for forking, enabling interested individuals to explore, modify, and build upon our codebase. We wish to extend this benchmark with other problems and sources that could enhance the comparison of all widely used and presented tools.

If you are interested in contributing your own benchmark we suggest following:

Fork this repository.
Add your own benchmark formulae, preferably in .mata format. See our format description above. If you need help with converting your format into one of our formats contact us.
Create a Pull Request from your fork. We will check your contribution, optionally request changes and if we are satisfied, we will merge the changes into our repository.
Please, in your pull request include the following:
- Describe your benchmark: how it was obtained, whether it was generated, the origin of the problem it models, etc.
- If your benchmark is part of some paper or other work, please include citation to these works.

Contact Us

Lukáš Holík (kilohsakul);
Ondřej Lengál (ondrik);
Vojtěch Havlena (vhavlena);
Martin Hruška (martinhruska);
Tomáš Fiedor (tfiedor);
David Chocholatý (Adda0);
Juraj Síč (jurajsic).

Acknowledgements

This work has been supported by the Czech Ministry of Education, Youth and Sports ERC.CZ project LL1908, and the FIT BUT internal project FIT-S-20-6427.

VeriFIT / nfa-bench

readme