Closed Halbaroth closed 2 months ago
EDIT: this comment is outdated. I completely rewrote this algorithm.
The algorithm I used to ensure the termination of the model generation isn't sufficient. Last Friday, I found very simple examples for which the generation loops infinitely. For instance:
(set-option :produce-models true)
(set-logic ALL)
(declare-datatypes ((u 0) (v 0) (w 0)) (
(
(c1 (d1 v))
(c2 (d2 u)))
(
(c3 (d3 u))
(c4 (d4 w)))
(
(c5 (d5 Int)))
))
(declare-const a u)
(assert ((_ is c1) a))
(check-sat)
(get-model)
In fact we cannot find an appropriate order with the information we got in the module Ty
. We need to pass all the nests at once during the parsing in order to perform a topological order on the nests:
type case = string * (string * t) list
type adt = string * case list * t list
let find_map (type c) =
let exception Found of c
in fun (p : 'a -> 'b -> c option) (h : ('a, 'b) Hashtbl.t) ->
try
Hashtbl.iter
(fun k v ->
match p k v with Some c -> raise @@ Found c | None -> ()
) h;
raise Not_found
with Found c -> c
let count_in_set (set : (string, case list * t list) Hashtbl.t) c =
Lists.sum
(fun (_, ty) ->
match ty with
| Tadt (name, _) when Hashtbl.mem set (Hstring.view name) -> 1
| _ -> 0
) (snd c)
let cons_weight (set : (string, case list * t list) Hashtbl.t) c1 c2 =
let c = count_in_set set c1 - count_in_set set c2 in
if c <> 0 then c
else
List.compare_lengths (snd c1) (snd c2)
let topological_sort (adts : adt list) =
let seens : adt list ref = ref [] in
let set : (string, case list * t list) Hashtbl.t = Hashtbl.create 17 in
List.iter (fun (name, cases, params) -> Hashtbl.add set name (cases, params)) adts;
while Hashtbl.length set > 0 do
let adt, (cases, params) =
find_map (fun adt (cases, params) ->
let b =
List.exists (fun (_ , case) ->
List.for_all (fun (_, ty) ->
match ty with
| Tadt (n, _) ->
not @@ Hashtbl.mem set (Hstring.view n)
| _ -> true
) case
) cases
in
if b then Some (adt, (cases, params)) else None
) set
in
let cases = List.stable_sort (cons_weight set) cases in
Hashtbl.remove set adt;
seens := (adt, cases, params) :: !seens
done;
List.rev !seens
let declare_adts adts =
let adts = topological_sort adts in
List.iter (fun (name, cases, params) ->
let name = Hstring.make name in
let cases =
List.map (fun (c, destrs) ->
let destrs =
List.map (fun (d, dty) -> Hstring.make d, dty) destrs
in
{ constr = Hstring.make c; destrs }
) cases
in
Decls.add name params (Adt cases)
) adts
It seems to be okay. I have still some cleanup to perform before submitting the PR for reviews.
I think that we also have to produce unique identifiers instead of Hstring
for constructors. As I explained to @bclement-ocp last week, I believed we could trigger a bug with some push/pop and ADT's declarations sharing constructor names.
This bug isn't easy to trigger because after a pop
statement, the Hstrings
of the constructors of a dead ADT are dead values for OCaml's runtime and the GC may free them before the declaration of a new ADT using these constructor names.
Finally, this PR is ready for a first review! I rewrote the algorithm for sorting constructors, see the new module Nest
.
As I explained offline, it's not easy to avoid a global hash table to store the total order of constructors with the current state of the codebase. I plan to write a better code after removing the legacy frontend.
The PR have been tested on ae-format
, UFDT
and QF_DT
and doesn't introduce regression or slow down the solver.
There is a bug in the module Nest
. I will investigate later.
I fixed the bug.
Waiting for an updated version to give formal approval; the Nest module looks OK. As discussed offline, if we can adapt the code to store priorities for types rather than constructors I think that would be better (especially w.r.t. polymorphism) but I would be OK with merging with the current algo.
I think I prefer to merge this early version and apply your suggestion in a separated PR. I will open an issue with example where the current algorithm fails to terminate.
@bclement-ocp is it good for you?
This PR implements the model generation for ADT. The model generation is done by the casesplit mechanism in
Adt_rel
.If
--enable-adts-cs
is present, we do casesplits by choosing a delayed destructor as before after asserting.If there is no more delayed constructors, we look for a domain which contains a tightenable constructor (that is a constructor without payload) and we choose the domain as small as possible as it does in the
Enum
theory.If we are in model generation, we can also choose constructors with payload (but we still choose first constructors without payload).
The termination of the model generation is a bit tricky in the case of mutually recursive ADT. Please see the tests added for some complicated examples. I hope that I caught all the corner cases. To ensure the termination, the basic idea is to sort ADT's constructors in the module
Ty
during the parsing and to use the fact that the SMT-LIB standard only accepts well-founded ADT.We choose constructors in domains with the following order:
This PR is rebased on #1087.