feat(ADT): model generation

Halbaroth commented 4 months ago

This PR implements the model generation for ADT. The model generation is done by the casesplit mechanism in Adt_rel.

If --enable-adts-cs is present, we do casesplits by choosing a delayed destructor as before after asserting.
If there is no more delayed constructors, we look for a domain which contains a tightenable constructor (that is a constructor without payload) and we choose the domain as small as possible as it does in the Enum theory.
If we are in model generation, we can also choose constructors with payload (but we still choose first constructors without payload).
The termination of the model generation is a bit tricky in the case of mutually recursive ADT. Please see the tests added for some complicated examples. I hope that I caught all the corner cases. To ensure the termination, the basic idea is to sort ADT's constructors in the module Ty during the parsing and to use the fact that the SMT-LIB standard only accepts well-founded ADT.

We choose constructors in domains with the following order:
- Constructor with the less destructors using the same nest;
- Constructor with the less destructors using another nest of the same mutually recursive declaration;

This PR is rebased on #1087.

Halbaroth commented 4 months ago

EDIT: this comment is outdated. I completely rewrote this algorithm.

The algorithm I used to ensure the termination of the model generation isn't sufficient. Last Friday, I found very simple examples for which the generation loops infinitely. For instance:

(set-option :produce-models true)
(set-logic ALL)
(declare-datatypes ((u 0) (v 0) (w 0)) (
  (
    (c1 (d1 v))
    (c2 (d2 u)))
  (
    (c3 (d3 u))
    (c4 (d4 w)))
  (
    (c5 (d5 Int)))
))
(declare-const a u)
(assert ((_ is c1) a))
(check-sat)
(get-model)

In fact we cannot find an appropriate order with the information we got in the module Ty. We need to pass all the nests at once during the parsing in order to perform a topological order on the nests:

type case = string * (string * t) list
type adt = string * case list * t list

let find_map (type c) =
  let exception Found of c
  in fun (p : 'a -> 'b -> c option) (h : ('a, 'b) Hashtbl.t) ->
    try
      Hashtbl.iter
        (fun k v ->
           match p k v with Some c -> raise @@ Found c | None -> ()
        ) h;
      raise Not_found
    with Found c -> c

let count_in_set (set : (string, case list * t list) Hashtbl.t) c =
  Lists.sum
    (fun (_, ty) ->
       match ty with
       | Tadt (name, _) when Hashtbl.mem set (Hstring.view name) -> 1
       | _ -> 0
    ) (snd c)

let cons_weight (set : (string, case list * t list) Hashtbl.t) c1 c2 =
  let c = count_in_set set c1 - count_in_set set c2 in
  if c <> 0 then c
  else
    List.compare_lengths (snd c1) (snd c2)

let topological_sort (adts : adt list) =
  let seens : adt list ref = ref [] in
  let set : (string, case list * t list) Hashtbl.t = Hashtbl.create 17 in
  List.iter (fun (name, cases, params) -> Hashtbl.add set name (cases, params)) adts;
  while Hashtbl.length set > 0 do
    let adt, (cases, params) =
      find_map (fun adt (cases, params) ->
          let b =
            List.exists (fun (_ , case) ->
                List.for_all (fun (_, ty) ->
                    match ty with
                    | Tadt (n, _) ->
                      not @@ Hashtbl.mem set (Hstring.view n)
                    | _ -> true
                  ) case
              ) cases
          in
          if b then Some (adt, (cases, params)) else None
        ) set
    in
    let cases = List.stable_sort (cons_weight set) cases in
    Hashtbl.remove set adt;
    seens := (adt, cases, params) :: !seens
  done;
  List.rev !seens

let declare_adts adts =
  let adts = topological_sort adts in
  List.iter (fun (name, cases, params) ->
      let name = Hstring.make name in
      let cases =
        List.map (fun (c, destrs) ->
          let destrs =
            List.map (fun (d, dty) -> Hstring.make d, dty) destrs
          in
          { constr = Hstring.make c; destrs }
        ) cases
      in
      Decls.add name params (Adt cases)
    ) adts

It seems to be okay. I have still some cleanup to perform before submitting the PR for reviews.

I think that we also have to produce unique identifiers instead of Hstring for constructors. As I explained to @bclement-ocp last week, I believed we could trigger a bug with some push/pop and ADT's declarations sharing constructor names. This bug isn't easy to trigger because after a pop statement, the Hstrings of the constructors of a dead ADT are dead values for OCaml's runtime and the GC may free them before the declaration of a new ADT using these constructor names.

Halbaroth commented 3 months ago

Finally, this PR is ready for a first review! I rewrote the algorithm for sorting constructors, see the new module Nest.

As I explained offline, it's not easy to avoid a global hash table to store the total order of constructors with the current state of the codebase. I plan to write a better code after removing the legacy frontend.

The PR have been tested on ae-format, UFDT and QF_DT and doesn't introduce regression or slow down the solver.

Halbaroth commented 3 months ago

There is a bug in the module Nest. I will investigate later.

Halbaroth commented 3 months ago

I fixed the bug.

Halbaroth commented 2 months ago

Waiting for an updated version to give formal approval; the Nest module looks OK. As discussed offline, if we can adapt the code to store priorities for types rather than constructors I think that would be better (especially w.r.t. polymorphism) but I would be OK with merging with the current algo.

I think I prefer to merge this early version and apply your suggestion in a separated PR. I will open an issue with example where the current algorithm fails to terminate.

Halbaroth commented 2 months ago

@bclement-ocp is it good for you?

OCamlPro / alt-ergo

feat(ADT): model generation #1093