leanprover / lean4

Lean 4 programming language and theorem prover
https://lean-lang.org
Apache License 2.0
4.76k stars 428 forks source link

Exponential unfolding behaviour in Meta.isDefEq and Meta.whnf reduction #5596

Open ymherklotz opened 2 months ago

ymherklotz commented 2 months ago

Prerequisites

Please put an X between the brackets as you perform the following steps:

Description

In the following reduction with a definition of mergeWith that has been adapted from mergeWith from Batteries.RBSet onto List, there seems to an exponential number of reductions with respects to the size of the first or second list.

def _root_.List.mergeWith {α β} [BEq α] (t₁ t₂ : List (α × β)) : List (α × β) :=
  t₂.foldl (init := t₁) fun t₁ a₂ =>
    t₁.cons <| match t₁.find? (a₂.1 == ·.1) with | some a₁ => a₁ | none => a₂

/--
info: [reduction] unfolded declarations (max: 32752, num: 1):
  [reduction] Nat.rec ↦ 32752[reduction] unfolded reducible declarations (max: 32752, num: 1):
  [reduction] Nat.casesOn ↦ 32752use `set_option diagnostics.threshold <num>` to control threshold for reporting counters
-/
#guard_msgs(info) in
set_option diagnostics true in
set_option diagnostics.threshold 32000 in
example (T : Nat) : [(2, T)].mergeWith (List.replicate 13 (1, 0)) 
                    = (List.replicate 13 (1, 0)).concat (2, T) := by rfl

Increasing the size of the list to 14 in List.replicate will double the number of reductions of Nat.rec to ~64000 and it quickly becomes intractible. I feel like it's due to the same expression being reduced multiple times and not being cached/shared properly during the reduction.

One thing to note is that this is using a free variable T : Nat to force use of Meta.isDefEq instead of Kernel.isDefEq, because the latter doesn't seem to suffer from the same issue.

Context

We are using merging of Batteries.RBMap to build up a circuit from a description, and we were having trouble proving definitional equality using eq_refl. A link to a relevant discussion is here: Link to Zulip conversion. We then took the advice and modified eq_refl to use Kernel.isDefEq even with the presence of free variables, which seems to work well.

However, we are facing more issues because we are typing terms using elements of the map, and this leads to unwanted (and slow) reduction everywhere (with Meta.whnf this time), including in elaboration and delaboration.

Steps to Reproduce

Link to example

  1. Open link to example, and see that currently Nat.rec is unfolded ~32000 times.
  2. Change 13 to 14 in List.replicate and see that the number of unfolds doubles.

Expected behavior: I would expect a polynomial number of unfolds as the size of the list increases.

Actual behavior: Instead, the number of unfolds seems to be exponential in the size of the lists.

Versions

4.12.0-nightly-2024-10-02

Impact

Add :+1: to issues you consider important. If others are impacted by this issue, please ask them to add :+1: to it.

nomeata commented 1 month ago

Not a complete analysis, but I got curious, and there is definitely a lack of sharing: The foldl is duplicating a bunch of term that never get forced, in particular the t1, and it seems that this causes repeated evaluation.

The problem disappears if one makes the foldl explicit, and then hoists out the match:

def _root_.List.mergeWith {α β} [BEq α] (t₁ t₂ : List (α × β)) : List (α × β) := go t₁ t₂
  where 
    go t₁
    | [] => t₁
    | a₂::t₂ => 
      let k x := go (t₁.cons x) t₂
      match t₁.find? (a₂.1 == ·.1) with
      | some a₁ => k a₁
      | none => k a₂