leanprover / lean4

Lean 4 programming language and theorem prover
https://lean-lang.org
Apache License 2.0
4.69k stars 421 forks source link

modifying TC synth order can give 2% speedup of mathlib #5333

Closed jcommelin closed 1 month ago

jcommelin commented 1 month ago

The mathlib PR https://github.com/leanprover-community/mathlib4/pull/16646 gives a 2% speedup of mathlib, coming from a 6% speedup of TC synth. This is achieved by changing the TC synth order of certain instances, using mechanisms that were likely not intended as user-facing API, and deviating from the documented behaviour of TC synth. (The PR was merged into mathlib, but later reverted, in order to allow for more research on this topic.)

The following MWE displays behaviour that we believe is analogous to what happens with mathlib's algebraic hierarchy. (Code snippet adapted from https://leanprover.zulipchat.com/#narrow/stream/287929-mathlib4/topic/Ways.20to.20speed.20up.20Mathlib/near/469291144)

import Lean.Elab.Command
import Lean.Elab.MatchExpr
import Lean.Meta.Instances

-- # Example hierarchy

class A (n : Nat) where

instance [A n] : A n.succ where

class B [A 20050] where

class C [A 20000] extends B where

instance : A 20050 where

class D where

instance inst1 : D where
instance inst2 [B] : D where

-- # TC synth times out by default

/--
error: (deterministic) timeout at `typeclass`, maximum number of heartbeats (20000) has been reached
Use `set_option synthInstance.maxHeartbeats <num>` to set the limit.
Additional diagnostic information may be available using the `set_option diagnostics true` command.
-/
#guard_msgs in
#synth D

The beginning of the trace looks like

[Meta.synthInstance] 💥️ D ▼
  [] new goal D ▼
  [instances] #[inst1, @inst2] 
  [] ✅️ apply @inst2 to D ▼
  [tryResolve] ✅️ D ≟ D 
  [] new goal B ▼
  [instances] #[@C.toB] 
  [] ✅️ apply @C.toB to B ▼
  [tryResolve] ✅️ B ≟ B 
  [] new goal A 20000 ▼
  [instances] #[@instASucc] 
  [] ✅️ apply @instASucc to A 20000 ▼
  [tryResolve] ✅️ A 20000 ≟ A (Nat.succ 19999) 
  [] new goal A 19999 ▼
  [instances] #[@instASucc] 
  [] ✅️ apply @instASucc to A 19999 ▼
  [tryResolve] ✅️ A 19999 ≟ A (Nat.succ 19998) 
  [] new goal A 19998 ▶
  [] ✅️ apply @instASucc to A 19998 ▶
  [] ✅️ apply @instASucc to A 19997 ▶
<... snip>

When Lean applies C.toB it first attempts to find an instance for A 20000, and times out while doing so. So it never gets to look for an instance of C, which does not exist.

-- # Meta code to specify synth order

namespace Lean.Meta

/-- Specify the synthesization order for instances. -/
def Instances.setSynthOrder (declName : Name) (synthOrder : Array Nat) : CoreM Unit := do
  let some entry := Meta.instanceExtension.getState (← getEnv) |>.instanceNames.find? declName |
    throwError "'{declName}' does not have [instance] attribute"
  instanceExtension.add { entry with synthOrder } entry.attrKind

end Lean.Meta

namespace Lean.Elab.Command

open Meta

/-- Specify the synthesization order for instances. -/
elab "set_synth_order " name:ident synthOrder:term : command => do
  let q := Term.MatchExpr.toDoubleQuotedName name
  elabCommand (← `(command| run_meta Instances.setSynthOrder $q $synthOrder))

set_synth_order instAddNat #[]

end Lean.Elab.Command

-- # TC synth succeeds with a new order

set_synth_order C.toB #[1, 0]

/-- info: instD -/
#guard_msgs in
#synth D

Now the trace looks like

[Meta.synthInstance] ✅️ D ▼
  [] new goal D ▼
  [instances] #[inst1, @inst2] 
  [] ✅️ apply @inst2 to D ▼
  [tryResolve] ✅️ D ≟ D 
  [] new goal B ▼
  [instances] #[@C.toB] 
  [] ✅️ apply @C.toB to B ▼
  [tryResolve] ✅️ B ≟ B 
  [] no instances for C ▼
  [instances] #[] 
  [] ✅️ apply inst1 to D ▼
  [tryResolve] ✅️ D ≟ D 
  [answer] ✅️ D 
  [] result inst1 

where Lean immediately bails on C.toB and continues with inst1.

sgouezel commented 1 month ago

To add some information, the issue is that some typeclass searches are very costly in mathlib, typically the ones that are very generic and therefore lead to a gigantic search, while other instances are much more narrow and so much more likely to fail quickly when they fail (or to succeed quickly). The manual fixes that gave a 2% speedup in mathlib changed the search order for some instances, starting with the narrow ones instead of the generic ones.