Closed dwrensha closed 3 months ago
Thanks for the feedback! Most of the team is travelling (mostly at ICML) currently but will begin resolving this when we return.
Now having looked at the issue I can say that we've had a misconception of the semantics of (x : α := a)
. Thanks for pointing it out! It shouldn't be too difficult to modify this. Should we modify such definitions where applicable to be top-level, like:
def fqsat (f : ℝ → ℝ) (q : ℚ) : Prop := ContDiff ℝ 1 f ∧ ∃ p : ℚ, p = f q ∧ p.den = q.den
theorem putnam_2008_b5'
: ∀ f : (ℝ → ℝ), (∀ q : ℚ, fqsat f q) ↔ f ∈ putnam_2008_b5_solution :=
sorry
Or maybe
theorem putnam_2008_b5''
(fqsat : (ℝ → ℝ) → ℚ → Prop)
(hfqsat : ∀ f q, fqsat f q ↔ ContDiff ℝ 1 f ∧ (∃ p : ℚ, p = f q ∧ p.den = q.den))
: ∀ f : (ℝ → ℝ), (∀ q : ℚ, fqsat f q) ↔ f ∈ putnam_2008_b5_solution :=
sorry
is a better choice? It may be preferable to keep things contained within the theorem statement as existing ITP interfaces may not handle context outside theorem statement off-the-shelf (but this doesn't necessarily have to be catered to).
In my opinion, adding top-level definitions is more idiomatic. However, adding extra hypotheses is what AlphaProof does: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/imo-2024-solutions/P6/index.html
theorem imo_2024_p6
(IsAquaesulian : (ℚ → ℚ) → Prop)
(IsAquaesulian_def : ∀ f, IsAquaesulian f ↔
∀ x y, f (x + f y) = f x + y ∨ f (f x + y) = x + f y) :
IsLeast {(c : ℤ) | ∀ f, IsAquaesulian f → {(f r + f (-r)) | (r : ℚ)}.Finite ∧
{(f r + f (-r)) | (r : ℚ)}.ncard ≤ c} 2 := by ...
I think we'll opt to follow the AlphaProof convention from that problem for now. It is probably a bit better for existing baselines to work without additional tweaks at the moment, though the formalizations will be less idiomatic. I'm open to changing it in the future though, do let me know if you have any other suggestions for the benchmark. Thanks for pointing this issue out! We'll eventually split the benchmark into (train-valid)/test and would be happy to hear your thoughts on a good way to split it up, especially if you were planning to add proofs to any of these problems to Compfiles.
Will keep the issue open until we have resolved all the affected problems.
All modifications should appear in #193
putnam_2008_b5 and many other problems specify predicates as default values for optional parameters. It seems that this was done to avoid the need to add extra top-level definitions. Unfortunately, it leads to false theorems, because within the
theorem
body, we are not allowed to assume that the optional parameter always has its default value.To show that
putnam_2008_b5
is false as currently stated, here is an example of how to use it to proveFalse
: