InternLM / InternLM-Math

State-of-the-art bilingual open-sourced Math reasoning LLMs.
Apache License 2.0
423 stars 25 forks source link

Regarding the proofs (and theorems) in Lean-WorkBook #30

Closed realharryhero closed 4 months ago

realharryhero commented 4 months ago

The dataset is quite well-crafted, and I'm trying to verify the theorems and proofs with lean, by placing the theorems and proofs into a lean environment. So I extracted the proofs and theorems into the lean file, using this code (python, and can be exported as jupyter notebook) below. Note that lean_workbook is the file containing the lean workbook JSON file, and output_file is the file (need not be lean) that the code outputs the lean theorem to.

Extraction Code

```python # %% import numpy as np import pandas as pd #Change output file and lean_workbook file as necessary output_file = "/home/mfan/lean_docs_scrape_analysis/leandoc.txt" lean_workbook = "/home/mfan/lean_docs_scrape_analysis/lean_workbook.json" leandataset = pd.read_json(lean_workbook) # %% leandataset # %% largeproofdataset = leandataset[leandataset["proof"].map(len) > 1] # %% largeproofdataset # %% f = open(output_file, 'a', encoding="utf-8") i = 0 for index, row in largeproofdataset.iterrows(): if i >= 100: break print("\n", end="") f.write("\n") print(row["formal_statement"], end="") f.write(row["formal_statement"]) i += 1 # %% f.close() ```

This gives the following theorems:

Theorems

``` theorem lean_workbook_18 : (29 * 31 + 37 - 41) % 4 = 3 := by sorry theorem lean_workbook_26 (x : ℝ) (hx : 0 < x) : x - 1 ≥ Real.log x := by sorry theorem lean_workbook_29 (a b c : ℝ) : (a + c) ^ 2 + 3 * b ^ 2 ≥ 3 * a * b + 3 * b * c + a * c := by sorry theorem lean_workbook_31 (a b c : ℝ) : a^2 + b^2 + c^2 ≥ a * b + b * c + c * a := by sorry theorem lean_workbook_32 : 2 * 1^2 + 1 ≤ 3^1 := by sorry theorem lean_workbook_43 (a b : ℝ) : a ^ 4 + b ^ 4 + 2 ≥ 4 * a * b := by sorry theorem lean_workbook_96 (x y : ℝ) : |(abs x) - (abs y)| ≤ abs (x - y) := by sorry theorem lean_workbook_98 : 2^(0 + 1) = 2 := by sorry theorem lean_workbook_101 (x : ℕ) (hx : x = 2^9 + 1) : x = 513 := by sorry theorem lean_workbook_102 (p : ℝ) (h₀ : 8 * p^2 - 8 * p - 3 ≠ 0) (h₁ : 3 - (4 * p + 3) ≠ 0) : (3 - (4 * p + 3)) / (8 * p^2 - 8 * p - 3) = (-4 * p) / (8 * p^2 - 8 * p - 3) := by sorry theorem lean_workbook_136 (n : ℕ) : (4*n ≡ 4 [ZMOD 12] → n-1 ≡ 0 [ZMOD 3] → n ≡ 1 [ZMOD 3]) := by sorry theorem lean_workbook_139 (a b c d : ℤ) (ha : a > 0 ∧ b > 0 ∧ c > 0 ∧ d > 0) (habc : a + b + c + d = 0) : a + b + c + d = 0 := by sorry theorem lean_workbook_140 (x : ℝ) : x^2 + 4*x - 1 = (x + 2 + Real.sqrt 5) * (x + 2 - Real.sqrt 5) := by sorry theorem lean_workbook_141 : (5:ℝ)^(1/3) + Real.sqrt 2 < (4:ℝ)^(1/3) + Real.sqrt 3 := by sorry theorem lean_workbook_190 (x y z : ℝ) (hx : x ∈ Set.Icc 0 1) (hy : y ∈ Set.Icc 0 1) (hxy : x + y ≤ 1) (hz : z ≥ 1) : 8 * x ^ 2 * y ^ 2 + z * (x + y) ^ 2 ≥ 4 * x * y * z := by sorry theorem lean_workbook_211 (n : ℕ) (x : ℤ) : x ^ (4 * n) + x ^ (2 * n) + 1 = (x ^ (2 * n) + x ^ n + 1) * (x ^ (2 * n) - x ^ n + 1) := by sorry theorem lean_workbook_214 (x y : ℝ) : (x - y) ^ 2 ≥ 0 := by sorry theorem lean_workbook_242 (x y a : ℝ) : (3 * a + 2 * x + y) ^ 2 ≤ 9 * (a + x) * (a + x + y) ↔ 3 * a * (2 * x + y) + 5 * x ^ 2 + 5 * x * y ≥ y ^ 2 := by sorry theorem lean_workbook_248 (x y : ℝ) : x^5 - y^5 = (x - y) * (x^4 + x^3*y + x^2*y^2 + x*y^3 + y^4) := by sorry theorem lean_workbook_251 (x : ℝ) (h₀ : 9 - x^2 ≥ 0) : -3 ≤ x ∧ x ≤ 3 := by sorry theorem lean_workbook_261 : ∀ a b : ℝ, sin a * sin b = 1/2 * (cos (a - b) - cos (a + b)) := by sorry theorem lean_workbook_273 (a : ℝ) (h₀ : 1 ≤ a) : a^4 + a ≥ a^3 + 1 := by sorry theorem lean_workbook_282 (a b c : ℝ) : (a + b + c) ^ 2 ≥ 3 * (b * c + c * a + a * b) := by sorry theorem lean_workbook_283 {a b c : ℤ} (habc : a ≠ 0 ∧ b ≠ 0 ∧ c ≠ 0) (hab : a.gcd b = 1) (hbc : b.gcd c = 1) (hca : c.gcd a = 1) : ∃ k : ℤ, a.gcd (b + k * c) = 1 := by sorry theorem lean_workbook_302 : cos θ = cos θ := by sorry theorem lean_workbook_314 (n : ℕ) : ∑ i in Finset.range (n+1), choose n i = 2 ^ n := by sorry theorem lean_workbook_350 (n k : ℕ) (h₀ : 0 < k ∧ 0 < n) (h₁ : n ≥ k) : Nat.choose n (k - 1) = Nat.choose n (n - k + 1) := by sorry theorem lean_workbook_381 (y z : ℝ) : (y - z) ^ 2 * (2 * y - z) ^ 2 * (3 * y - z) ^ 2 ≥ 0 := by sorry theorem lean_workbook_407 : ∀ n:ℕ, 9 ∣ (10^n - 1) := by sorry theorem lean_workbook_436 : 7^7 < 2^20 := by sorry theorem lean_workbook_485 (x y : ℝ) (h : 0 < x ∧ 0 < y) (h2 : x^3 + y^3 = x - y) : x^2 - y^2 < 1 := by sorry theorem lean_workbook_490 : 3 ∣ c → 3^2 ∣ 3*c → 3 ∣ c := by sorry theorem lean_workbook_504 : 11 ^ 10 ≡ 1 [ZMOD 100] := by sorry theorem lean_workbook_505 : ∀ n : ℤ, 2 ∣ n^5 - n := by sorry theorem lean_workbook_508 (a b d x y : ℤ) (h₁ : d = gcd a b) : d ∣ a * x + b * y := by sorry theorem lean_workbook_513 : a ^ 2 + 1 ≡ 0 [ZMOD 3] → a ^ 2 ≡ -1 [ZMOD 3] := by sorry theorem lean_workbook_519 (a b c : ℝ) (ha : 0 < a) (hb : 0 < b) (hc : 0 < c) (hab : a + b + c = 1) (h : a^2 + b^2 + c^2 = 1) : (bc / (a - a^3) + ca / (b - b^3) + ab / (c - c^3)) ≥ 5 / 2 := by sorry theorem lean_workbook_525 (a b c : ℝ) (ha : 0 < a) (hb : 0 < b) (hc : 0 < c) (hab : a + b ≤ c) : a * b + b ^ 2 + c ^ 2 ≥ (7 / 9) * (a + 2 * b) * c := by sorry theorem lean_workbook_543 : ∀ p : ℕ, p.Prime → ∀ n : ℕ, p - 1 ∣ p ^ n - 1 := by sorry theorem lean_workbook_559 (a b : ℝ) : (1 + a^4) * (1 + b^4) ≥ (1 + a^2 * b^2) * (a^2 + b^2) := by sorry theorem lean_workbook_561 (a b c : ℝ) (ha : a ≠ b) (hb : b ≠ c) (hc : c ≠ a) (hab : a > b) (hbc : b > c) (hca : c > a) : (a * b + b * c + c * a) * (1 / (a - b) ^ 2 + 1 / (b - c) ^ 2 + 1 / (c - a) ^ 2) ≥ 4 := by sorry theorem lean_workbook_565 (x : ℕ) (h₀ : (4 * x) % 128 = 12) : x % 32 = 3 := by sorry theorem lean_workbook_570 (c : ℝ) (hc : 0 ≤ c) (f : ℝ → ℝ) (hf: ∀ x, f x = 1 / (c * x + 1)) : ∀ x y, (x > 0 ∧ y > 0) → f x * f (y * f x) = f (x + y) := by sorry theorem lean_workbook_588 (x : ℝ) : (x^6 - 1)^2 + (x^5 - x)^2 + (x^4 - x^2)^2 + 1 ≥ 0 := by sorry theorem lean_workbook_617 : 8 / (2 * 2) = 2 := by sorry theorem lean_workbook_618 (a b : ℤ) : a^4 + 4 * b^4 = (a^2 + 2 * b^2 + 2 * a * b) * (a^2 + 2 * b^2 - 2 * a * b) := by sorry theorem lean_workbook_673 (a b : ℝ) : 2 * (a ^ 2 + b ^ 2) ≥ (a + b) ^ 2 := by sorry theorem lean_workbook_676 (a : ℝ) (ha : 0 < a) : (a * (3 * a + 1)) / (a + 1) ^ 2 ≤ (3 / 4 : ℝ) * a + 1 / 4 := by sorry theorem lean_workbook_680 (a b c : ℝ) (ha : a > 0 ∧ b > 0 ∧ c > 0 ∧ a + b + c = 3) : a^4 + b^4 + c^4 ≥ a^3 + b^3 + c^3 := by sorry theorem lean_workbook_689 (x : ℝ) (h : x > 0) : (x + 1) ^ 2 ≥ 4 * x := by sorry theorem lean_workbook_694 (x : ℝ) (h : x + 600 = 1700) : x = 1100 := by sorry theorem lean_workbook_695 (D : Set ℝ) (f : ℝ → ℝ) (hD : IsCompact D) (hf : ContinuousOn f D) : IsCompact (Set.image f D) := by sorry theorem lean_workbook_698 : ((1 + 6) * (2 + 9) * (5 + 8) - (3 + 4) / 7 = 1000) := by sorry theorem lean_workbook_704 : ∑ k in Finset.range 3, (Nat.choose 5 (2 * k + 1)) = 16 := by sorry theorem lean_workbook_722 : ∀ x y z : ℝ, 16 * x ^ 2 + 25 * y ^ 2 + 36 * z ^ 2 ≥ 45 * y * z + 27 * z * x + 5 * x * y := by sorry theorem lean_workbook_742 : (-10:ℤ) + (-4:ℤ) + (2:ℤ) + (8:ℤ) + (14:ℤ) + (20:ℤ) + (26:ℤ) + (32:ℤ) + (38:ℤ) + (44:ℤ) = 170 := by sorry theorem lean_workbook_743 (a b c d : ℤ) : (a^2 + b^2) * (c^2 + d^2) = (a * d + b * c)^2 + (a * c - b * d)^2 := by sorry theorem lean_workbook_749 (m : ℕ) (hm : 0 < m) : ∃ n, m ∣ fib n := by sorry theorem lean_workbook_773 (a b c : ℝ) : (a^2 + 1) * (b^2 + 1) * (c^2 + 1) ≥ (a * b * c - 1)^2 := by sorry theorem lean_workbook_785 (x y : ℝ) (r : ℝ) (hr : r = Real.sqrt (x ^ 2 + y ^ 2)) (hp : -Real.pi < θ ∧ θ ≤ Real.pi) (htr : (x, y) = (r * Real.cos θ, r * Real.sin θ)) : (x, y) = (r * Real.cos θ, r * Real.sin θ) ∧ -Real.pi < θ ∧ θ ≤ Real.pi := by sorry theorem lean_workbook_790 (n : ℕ) (u : ℕ → ℕ) (h₀ : 1 ≤ n) (h₁ : ∀ n, u n ≠ 0) (h₂ : ∑ k in Finset.Icc 1 (n + 1), (1 : ℝ) / (u k * u (k + 1)) = n / (u 1 * u (n + 1))) : ∑ k in Finset.Icc 1 (n + 1), (1 : ℝ) / (u k * u (k + 1)) + (1 : ℝ) / (u (n + 1) * u (n + 2)) = n / (u 1 * u (n + 1)) + (1 : ℝ) / (u (n + 1) * u (n + 2)) := by sorry theorem lean_workbook_795 (a b c : ℝ) : (a + b + c) ^ 3 - a ^ 3 - b ^ 3 - c ^ 3 = 3 * (a + b) * (b + c) * (c + a) := by sorry theorem lean_workbook_811 : ∀ x : ℝ, x^4 + x^2 + 1 = (x^2 - x + 1) * (x^2 + x + 1) := by sorry theorem lean_workbook_837 :∀ a b c x y z p q r : ℝ, p = a^2 + b^2 + c^2 ∧ q = a * b + b * c + c * a ∧ r = (b - a) * (b - c) ∧ x = a^2 + 2 * b * c ∧ y = b^2 + 2 * c * a ∧ z = c^2 + 2 * a * b → x * z = p * (q - r) + r^2 := by sorry theorem lean_workbook_847 (x y : ℝ) : 2 * (x^2 + y^2) ≥ (x + y)^2 := by sorry theorem lean_workbook_849 : Real.cos (π / 2) = 0 := by sorry theorem lean_workbook_874 (a b : ℝ) (hab : a > -1 ∧ b > -1)(h : a^3 + b^3 >= a^2 + b^2) : a^5 + b^5 >= a^2 + b^2 := by sorry theorem lean_workbook_942 : 1 - (25 : ℝ) / 64 = 39 / 64 := by sorry theorem lean_workbook_943 : (1 ^ 251 * 1 * 3) % 8 = 3 := by sorry theorem lean_workbook_944 ∑ k in (Finset.range 10), (2^k) = 2^10 - 1 := by sorry theorem lean_workbook_946 (c d : ℝ) (h₁ : c + d = 7) (h₂ : c * d = 9) : c^3 + d^3 = 154 := by sorry theorem lean_workbook_952 (a : ℝ) : sin (a / 2) * (sin (a / 2) - 1) ≥ -1 / 4 := by sorry theorem lean_workbook_953 : ∀ x y : ℝ, Real.sqrt ((x + 2) ^ 2 + (y + 2) ^ 2) ≥ (x + 2 + y + 2) / Real.sqrt 2 := by sorry theorem lean_workbook_954 (x y z : ℝ) (h : x*y^2 + y*z^2 + z*x^2 = 3) (hx : 0 < x) (hy : 0 < y) (hz : 0 < z) : (x + 7)^(1/3) + (y + 7)^(1/3) + (z + 7)^(1/3) ≤ 6 := by sorry theorem lean_workbook_978 (x y z : ℝ) (h : x + y + z = 3) : x * y + x * z + y * z ≤ 3 := by sorry theorem lean_workbook_982 (x y z: ℝ) : x ^ 2 + y ^ 2 + z ^ 2 ≥ x * y + x * z + y * z := by sorry theorem lean_workbook_987 {a b c d e : ℝ} (h1 : 0 ≤ a ∧ 0 ≤ b ∧ 0 ≤ c ∧ 0 ≤ d ∧ 0 ≤ e) (h2 : a ≤ b ∧ b ≤ c ∧ c ≤ d ∧ d ≤ e) : d + e ≥ c + e ∧ c + e ≥ d + b ∧ d + b ≥ a + c ∧ a + c ≥ a + b ∧ a + b ≥ 0 := by sorry theorem lean_workbook_1001 {a b c d : ℂ} : (a^2 + b^2) * (c^2 + d^2) = (a * d - b * c)^2 + (a * c + b * d)^2 := by sorry theorem lean_workbook_1002 (y : ℝ) (hy : -1 / 2 < y ∧ y < 0) : |y| > |2*y^2| := by sorry theorem lean_workbook_1003 (x y: ℝ): x ^ 2 + x + y ^ 2 + y + 1 ≥ x * y := by sorry theorem lean_workbook_1008 : ∀ x y : ℝ, Real.cos x - Real.cos y = -2 * Real.sin ((x + y) / 2) * Real.sin ((x - y) / 2) := by sorry theorem lean_workbook_1018 (a b c d : ℝ) : 3 * (a^2 + b^2 + c^2 + d^2) - 4 * (a * b + b * c + c * d + d * a) + 2 * (a * c + b * d) ≥ 0 := by sorry theorem lean_workbook_1021 (r : ℝ) (n : ℕ) : ∃ f : ℕ → ℝ, f 1 = r ∧ ∀ k, f k = (2 : ℝ)^(k-1) * f 1 := by sorry theorem lean_workbook_1024 (x : ℝ) (hx : ∃ z : ℤ, x = z) : (Int.floor x)^3 + 2 * x^2 = x^3 + 2 * (Int.floor x)^2 := by sorry theorem lean_workbook_1032 (a b c : ℝ) : (5 / 3) * (a ^ 2 + b ^ 2 + c ^ 2) ^ 2 ≥ 2 * (a ^ 3 * b + b ^ 3 * c + c ^ 3 * a) + 3 * (a * b ^ 3 + b * c ^ 3 + c * a ^ 3) := by sorry theorem lean_workbook_1039 (x : ℝ) (hx : x = (4 - 2 * Real.sqrt 3) / 2) : x = 2 - Real.sqrt 3 := by sorry theorem lean_workbook_1041 : 13 ∣ 2^30 + 3^60 := by sorry theorem lean_workbook_1043 : 2 / 5 * (x + 1) ^ (5 / 2) - 2 / 3 * (x + 1) ^ (3 / 2) = 2 / 3 * x * (x + 1) ^ (3 / 2) - 4 / 15 * (x + 1) ^ (5 / 2) := by sorry theorem lean_workbook_1046 : ∀ a b c : ℝ, (2 * a ^ 2 - c ^ 2) ^ 2 + (2 * b ^ 2 - c ^ 2) ^ 2 ≥ 0 := by sorry theorem lean_workbook_1048 (p : ℕ) (hp : p.Prime) (a : ZMod p) (ha : a ≠ 0) : ∃ b : ZMod p, a * b = 1 := by sorry theorem lean_workbook_1062 (R r : ℝ) : 9 * R ^ 2 - 20 * R * r + 31 * r ^ 2 ≥ 16 * R * r - 5 * r ^ 2 := by sorry theorem lean_workbook_1071 (n : ℕ) : 1998 = 2 * 3^3 * 37 := by sorry theorem lean_workbook_1082 : 2 * Real.sin a * Real.cos b = Real.sin (a + b) + Real.sin (a - b) := by sorry theorem lean_workbook_1096 (a b c : ℝ) : a * b + b * c + c * a - a * b * c ≤ (c^2 + 1) / 2 + (a^2 + b^2) / 2 ↔ a^2 + b^2 + c^2 + 2 * a * b * c + 1 ≥ 2 * (a * b + b * c + c * a) := by sorry theorem lean_workbook_1105 (n : ℕ) (hn : 0 < n) (x_n : ℝ) (hx_n : x_n = (3 + Real.sqrt 5)^n + (3 - Real.sqrt 5)^n) : 2^n ∣ x_n := by sorry theorem lean_workbook_1116 (a b c : ℝ) : a^4 + b^4 + c^4 ≥ a^2 * b^2 + b^2 * c^2 + c^2 * a^2 := by sorry theorem lean_workbook_1122 (n : ℕ) : choose (n + 1) 2 + n + 1 = choose (n + 2) 2 := by sorry theorem lean_workbook_1130 : √((-1 : ℝ) ^ 2) = 1 := by sorry theorem lean_workbook_1155 {a b : ℕ} : Nat.lcm a b = a * b / Nat.gcd a b := by sorry theorem lean_workbook_1160 (a b c d : ℝ) (ha : a + 4 * b + 9 * c + 16 * d = 25) (hb : 4 * a + 9 * b + 16 * c + 25 * d = 36) (hc : 9 * a + 16 * b + 25 * c + 36 * d = 49) : 16 * a + 25 * b + 36 * c + 49 * d = 64 := by sorry ```

The compiler notices some issues in the functions, like choose not being Nat.choose in this theorem:

theorem lean_workbook_1122 (n : ℕ) : choose (n + 1) 2 + n + 1 = choose (n + 2) 2 := by sorry

and sin and cos not being Real.sin and Real.cos, respectively, in this theorem:

theorem lean_workbook_261 : ∀ a b : ℝ, sin a * sin b = 1/2 * (cos (a - b) - cos (a + b)) := by sorry

Secondly, and my main issue, is that proof statements are repeated many times, even when the goal has already been proven. To utilize the proofs, I used this following code:

Code

```python # %% import numpy as np import pandas as pd #Change output file and lean_workbook file as necessary output_file = "/home/mfan/lean_docs_scrape_analysis/leandoc.txt" lean_workbook = "/home/mfan/lean_docs_scrape_analysis/lean_workbook.json" leandataset = pd.read_json(lean_workbook) # %% leandataset # %% largeproofdataset = leandataset[leandataset["proof"].map(len) > 1] # %% largeproofdataset # %% f = open(output_file, 'a', encoding="utf-8") i = 0 for index, row in largeproofdataset.iterrows(): if i >= 100: break print("\n", end="") f.write("\n") print(row["formal_statement"], end="") f.write(row["formal_statement"][:(-5)]) for element in row["proof"]: print("\n " + element, end="") f.write("\n " + element) i += 1 ```

Which is very similar to the previous code, but I removed the "sorry" after the theorem and added these three lines near the end:

for element in row["proof"]:
    print("\n  " + element, end="")
    f.write("\n  " + element)

This gives many theorems (though not with the correct indent, so I hand-added a few indents for some of the code), giving the first theorem below:

theorem lean_workbook_18 :
  (29 * 31 + 37 - 41) % 4 = 3  :=  by
  simp [Nat.mod_lt]
  simp only [Nat.add_mod_left, Nat.mod_eq_zero_of_dvd, dvd_pow]
  simp only [Nat.gcd_add_self_right]
  simp only [Nat.add_mod, Nat.mul_mod, Nat.mod_mod]
  simp only [Nat.mod_add_div]
  exact (by norm_num : (29 * 31 + 37 - 41) % 4 = 3)
  simp only [Nat.mul_comm, Nat.mul_assoc, Nat.mul_left_comm]
  norm_num [Int.add_emod]
  simp only [Nat.mul_mod_right]
  simp only [add_comm]
  simp only [Nat.gcd]
  simp only [Nat.mod_eq_zero_of_dvd, dvd_pow]
  simp only [Nat.mul_mod, Nat.mod_mod]
  simp [Mod.mod]
  simp only [Nat.add_comm, Nat.add_left_comm, Nat.add_assoc, Nat.mul_comm, Nat.mul_left_comm, Nat.mul_assoc, Nat.sub_sub]
  simp only [Nat.add_comm, Nat.add_left_comm]
  simp only [Nat.add_sub_assoc, Nat.mod_self]
  simp only [mod_eq_sub_mod]
  norm_num [Nat.mod_eq_of_lt]
  simp only [Nat.add_mod, Nat.mul_mod, Nat.mod_mod, Nat.mod_eq_zero_of_dvd]
  exact Nat.mod_mod 3 4
  norm_num [Nat.mul_mod, Nat.add_mod, Nat.mod_mod]
  simp only [Nat.mod_mod]
  exact rfl
  simp only [Nat.add_sub_assoc]

But, as we see in this recording, only a few lines (only one proof line!) is needed.

https://github.com/InternLM/InternLM-Math/assets/118628756/9c9f9b75-05da-44e2-b439-45328aeca4c1

Something similar happens for multiple other theorems. Thank you for your time!

objecti0n commented 4 months ago

This is the default header we used:

import Mathlib.Algebra.Algebra.Basic
import Mathlib.Algebra.Order.Floor
import Mathlib.Algebra.Associated
import Mathlib.Algebra.BigOperators.Basic

import Mathlib.Algebra.BigOperators.Pi
import Mathlib.Algebra.GeomSum
import Mathlib.Algebra.Group.Pi.Basic
import Mathlib.Algebra.Group.Pi.Lemmas
import Mathlib.Algebra.Group.Commute.Basic
import Mathlib.Algebra.Group.Commute.Defs
import Mathlib.Algebra.Group.Commute.Hom
import Mathlib.Algebra.Group.Commute.Units
import Mathlib.Algebra.GroupPower.Basic
import Mathlib.Algebra.GroupPower.Identities
import Mathlib.Algebra.Order.Floor
import Mathlib.Algebra.QuadraticDiscriminant
import Mathlib.Algebra.Ring.Basic
import Mathlib.Analysis.MeanInequalitiesPow
import Mathlib.Analysis.Asymptotics.AsymptoticEquivalent
import Mathlib.Analysis.NormedSpace.Basic
import Mathlib.Analysis.SpecialFunctions.Log.Basic
import Mathlib.Analysis.SpecialFunctions.Log.Base
import Mathlib.Combinatorics.SimpleGraph.Basic
import Mathlib.Data.Complex.Basic
import Mathlib.Data.Complex.Exponential
import Mathlib.Data.Finset.Basic
import Mathlib.Data.Fintype.Card

import Mathlib.Data.Int.GCD
import Mathlib.Data.Int.ModEq
import Mathlib.Data.Int.Parity
import Mathlib.Data.List.Intervals
import Mathlib.Data.List.Palindrome
import Mathlib.Data.Multiset.Basic

import Mathlib.Data.Nat.Choose.Basic
import Mathlib.Data.Nat.Digits
import Mathlib.Data.Nat.Factorial.Basic
import Mathlib.Data.Nat.Fib.Basic
import Mathlib.Data.Nat.Fib.Zeckendorf
import Mathlib.Data.Nat.ModEq
import Mathlib.Data.Nat.Multiplicity
import Mathlib.Data.Nat.Parity
import Mathlib.Data.Nat.Prime
import Mathlib.Data.PNat.Basic
import Mathlib.Data.PNat.Prime

import Mathlib.Data.Rat.BigOperators
import Mathlib.Data.Rat.Cast.CharZero
import Mathlib.Data.Rat.Cast.Defs
import Mathlib.Data.Rat.Cast.Lemmas
import Mathlib.Data.Rat.Cast.Order
import Mathlib.Data.Rat.Defs
import Mathlib.Data.Rat.Denumerable
import Mathlib.Data.Rat.Encodable
import Mathlib.Data.Rat.Field
import Mathlib.Data.Rat.Floor
import Mathlib.Data.Rat.Init
import Mathlib.Data.Rat.Lemmas
import Mathlib.Data.Rat.Order
import Mathlib.Data.Rat.Sqrt

import Mathlib.Data.Rat.Star
import Mathlib.Data.Real.Basic
import Mathlib.Data.ENNReal.Basic
import Mathlib.Data.ENNReal.Inv
import Mathlib.Data.ENNReal.Operations
import Mathlib.Data.ENNReal.Real
import Mathlib.Data.Real.ENatENNReal
import Mathlib.Data.Real.Irrational
import Mathlib.Data.Real.NNReal
import Mathlib.Data.Real.Sqrt

import Mathlib.Data.Set.Finite
import Mathlib.Data.Sym.Sym2
import Mathlib.Data.ZMod.Basic
import Mathlib.Dynamics.FixedPoints.Basic

import Mathlib.LinearAlgebra.AffineSpace.AffineMap
import Mathlib.LinearAlgebra.AffineSpace.Independent
import Mathlib.LinearAlgebra.AffineSpace.Ordered
import Mathlib.LinearAlgebra.FiniteDimensional
import Mathlib.Logic.Equiv.Basic

import Mathlib.Order.Filter.Basic

import Mathlib.Order.WellFounded
import Mathlib.Topology.Basic
import Mathlib.Topology.Instances.NNReal

import Mathlib.Data.ZMod.Basic
import Mathlib.RingTheory.Int.Basic

import Aesop
import ProofWidgets
open BigOperators

open Nat

open Real

open Rat

For the second problem, I think you misunderstand our dataset. We provide multiple possible proofs. Each line is one kind of proof.

realharryhero commented 4 months ago

That explains the \n s in each list item; thank you very much!

hsz0403 commented 3 months ago

I have a question here, I see almost all theorems are ended by sorry, then how does this lean code in dataset improves the model's performance? Or does this dataset is used to just train to translate natural language questions to lean statements without proofs?

objecti0n commented 3 months ago

I have a question here, I see almost all theorems are ended by sorry, then how does this lean code in dataset improves the model's performance? Or does this dataset is used to just train to translate natural language questions to lean statements without proofs?

    {
        "natural_language_statement": "Prove $a^2+b^2+c^2\\ge ab+bc+ca$ using the Trivial Inequality $(a-b)^2+(b-c)^2+(c-a)^2\\ge 0$.",
        "answer": "",
        "tags": [
            "inequality",
            "number_theory",
            "algebra"
        ],
        "formal_statement": "theorem lean_workbook_31 (a b c : ℝ) : a^2 + b^2 + c^2 ≥ a * b + b * c + c * a  :=  by sorry",
        "split": "lean_workbook",
        "proof": [
            "ring_nf\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "simp [sq, sub_eq_add_neg, add_assoc, add_left_comm, add_comm, mul_comm, mul_assoc, mul_left_comm]\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "simp [sq]\nnlinarith [sq_nonneg (a - b), sq_nonneg (b - c)]",
            "simp [sq, sub_eq_add_neg]\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "have h₁ : 0 ≤ (a - b)^2 + (b - c)^2 + (c - a)^2 := by nlinarith\nlinarith",
            "have : 0 ≤ (a - b)^2 + (b - c)^2 + (c - a)^2 := by nlinarith\nlinarith",
            "have h₁ : 0 ≤ (a - b)^2 + (b - c)^2 + (c - a)^2 := by positivity\nlinarith",
            "nlinarith [pow_two_nonneg (a - b), pow_two_nonneg (b - c), pow_two_nonneg (c - a)]",
            "have h₁ := sq_nonneg (a - b)\nhave h₂ := sq_nonneg (b - c)\nhave h₃ := sq_nonneg (c - a)\nlinarith",
            "linarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "simp [add_comm]\nnlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (a - c)]",
            "ring_nf\nsimp [sq]\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "rw [sq]\nnlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "simp [sq]\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (a - c)]",
            "have h1 : 0 ≤ (a - b)^2 + (b - c)^2 + (a - c)^2 := by positivity\nlinarith",
            "ring_nf\nnlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (a - c)]",
            "have := sq_nonneg (a - b + c)\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "have h0 := sq_nonneg (a - b)\nhave h1 := sq_nonneg (b - c)\nhave h2 := sq_nonneg (a - c)\nlinarith [h0, h1, h2]",
            "simp [sq, mul_comm, mul_assoc, mul_left_comm]\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "linarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (a - c)]",
            "have : (a - b) ^ 2 + (b - c) ^ 2 + (c - a) ^ 2 ≥ 0 := by nlinarith\nlinarith",
            "simp [sq, add_comm, add_left_comm, add_assoc]\nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a)]",
            "have h : 0 ≤ (a - b)^2 + (b - c)^2 + (c - a)^2 := by nlinarith\nlinarith"
        ]
    }

Proofs are in proof.

hsz0403 commented 3 months ago

no, in your dataset , most of proofs (127391 of 140214) are none. So how you use the dataset for improving math ability?

objecti0n commented 3 months ago

no, in your dataset , most of proofs (127391 of 140214) are none. So how you use the dataset for improving math ability?

Yes, we have provided approximately 10% of proof in this dataset. This amount is much larger than any existing formalized math problem dataset. For left of them, our model has not found proof yet. And actually, for most RL method, they use expert iteration to improve formal performance, using problem only can also help them a lot.

hsz0403 commented 3 months ago

Thanks a lot! you said

our model has not found proof yet

Your paper said that proofs are searched by InternLM-Math-Plus(I hope I understand it right), but looks like your only tried best first search in your code. Have you tried other search methods?

Another question I have is, you use Qwen-1.5-14B-Chat to extract data and do NLI. What is the translation model that be finetuned?

objecti0n commented 3 months ago

Thanks a lot! you said

our model has not found proof yet

Your paper said that proofs are searched by InternLM-Math-Plus(I hope I understand it right), but looks like your only tried best first search in your code. Have you tried other search methods?

Another question I have is, you use Qwen-1.5-14B-Chat to extract data and do NLI. What is the translation model that be finetuned?

In Lean-workbook, we use whole proof search by our model like Deepseek-prover. We do not release our whole proof search code since we are still improving the whole proof ability. Translation model is fine-tuned on InternLM-Math-Plus-20B.

hsz0403 commented 3 months ago

Thanks a lot! so in Lean-workbook, you use whole proof search by your model InternLM-Math-Plus-20B or MoE? Can you release the code for your current lean-workbook search results? So that we can also improve the whole proof ability.

objecti0n commented 3 months ago

Thanks a lot! so in Lean-workbook, you use whole proof search by your model InternLM-Math-Plus-20B or MoE? Can you release the code for your current lean-workbook search results? So that we can also improve the whole proof ability.

We use fine-tuned InternLM-Math-Plus-20B for whole proof searching, our whole proof search model is not ready for published yet. An alternative solution is using proof from our provided Lean-Workbook to train a whole proof search model for now.