leanprover / lean4

Lean 4 programming language and theorem prover
https://lean-lang.org
Apache License 2.0
4.74k stars 427 forks source link

RFC: Visual Separators in Number Literals #6199

Open Command-Master opened 1 day ago

Command-Master commented 1 day ago

Proposal

Many languages (Java, Python, Perl, Ruby, Julia, Ada, JavaScript, PHP, D, Elixir, C#, C++, Rust, Haskell etc.) support using some character as a visual separator in number literals, mostly _, except C++ which uses '. This suggestion is for this behavior in Lean.

Community Feedback

Zulip discussion

Impact

Add :+1: to issues you consider important. If others benefit from the changes in this proposal being added, please ask them to add :+1: to it.

kmill commented 1 day ago

I looked at a few languages for their lexical syntaxes, and there are broadly two conventions that use _. First, the "Ada convention" (used by Ada, Python, and Javascript) is that the sequence digits in the literal have the following lexical syntax:

numeral := digit many("_"? digit)

Second, the "Rust convention" (used by Rust, Java, and C#):

numeral := digit many(digit <|> "_")

(For other bases, Rust allows _ at the beginning, like 0b_11_00.) Java and C# require that the numeral end in a digit.

I don't have much of a preference over whether it should be allowed to have sequences of _'s, but I think we should adopt a lexical syntax that satisfies the following properties:

kmill commented 23 hours ago

I made a reference implementation of the RFC at #6204 and I'll leave it to the triage team to decide whether to accept it.

edwardzcn-decade commented 21 hours ago

Add lexical structure information for similar proof tools.

Coq: Uses "Rust conventions" and allows continuing underscores. Isabelle: I couldn’t find a similar lexical allowance for separators. This requires further verification.