Closed ahelwer closed 1 month ago
There are terabytes of program code posted on the web. It would be useful to do an unbiased sampling of it and see what fraction uses non-ascii characters.
Leslie
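A minimal sketch of how such a measurement could be made, assuming the sample has already been collected into a local directory. The directory layout, the `*.py` glob, and every function name here are hypothetical; drawing a genuinely unbiased sample is the hard part and is not addressed. The check is also coarse: it flags a file if any byte is outside 7-bit ASCII, so non-ASCII text in comments and string literals counts the same as non-ASCII identifiers.

```python
from pathlib import Path
from typing import Iterable, Iterator


def has_non_ascii(data: bytes) -> bool:
    """Return True if the raw file contents contain any non-ASCII byte."""
    return not data.isascii()


def fraction_non_ascii(blobs: Iterable[bytes]) -> float:
    """Fraction of the given file contents that contain a non-ASCII byte."""
    total = 0
    hits = 0
    for data in blobs:
        total += 1
        if has_non_ascii(data):
            hits += 1
    # An empty sample has no meaningful fraction; report 0.0 by convention.
    return hits / total if total else 0.0


def read_sample(root: str, pattern: str = "*.py") -> Iterator[bytes]:
    """Yield raw contents of files under `root` (a hypothetical sample directory)."""
    for path in Path(root).rglob(pattern):
        if path.is_file():
            yield path.read_bytes()
```

Something like `fraction_non_ascii(read_sample("sampled_code"))` would then give the fraction for one language; separating identifiers from comments and strings would need a per-language tokenizer.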
For the sake of completeness: Adding Unicode support in strings requires minor changes in the fingerprinting code of strings. See https://github.com/tlaplus/tlaplus/pull/685#discussion_r742118597
I spoke with some Chinese- and Japanese-language users, who said they don't often use their language's characters in function or variable names, for whatever reason. It would still be nice to support Unicode strings, for example for printing error messages with assert.
In modern theorem provers and dependently typed languages, Unicode math and symbol characters are used extensively. The result is much cleaner to read.
This has been completed, although the set of valid identifiers was not extended to Greek characters. Symbols for the Nat, Int, and Real sets were added, however.
TLA+ Unicode Support Proposal
Motivation
TLA+ specifications can be translated into a "pretty-printed" form with LaTeX, but this is not how developers experience them when writing a spec. Within the past decade, UTF-8 has become so widely supported that any program limited to ASCII can be seen as deficient. Supporting Unicode in TLA+ provides two main benefits:
Proposed Changes
id == ...
\name
symbols in identifiers, for example to indicate Greek alphabet characters
Challenges
Required Work
⇸ and ⥅ for -+->)
\sigma and σ)
Prior Work & Discussion