Open matil019 opened 4 years ago
This would probably be a good thing, except for the extra indirections that are introduced. But I guess the refactoring is a mess.
Thank you.
The indirection degrades the performance a little. Comparing the running time of make clean compiler1
with fregec.jar
of master
and that of the refactored one, the latter took 0.9% more time on my machine. I hope the additional safety provided by this change enables more aggressive optimization (by programmers) for overall better performance.
The refactoring is indeed huge, but fixing compilation errors, the most trivial and tedious part, is already done. I tried to minimize the diff by changing only where needed. I'm examining remaining pieces of code that requires understanding of semantics.
This is a discussion about the internal of the compiler, not the syntax of the language.
Currently, the compiler is implemented by heavy uses of a combination of a sum type and record syntax, or so-called sum-record types, which are in the form of:
the examples are
DefinitionS
,SymbolT
andQName
.Sum-records are unsafe constructs because it encourages writing partial functions.
SumRec.bar
in the example above is partial because it throws a runtime error if a value ofSumRec
is not constructed withBar
.Frege comes with a handy syntax
SumRec.{bar?}
which queries the existence of the field and returns aBool
. However, I find that it makes difficult to manage the codebase because it introduces an "implicit contract"; the existence of a particular field, which is invisible to its type so not checkable by a compiler, gains some meaning.I stumbled across this problem when I (quite a while ago) tried adding type families to Frege. I figured that I should add two data constructors
SymTF
for a type family andSymTI
for a type instance toSymbolT
but ended up getting a lot of runtime errors by accessing missing fields.The compiler has a lot of unchecked access to partial fields and redundant error cases (see the links for examples). Currently the compiler doesn't trigger any runtime errors, which are good thing, and I believe the author(s) at the time of writing were very well aware that erroneous field accesses never happened. But that fact is not visible in the source code and becomes a problem if someone (including me) who doesn't know those invisible contracts tries to modify the code, he or she can unknowingly break the compiler by introducing runtime errors.
It's true that one can infer the possible constructors by reading through related parts of the codebase, but it shouldn't be necessary. It's especially hard to track a pair of
enter
/changeSym
andGlobal.findit
. With the help of the type system of Frege, much of (if not all) that effort would be eliminated.So I intend to refactor the compiler not to use sum-record types.
SymbolT
for example will be written to:Thanks to Frege's
protected
constructors, the sum type constructors can have concise names.Please see my branches (no-sum-definitions, no-sum-symbolt) for the current refactoring work. The code is deliberately written in a way so that compilation warnings for unsafe field acceses are emitted as many as possible.