xrchz opened this issue 7 years ago
Maybe #666 will help?
Currently, my plan with #666 is to do the following w.r.t. the HOL string type in the lexer:
Thus this does not avoid the `explode` on input to the lexer. I agree that the `explode` is untidy.
Ideally, we would phrase the lexer as a function that is handed to a fold-function over a finite character stream. A stateful implementation of the fold would then read characters lazily and avoid producing the giant string containing all of the input. I guess that's what #645 is aiming for.
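For concreteness, here is a minimal OCaml sketch of that shape (not HOL or CakeML source; `fold_chars`, `lexer_state` and the whitespace tokenizer are made up for illustration): the fold reads characters from a channel on demand, and the lexer is just a step function plus its state.

```ocaml
(* Hypothetical sketch, not CakeML/HOL code.  The fold reads characters from a
   channel on demand; the input is never materialised as one big string. *)
let fold_chars (ic : in_channel) (step : 'acc -> char -> 'acc) (init : 'acc) : 'acc =
  let rec go acc =
    match input_char ic with
    | c -> go (step acc c)
    | exception End_of_file -> acc
  in
  go init

(* A toy lexer phrased against the fold: its state is a partially-built word
   plus the tokens recognised so far. *)
type lexer_state = { current : Buffer.t; tokens : string list }

let lex_step (st : lexer_state) (c : char) : lexer_state =
  if c = ' ' || c = '\n' then begin
    if Buffer.length st.current = 0 then st
    else begin
      let tok = Buffer.contents st.current in
      Buffer.clear st.current;
      { st with tokens = tok :: st.tokens }
    end
  end
  else begin
    Buffer.add_char st.current c;
    st
  end

(* Drive the lexer over a channel; the trailing '\n' flushes the last word. *)
let lex_channel (ic : in_channel) : string list =
  let st = fold_chars ic lex_step { current = Buffer.create 16; tokens = [] } in
  List.rev (lex_step st '\n').tokens
```

The point is only the interface: the driver owns the I/O, and the lexer never sees the input as a single string or `char list`.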
You could define a type `'a finiteUnfolder`, that was a subset of the type `'a -> (char # 'a) option` such that `LFINITE (LUNFOLD f s)` held for all `s`.
I.e., the wellformedness predicate would be

```
wfUF (f : 'a -> (char # 'a) option) <=> !a. LFINITE (LUNFOLD f a)
```
It'd be easy to show that a "string reader" which used the index as the state was such a thing. The finiteness guarantee in the type would mean that the lexer could be parameterised by an arbitrary unfolder. The question is still how you'd guarantee that an input file being read through `TextIO` would necessarily be finite. (The completely ad hoc approach of cutting off past a limit of 100 MB would probably be fine: i.e., you could prove something like `wfUF (limiter n f)` where `n` is an arbitrary number and `limiter` is effectively just `LTAKE`.)
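As a rough picture of the same idea outside HOL, here is an OCaml analogue (the names `string_reader`, `limiter` and `unfold_to_list` are hypothetical; the real statement would of course be phrased with `LUNFOLD`, `LFINITE` and `LTAKE` as above):

```ocaml
(* OCaml analogue of the unfolder idea: each step yields at most one
   character together with a successor state. *)
type 'a unfolder = 'a -> (char * 'a) option

(* The "string reader" whose state is just an index: finiteness is obvious
   because the index is bounded by the string length. *)
let string_reader (s : string) : int unfolder =
  fun i -> if i < String.length s then Some (s.[i], i + 1) else None

(* The ad hoc cutoff: wrap any unfolder so it produces at most n characters,
   playing the role LTAKE would play in the statement wfUF (limiter n f). *)
let limiter (n : int) (f : 'a unfolder) : ('a * int) unfolder =
  fun (st, k) ->
    if k >= n then None
    else
      match f st with
      | None -> None
      | Some (c, st') -> Some (c, (st', k + 1))

(* Force an unfolder into a list; terminates for any limiter-wrapped source. *)
let rec unfold_to_list (f : 'a unfolder) (st : 'a) : char list =
  match f st with
  | None -> []
  | Some (c, st') -> c :: unfold_to_list f st'
```

For example, `unfold_to_list (limiter 5 (string_reader "hello world")) (0, 0)` yields `['h'; 'e'; 'l'; 'l'; 'o']`; the wrapper caps the number of characters produced regardless of the underlying unfolder.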
The CakeML lexer operates on a `char list`. Therefore the first thing the CakeML compiler does after reading in its input string is `explode` the string into a list. This is wasteful: I think the lexer can easily be made to work on `mlstring`s.

This issue is to make the lexer operate on `mlstring`s and update the compiler accordingly to avoid the `explode`.
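To illustrate the intended change, a minimal OCaml sketch (not CakeML source; `read_word` is a hypothetical helper, not part of the basis `mlstring` API): instead of exploding the input up front, the lexer can keep an index into the string and cut out substrings for its tokens.

```ocaml
(* What the compiler currently does, conceptually: one list cell per character. *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)

(* Index-based alternative: read one whitespace-separated word starting at
   position i, returning the word and the position just past it. *)
let read_word (s : string) (i : int) : (string * int) option =
  let n = String.length s in
  let rec skip j = if j < n && s.[j] = ' ' then skip (j + 1) else j in
  let start = skip i in
  let rec stop j = if j < n && s.[j] <> ' ' then stop (j + 1) else j in
  let fin = stop start in
  if fin = start then None else Some (String.sub s start (fin - start), fin)
```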
Extension thoughts: the `sexp`-parsing option would still need to `explode` the input because the `sexp` parser operates on `char list`s in a possibly more unavoidable way. (This issue does not require avoiding the `explode` for `sexp`s.)