Closed DavePearce closed 6 years ago
An interesting question is what the complete set of permitted coercions looks like.
Current list of outstanding issues:
Current problems I'm pondering:
Nominal Types. Getting a StackOverflowException on lots of tests ... why?
Runtime Type tests. These need to be fleshed out (any, open records, constrained types). This is probably causing most tests to fail at the moment.
Back propagation. This is necessary to determine where implicit coercions are required. At the same time, it's causing some complexity due to the presence of intersection types.
Ideas:
The any type. What is this needed for anyway? Removing it would leave open records as the only infinitely sized types. It would also majorly help the C backend.
These thoughts come as I'm knee deep in redeveloping both the Java and JavaScript backends.
Problem Statement
The basic issue is that a lot of the same problems recur across the different backends for Whiley. In principle, I don't mind some repetition. But some of these problems are complex and really deserve to be solved once, properly. A rough breakdown of the various components required by a backend is:
Statements and Expressions. In general, the translation of statements and expressions is straightforward and usually platform dependent. There's not much we can do here.
Data Representation. The representation of data is also relatively platform dependent. For example, the range of fixed-width integers supported. Likewise, whether we have true structs (JavaScript, C) or are stuck with classes and objects (Java). Some of this has implications for the treatment of value semantics (see below).
Runtime Coercions. The point at which coercions must actually be inserted is, in principle, a platform dependent issue. However, identifying when there is a change of representation is platform independent. For example, changing between different fixed-width integer types. Or, changing from a readable array type to an actual physical type. Also, deciding how to implement a coercion is a complex issue.
Runtime Type Tests. These are mostly platform independent and are a big source of complexity in backends. The platform determines whether any runtime type information is available and, if so, what the primitive operations are (i.e. tests for atomic types, such as int). In some platforms (e.g. JavaScript), there is always atomic type information available. In others (e.g. Java), there is sometimes atomic type information available (i.e. not for primitives). Finally, yet others (e.g. C) have no information available. A secondary issue is the potential for using type tags to handle finite types. This requires complex reasoning about types. For example, what is the finite set of tags for (int|null)&!nat? See RFC for more.
Runtime Assertion Checks. The issues here are platform independent because they can generally be expressed using simple imperative constructs. For example, introducing shadow variables is done in a pretty similar fashion across backends (copy into dummy variables). Likewise, the insertion of type invariant tests is pretty generic (i.e. insert assert statements and/or additional helper methods).
Value Semantics. The main issue here is where and when we need to clone values. Some of this decision process is platform dependent. For example, which things actually need to be copied (i.e. which are references, which are immutable, etc). However, much of this is not and is already encoded in MoveAnalysis.
Function/Method Overloading. The main issue here is that no backend will support overloading out-of-the-box. We need some mechanism for type mangling and this is likely to be the same across platforms.
Multiple Returns. The handling of multiple returns is a relatively minor, though commonly occurring, problem. The general approach is either to use an array (Java, JavaScript) or a struct (C).
Comment Preservation. The general approach to the preservation of comments and other non-semantic information from the original Whiley source file would appear to be largely platform independent.
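To make the type mangling point above a bit more concrete, here is a minimal sketch in Java of one possible scheme. It is purely illustrative: the class Mangler, its method names, and the encoding itself (length-prefixed type strings, with non-alphanumeric characters escaped as an underscore plus four hex digits) are assumptions made for this example and are not taken from any existing Whiley backend. The output only uses characters that are legal in Java, JavaScript and C identifiers, which is the property a shared mangler would need. It does not try to rule out clashes with source names that already look like mangled names.

```java
import java.util.List;

// Hypothetical mangling scheme for illustration only; not the one used by Whiley.
final class Mangler {
    private Mangler() {}

    // Keep ASCII letters and digits; escape everything else as "_" + 4 hex digits.
    static String encodeType(String type) {
        StringBuilder sb = new StringBuilder();
        for (char c : type.toCharArray()) {
            boolean plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                         || (c >= '0' && c <= '9');
            if (plain) {
                sb.append(c);
            } else {
                sb.append(String.format("_%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // Append each parameter type as "_" + <encoded length> + <encoded type>.
    static String mangle(String name, List<String> paramTypes) {
        StringBuilder sb = new StringBuilder(name);
        for (String t : paramTypes) {
            String enc = encodeType(t);
            sb.append('_').append(enc.length()).append(enc);
        }
        return sb.toString();
    }
}
```

With this sketch, two overloads f(int) and f(int|null) would be emitted under the distinct names f_3int and f_12int_007cnull, so a target without overloading can still tell them apart.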
Options
The obvious option is to provide some form of "intermediate language" on which the different backends can build. Given that I already had such a thing, it would be good to avoid the same mistakes. Some thoughts:
Realistically, it seems that some kind of imperative language with generic structs, taggable unions and explicit coercions makes a lot of sense. It should ideally provide mechanisms for preserving comments. In many ways, it's not that far from C or the JavaFile AST we already have, and is similar to thoughts I was having there. Potentially, we can generate Java/JavaScript/C directly from it rather than having yet another intermediate form (i.e. like JavaFile).
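As a rough illustration of what such an intermediate form might look like, here is a small Java sketch covering two of the ingredients mentioned: taggable unions and explicit coercions. Every name in it (IrType, IntType, TaggedUnion, IrExpr, Var, Coerce, Coercions) is hypothetical and invented for this example; nothing here reflects an existing Whiley API. Structs and comment preservation are left out to keep it short.

```java
import java.util.List;

// Hypothetical IR node types; none of these names come from the Whiley code base.
interface IrType {}

// Fixed-width integer; the width is a backend-level representation choice.
final class IntType implements IrType {
    final int bits;
    IntType(int bits) { this.bits = bits; }
    @Override public boolean equals(Object o) {
        return o instanceof IntType && ((IntType) o).bits == bits;
    }
    @Override public int hashCode() { return bits; }
    @Override public String toString() { return "i" + bits; }
}

// Union whose cases are distinguished by a tag: the index of the case.
final class TaggedUnion implements IrType {
    final List<IrType> cases;
    TaggedUnion(List<IrType> cases) { this.cases = List.copyOf(cases); }
    // Returns the tag for a case, or -1 if the type is not a case of this union.
    int tagOf(IrType t) { return cases.indexOf(t); }
    @Override public boolean equals(Object o) {
        return o instanceof TaggedUnion && ((TaggedUnion) o).cases.equals(cases);
    }
    @Override public int hashCode() { return cases.hashCode(); }
    @Override public String toString() { return "union" + cases; }
}

interface IrExpr { IrType type(); }

final class Var implements IrExpr {
    final String name;
    final IrType type;
    Var(String name, IrType type) { this.name = name; this.type = type; }
    @Override public IrType type() { return type; }
    @Override public String toString() { return name; }
}

// Explicit coercion node: marks a change of representation in the IR itself.
final class Coerce implements IrExpr {
    final IrExpr operand;
    final IrType target;
    Coerce(IrExpr operand, IrType target) { this.operand = operand; this.target = target; }
    @Override public IrType type() { return target; }
    @Override public String toString() { return "(" + target + ") " + operand; }
}

final class Coercions {
    private Coercions() {}
    // Platform-independent part: wrap only when the representation differs.
    static IrExpr coerceIfNeeded(IrExpr e, IrType target) {
        return e.type().equals(target) ? e : new Coerce(e, target);
    }
}
```

The idea in this sketch is that the shared front half decides where a Coerce node is needed, and each backend only decides how to print it (a cast in Java or C, perhaps nothing at all in JavaScript).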