pre-srfi / static-scheme

11 stars 0 forks source link

Reader #3

Open eraserhd opened 3 years ago

eraserhd commented 3 years ago

I've thought about this sort of thing a few times before, but I always get stuck at this question:

What is the type of read? Or an alternate phrasing: what is the type of source code? Or another alternate phrasing: how can we have both homoiconicity and strong typing? (And if we can't, why keep the parens?)

To be clear, we could have:

type Datum = Nil | Cons Datum Datum | Integer Int | ...

This doesn't seem open to extension, and seems overly general.

mnieper commented 3 years ago

The standard Scheme reader is also not extensible, so having a fixed type wouldn't be too restrictive. What the type you give doesn't allow are cyclic data structures that are supported in (R7RS) Scheme through datum labels (see SRFI 38). For this, we need a type that is able handle graphs and not just trees.

read is, of course, a good gauge for a useful type system.

(Maybe I haven't given a satisfying answer, so please comment.)

johnwcowan commented 3 years ago

In my current proposal, read is a Scheme procedure, and therefore has no inherent type. When called from Steme, the Steme programmer will have to say what type it has (as with any Scheme-only procedure). Indeed, it is one of the good arguments for allowing type declarations that are local to Steme bodies.

If read is not to be precisely defined, then a type Datum would indeed be plausible, referencing exactly what the local Scheme reader can read: this type would be primitive and not defined in Steme.

Macro processing is done before typing, so source code as such doesn't need a type.

mnieper commented 3 years ago

You may want to write procedural macros in statically typed code. For that, source code would need a type, which could include a datum together with a number of related procedures. I don't yet see why the datum type would have to be primitive.

johnwcowan commented 3 years ago

Datum has to be defined by the implementation, because only the implementation knows which types belong to it. Furthermore, are we going to allow objects with arbitrary nesting depths to be typed?

Fortunately, datum labels can only appear in literals, not in arbitrary code, per R7RS-small.

mnieper commented 3 years ago

For portable programs, the datum type has to be definite. Otherwise, I cannot write down a case/match construct that is guaranteed exhaustive (which may be checked by the type checker). (It's not too different in R7RS programs: The read procedure returns objects of a number of different types by the specification. If some implementation extends the read procedure by returning non-standard datums, valid programs may suddenly crash on certain input data where they would have expected a read error instead.)

What do you mean by objects with arbitrary nesting depths? I would have viewed lists as such. And we definitely want to be able to give (homogeneous) lists a type. So algebraic data types have to allow some kind of recursion: A list of type a is either nil or the cons of an object of type a and a list of type a.

johnwcowan commented 3 years ago

Because read is impure, it can't be called from Steme, so I'm closing this issue. Reopen if you think I shouldn't have.

mnieper commented 3 years ago

The original question isn't related to the impurity of read. It could also have been phrased as: What is the type of read-string, where read-string takes a string and parses it as a datum. (Apart from that, by threading a world state object through read, it could also be made pure.)

I am therefore reopening the issue.