CordyJ / Open-TuringPlus

Open Turing+ programming language compiler
Other
8 stars 2 forks source link

Issue with @, malloc, and type inference #11

Closed dnewhall closed 4 months ago

dnewhall commented 4 months ago

I'm trying to port some Object Oriented Turing code to Turing Plus (I want to run it on UNIX vs. Windows). It does some...let's call it "creative" things to circumvent the type system so as to implement a form of polymorphism with record types.

Turing Plus is surprisingly much harder to use than OOT because you are required to always use collections with pointers whereas you can simply do new recType or var p : pointer to recType in OOT without specifying a collection.

To get around this I thought, "Oh, since I can access the entire C stdlib without problem in Turing+, I'll just use malloc!"

My first pass at the code was roughly this:

external function malloc(s : int) : addressint

type Point :
record
    typeId : int
    x : int
    y : int
end record

var p := malloc(size(Point))
Point @ (p).x := 5
put Point @ (p).x

But when I ran the code, I got a segfault and couldn't figure out why:

$ tpc malloctest.t && ./malloctest.x
clang: warning: argument unused during compilation: '-ansi' [-Wunused-command-line-argument]
28524592

Line 11 of malloctest.t: Segmentation violation.

However, it becomes fixable if I make sure to specify again that p is an addressint:

var p : addressint := malloc(size(Point))

Then it works:

$ tpc malloctest.t && ./malloctest.x
clang: warning: argument unused during compilation: '-ansi' [-Wunused-command-line-argument]
19103792
5
CordyJ commented 4 months ago

That's because you didn't give a type for var p but you assigned it a numeric expression, so according to the type rules of Turing, it defaults to type int. You meant it to be addressint, and on 64-bit architectures, it matters.

dnewhall commented 4 months ago

Why would an expression whose type is declared as addressint be considered a numeric expression/another type in this case? If it was a mathematical expression, I could see it defaulting to int or something, but this was an assignment from a function call whose return type is explicitly declared as addressint.

CordyJ commented 4 months ago

Turing Plus is surprisingly much harder to use than OOT because you are required to always use collections with pointers whereas you can simply do new recType or var p : pointer to recType in OOT without specifying a collection.

Turing+ and OOT are unrelated, other than the fact that both are extensions to the original Turing programming language.

Unlike OOT, Turing+ is not designed to be an object-oriented, teaching or general purpose programming language. It's more than a decade older than OOT, and it was designed to be used to write operating systems, compilers, interpreters and other systems software as a replacement for C. The original implementation of OOT was written in Turing+.

Both Turing and Turing+ were designed for programs to be formally verified using the Turing proof rules in Hoare logic, which is why they (originally) disallowed aliasing and constrained pointers to collections. The addressint and Type@ notation was added to Turing+ to make it easier to code direct access to hardware in low-level drivers; it was not originally intended to be used for general programming.

CordyJ commented 4 months ago

Why would an expression whose type is declared as addressint be considered a numeric expression/another type in this case? If it was a mathematical expression, I could see it defaulting to int or something, but this was an assignment from a function call whose return type is explicitly declared as addressint.

You'd have to read the formal definition of Turing (https://www.amazon.com/exec/obidos/ASIN/0139331360/acmorg-20) for the type inference rules, but the short answer is that when Turing was designed 64-bit machines did not exist. int was the largest and most inclusive integer type, so all integer expressions up-typed to int in order to lose no significance.

Turing+ inherited all the rules of Turing because it's an extension, not a modification, of Turing, and if you write a Turing program in Turing+ it has to do exactly the same thing as it does in Turing. When Turing+ was designed and addressint was added, 64-bit machines still did not exist, so it still didn't matter much.

We're not designing a language here, we're just implementing a compiler for it. If we were to redesign Turing+ now, addressint would have to have its own, separate typing rules. But even so, it's not clear how to keep the language consistent with Turing's type rules at the same time. For example, what types are these?

var diskreg := 16#0080
var keyreg := 2564
CordyJ commented 4 months ago

Having said all that, it would be good to fix this in OpenT+. But we need to be clear on what the inference rules are. The original case is easy; we just need to define all the operator rules for addressint interactions with other types and see if they can be implemented consistently with the Turing rules for the other numeric types.

dnewhall commented 3 months ago

I actually have a copy of that book :)

I guess it does match the semantics seen elsewhere:

var someVar := 0
fcn foo : addressint  result addr(someVar)  end foo

var adr1 := foo
int @ (adr1) := 12
put adr1, ": ", int @ (adr1)

That also fails with a segfault at line 4 if adr1 isn't declared as an addressint.