vnmakarov / mir

A lightweight JIT compiler based on MIR (Medium Internal Representation) and C11 JIT compiler and interpreter based on MIR
MIT License

Suggestion for api design #5

Closed (dibyendumajumdar closed 4 years ago)

dibyendumajumdar commented 5 years ago

One of the good ideas in LLVM, and also in NanoJIT, is that all API calls return pointers and take pointer arguments, so a front-end can use a single simple scalar type to hold instructions, operands, types, etc.

I see that MIR has several different types: op, item, insn, module, etc. I would suggest not exposing these at the API level; instead, the API can use opaque pointer types. This would also give MIR more flexibility in future design, as users would not need to know the internal structure of these values.

This matters from a front-end point of view: in dmr_C I have multiple backends. The front-end does not know about the backends, so having to deal with concrete backend types is a problem.

It does appear that MIR already mostly uses pointer types, except in a few cases?
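To make the opaque-pointer suggestion above concrete, here is a minimal sketch in C. Every name in it (`jit_op_ref`, `jit_insn_ref`, `jit_new_int_op`, and so on) is hypothetical and chosen for illustration; none of it is MIR's, LLVM's, or NanoJIT's actual API. The point is only that the caller sees pointer-sized handles while the struct layouts stay private to the library.

```c
#include <assert.h>
#include <stdlib.h>

/* Public part: what a header could expose. All names are hypothetical. */
typedef struct jit_op *jit_op_ref;     /* opaque handle: layout is hidden */
typedef struct jit_insn *jit_insn_ref; /* opaque handle: layout is hidden */

jit_op_ref jit_new_int_op(long value);
jit_insn_ref jit_new_insn(int code, jit_op_ref dst, jit_op_ref src);
long jit_op_int_value(jit_op_ref op);
jit_op_ref jit_insn_src(jit_insn_ref insn);
void jit_free(void *handle);

/* Private part: in a real library this would live in a .c file only. */
struct jit_op { long value; };
struct jit_insn { int code; jit_op_ref dst, src; };

jit_op_ref jit_new_int_op(long value) {
  jit_op_ref op = malloc(sizeof *op);
  assert(op != NULL); /* sketch only: real code would report the failure */
  op->value = value;
  return op;
}

jit_insn_ref jit_new_insn(int code, jit_op_ref dst, jit_op_ref src) {
  jit_insn_ref insn = malloc(sizeof *insn);
  assert(insn != NULL);
  insn->code = code;
  insn->dst = dst;
  insn->src = src;
  return insn;
}

long jit_op_int_value(jit_op_ref op) { return op->value; }
jit_op_ref jit_insn_src(jit_insn_ref insn) { return insn->src; }

/* Frees one handle; an instruction does not own its operands here. */
void jit_free(void *handle) { free(handle); }
```

With an API of this shape, a front-end that targets several backends could store any handle in a single `void *` field and pass it back to whichever backend created it, without including that backend's struct definitions.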

vnmakarov commented 5 years ago

I am aware of the value of hiding internal details behind the API, but it is a lesser concern for me right now.

Using a lot of pointers can slow down the code. We have this problem (pointer chasing) in GCC, where everything in RTL is lists and pointers (LLVM at least uses vectors in many parts of its IR data).

It also creates a sharing problem. A few famous bugs in GCC happened when people changed a value through one pointer without realizing that the value was shared through another pointer. Therefore I deliberately made op a structure, for easy copying and no sharing, although using pointers and sharing the pointed-to data, and/or a more compact representation of some operand types, would require less memory.

In general, I agree that more work is needed on the API. That is why I wrote that it will change as MIR matures.
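The sharing problem described in this reply can be illustrated with a small C sketch. The types `operand`, `insn_by_value`, and `insn_by_ptr` are hypothetical and much simpler than MIR's real op and insn types; they only contrast an operand embedded by value with an operand reached through a pointer.

```c
#include <assert.h>

/* Hypothetical, simplified types; not MIR's actual definitions. */
typedef struct { int reg; } operand;
typedef struct { operand ops[2]; } insn_by_value; /* operands are copied in */
typedef struct { operand *ops[2]; } insn_by_ptr;  /* operands are referenced */

/* Returns 1 if changing an operand of one instruction changes the other. */
int value_insns_alias(void) {
  operand o = {1};
  insn_by_value a, b;
  a.ops[0] = o; a.ops[1] = o; /* each assignment copies the struct */
  b.ops[0] = o; b.ops[1] = o;
  a.ops[0].reg = 42;          /* touches only a's private copy */
  return b.ops[0].reg == 42;
}

/* Same experiment, but both instructions point at one shared operand. */
int pointer_insns_alias(void) {
  operand o = {1};
  insn_by_ptr a, b;
  a.ops[0] = &o; a.ops[1] = &o;
  b.ops[0] = &o; b.ops[1] = &o;
  a.ops[0]->reg = 42;         /* silently changes what b sees too */
  return b.ops[0]->reg == 42;
}
```

The by-value version spends more memory per instruction, which is the trade-off the reply acknowledges, but an edit to one instruction cannot leak into another.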

dibyendumajumdar commented 5 years ago

LLVM's API seems to be type-safe despite using pointers. It has a nice hierarchy of types, and everything is a Value. If you use the wrong type, LLVM will call abort at runtime, at least when assertions are enabled.
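As a rough C analogy for the runtime checking described here (this is not LLVM's code, and all names are hypothetical), each object can carry a kind tag in a common base struct, and a downcast can assert on that tag. LLVM's C++ API offers `cast<>` and `dyn_cast<>` for this purpose; the two functions below only mimic the asserting and the non-asserting style.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini hierarchy: everything starts with a `value` header. */
typedef enum { VALUE_CONSTANT, VALUE_INSTRUCTION } value_kind;
typedef struct { value_kind kind; } value;
typedef struct { value base; long number; } constant;
typedef struct { value base; int opcode; } instruction;

/* Checked downcast: aborts via assert on a kind mismatch.
   The check disappears when compiled with NDEBUG. */
constant *as_constant(value *v) {
  assert(v != NULL && v->kind == VALUE_CONSTANT);
  return (constant *) v; /* valid: `base` is the first member */
}

/* Non-aborting variant: returns NULL when the kind does not match. */
constant *try_as_constant(value *v) {
  if (v == NULL || v->kind != VALUE_CONSTANT) return NULL;
  return (constant *) v;
}
```

A front-end can then hold every object as a `value *` and recover the concrete type only where it is needed, with misuse caught at runtime in assertion-enabled builds.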

dibyendumajumdar commented 5 years ago

I'll probably wait for the C to MIR translator as I can already generate C code from Ravi.