Open andorp opened 4 years ago
Reference counting may have a higher total overhead than garbage collection. I don't know if re-using allocations saves sufficient time to make it's overhead smaller than garbage collection. I think it would be really useful to benchmark CIB against GC once it is implemented.
There are some preliminary results in the Lean paper; https://arxiv.org/pdf/1908.05647.pdf
About the comparing GC with CIB. We can't do that measurement yet. As we have only a simple runtime implemented in C and LLVM which does only have allocation as memory management.
I agree that it would be nice to have something like a GC. What stops us? Mainly that our focus is currently is not on the LLVM and runtime implementation. Why? We are changing the AST to accomodate datalog as possible way of implementing different kind of HeapPointsTo analysis. The current proof of concept implementation can't handle bigger programs than 70k AST nodes.
The project is based around the idea of whole program analysis of GRIN is equivalent to Modern Pointer Analysis problems. Of course we can opt-out from the whole program analysis and run analysis on procedure or module level as other compilers do and generate LLVM code from that level of optimized code. But our personal goal is apply whole program analysis on industrial sized projects.
I think a simple GC supported GRIN runtime implemented in LLVM would be a nice master thesis for someone, who is interested in such a topic. Even another bachelor's or master's could be the implementation of the CIB runtime in LLVM for GRIN.
Please shout if you know anybody that would be interested doing such a thing. :)
I have been researching, and slowly building a FP oriented concurrent copying GC (Sundial design document) for Rust, one of the project goals is to enable it's use as the foundation of FP language runtimes, I suspect in the long run it would be a good fit. It requires both direct and transitive set of points to type info, in order to achieve pauseless concurrent collection, While Sundial will support vtable/runtime polymorphism, it will come at a higher cost. My interpretation of the docs folder is that GRIN's points to analysis provides this type info?
On a related note, is update
always done with release ordering? Is a more efferent C11 style memory model planned?
On the other hand, reference counting seems to be a prerequisite for in-place mutation optimization that Koka (FBIP) and Roc use. That could be really interesting optimization as well.
I discontinued this development for the sole reason of not having enough time and resources in my life now, and it would require more than I can allocate to it. I am more than happy to discuss my learnings with anyone who would like to pick this up.
I have been researching, and slowly building a FP oriented concurrent copying GC (Sundial design document) for Rust, one of the project goals is to enable it's use as the foundation of FP language runtimes, I suspect in the long run it would be a good fit. It requires both direct and transitive set of points to type info, in order to achieve pauseless concurrent collection, While Sundial will support vtable/runtime polymorphism, it will come at a higher cost. My interpretation of the docs folder is that GRIN's points to analysis provides this type info?
On a related note, is
update
always done with release ordering? Is a more efferent C11 style memory model planned?
Would Grin exploit your GC by compiling to rust code that uses its api?
@GunpowderGuy that was not what I was thinking, but sadly the research was never finished so nothing can use it.
Implementation of the Counting Immutable Beans (CIB) for the GRIN compiler.
Summary
CIB uses instrumentation of the original program. There are four new instructions in the syntax that are inserted via instrumentation. This can be categorized into two;
In the CIB approach every heap location has a reference counter associated with it. Inc increments the counter for the location, and also increments all the referred locations transitively. Dec decrements the counter for the location, and also decrements all the referred locations transitively.
Reset, reuse:
From the CIB paper:
If x is a shared value, than y is set to a special pointer value BOX, otherwise to the heap location associated with x. If x is not shared than reset decrements the reference counters of the components of x, and y is set to x.
If y is BOX reuse allocates a new heap for the constructor. If y is not Box the runtime reuses the heap location for storing the constructor.
Application of the same idea for GRIN:
Differences: meanwhile Lean's IR put every variable on the heap, GRIN uses variables as were registers and only a subset of the registers are associated with heap locations. A register is associated with heap location if its type is Loc. This means the GRIN implementation of the CIB approach needs a type environment which tells which variables can be affected by the CIB operations.
In GRIN:
We need to add 4 new instructions:
x <- reset y;
where y is a heap location, x can be a special heap location, which can be BOX too.z <- reuse x y;
where x is a special heap location created by reset, and y is a Node value.z <- inc x;
where x is a heap location, it transitively increments the reference counters in the locations. Cycle detection should happen. The increment operation computes unit as its return value.z <- dec x;
where x is a heap location, it transitively decrements the reference counters in the locations. Cycle detection should happen. When the counter reaches zero, the runtime must deallocate the location. The decrement operation computes unit as its return value.The GRIN nodes store primitive values, but the runtime makes the difference between location values and primitive values, thus it is able to create the transitive closure of the reachability relation of a location and manipulate its reference counters.
Every of the four instructions needs to be implemented in the GRIN runtime/interpreter.
In the original paper reuse of the constructors could happen only of the arity of the constructors are the same. But in GRIN as the runtime needs to allocate heaps based on the type of the heap location. This means every heap location can have its own arity, and reuse if the heap location is possible only if the new node does not exceeds the arity of the heap node. Otherwise a new node needs to be allocated, with the maximum arity.
The most important change is the
reuse
construction. It changes certain instances of thestore
operation to thereuse
operation.In this case we need to decide to reuse the heap location associated with w only if w can accommodate all the possible values of x. This means the max-arity(w) >= max-arity(x). Meanwhile Lean's approach uses the arity of the constructors in the alternatives, we can use the abstract information of all the possible runs.
Implementation steps: