We need to limit the amount of memory they use, and we want to throw them away if they are no longer useful
We need to make sure that they are correctly disposed of when they are no longer valid.
We shouldn't use much space to map deoptimization events to superblocks
We need to be able to find superblocks when constructing or executing other superblocks, so that we can merge control flow at the superblock level.
The current implementation uses caches that are a multiple of 16 bits. So lets limit the number of superblocks to 2**16.
I'm going to assume for now that this is sufficient even for a very large program with a relatively flat execution profile (I plan to do a proper analysis of this, but we'll assume that it's OK for now).
This means that we can refer to a superblock with just 2 bytes, which should help keep things compact.
Superblocks will hold strong references any objects they rely on: code objects, functions, other superblocks, etc.
This means that superblocks will keep other object alive.
This is OK if we delete cold superblocks.
Any superblock that depends on an otherwise dead object will not be reachable, will grow cold and be eliminated, freeing the other objects.
We can check for cold superblocks in a background thread or at a regular interval using the eval-breaker mechanism.
Scanning ~1% superblocks every few milliseconds of CPU time should be about right.
Determining if a superblock is cold will require an execution counter, which has cost, but should be low provided superblocks are large enough.
Possible data structures.
Table of superblocks
References to superblocks should be kept in an array. Maybe fixed sized, maybe variable sized with a maximum size of 64k (which would take half a MB on a 64 bit machine)
Mapping deoptimization events to superblocks
Conceptually we need a bag, mapping individual events to sets of superblocks. Implementing it a dictionary would be very wasteful, but might be OK for early iterations.
A much more compact approach would be store a bloom filter of events in each superblock. This would produce false positives for deoptimization and be very slow finding traces. With some additional data to speed the search though, this could be good enough.
Finding superblocks from source location.
A simple map (code, offset) -> superblock should work.
We can use a dictionary initially, but a much more compact map is possible:
We can store each entry in 8 bytes. 3 for the code object version, 3 for the offset and 2 for the superblock index.
Superblocks need to be managed.
The current implementation uses caches that are a multiple of 16 bits. So lets limit the number of superblocks to
2**16
. I'm going to assume for now that this is sufficient even for a very large program with a relatively flat execution profile (I plan to do a proper analysis of this, but we'll assume that it's OK for now).This means that we can refer to a superblock with just 2 bytes, which should help keep things compact.
Superblocks will hold strong references any objects they rely on: code objects, functions, other superblocks, etc. This means that superblocks will keep other object alive. This is OK if we delete cold superblocks. Any superblock that depends on an otherwise dead object will not be reachable, will grow cold and be eliminated, freeing the other objects.
We can check for cold superblocks in a background thread or at a regular interval using the eval-breaker mechanism. Scanning ~1% superblocks every few milliseconds of CPU time should be about right.
Determining if a superblock is cold will require an execution counter, which has cost, but should be low provided superblocks are large enough.
Possible data structures.
Table of superblocks
References to superblocks should be kept in an array. Maybe fixed sized, maybe variable sized with a maximum size of 64k (which would take half a MB on a 64 bit machine)
Mapping deoptimization events to superblocks
Conceptually we need a bag, mapping individual events to sets of superblocks. Implementing it a dictionary would be very wasteful, but might be OK for early iterations.
A much more compact approach would be store a bloom filter of events in each superblock. This would produce false positives for deoptimization and be very slow finding traces. With some additional data to speed the search though, this could be good enough.
Finding superblocks from source location.
A simple map
(code, offset) -> superblock
should work. We can use a dictionary initially, but a much more compact map is possible: We can store each entry in 8 bytes. 3 for the code object version, 3 for the offset and 2 for the superblock index.