Ravenbrook / mps

The Memory Pool System
http://www.ravenbrook.com/project/mps

mps_pool_walk fails on pools containing other than exact references #195

Open rptb1 opened 1 year ago

rptb1 commented 1 year ago

m. Why RankEXACT? Probably has no effect, but if so, we should say so. rule.generic.clear

_Originally posted by @rptb1 in https://github.com/Ravenbrook/mps/pull/34#discussion_r1123320599_

poolWalk creates a kind of fake trace so that it can reuse SegScan to scan segments and visit their formatted objects. SegScan requires a scan state, and scans at the rank determined by that scan state. poolWalk always creates a scan state with RankEXACT, regardless of the ranks actually present on the segments in the pool. This instructs the pool to scan only exact references, which could cause it to skip objects, making the walk incomplete. See https://github.com/Ravenbrook/mps/blob/8635e900f1ca1cbd59b958c10c6b609d9b404457/code/walk.c#L449

None of the current pool implementations will skip objects, but relying on that breaks rule.code.assume, and the walk will break if we implement any mixed-rank segments.
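
To make the failure mode concrete, here is a minimal toy model in C. It is not MPS code: every `Toy`/`toy_` name is hypothetical, and it assumes a pool that really does filter objects by the requested rank, which none of the current pools do. Under that assumption, a walk fixed at the exact rank misses an object on a mixed-rank segment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not MPS code.  The four ranks mirror the
 * MPS rank names, but the types and functions are illustrative only. */
typedef enum {
  TOY_RANK_AMBIG, TOY_RANK_EXACT, TOY_RANK_FINAL, TOY_RANK_WEAK
} ToyRank;

typedef struct { ToyRank rank; int visited; } ToyObject;
typedef struct { ToyObject *objects; size_t count; } ToySeg;

/* Models a pool that only looks at objects whose references have the
 * requested rank (assumption: a future mixed-rank pool behaves so). */
static size_t toy_seg_scan(ToySeg *seg, ToyRank rank)
{
  size_t i, n = 0;
  assert(seg != NULL);
  for (i = 0; i < seg->count; ++i) {
    if (seg->objects[i].rank == rank) {
      seg->objects[i].visited = 1;
      ++n;
    }
  }
  return n;
}

/* Models the walk described above: always scans at the exact rank. */
static size_t toy_walk_exact_only(ToySeg *seg)
{
  return toy_seg_scan(seg, TOY_RANK_EXACT);
}

/* Models one possible alternative: scan once per rank. */
static size_t toy_walk_all_ranks(ToySeg *seg)
{
  size_t n = 0;
  int r;
  for (r = TOY_RANK_AMBIG; r <= TOY_RANK_WEAK; ++r)
    n += toy_seg_scan(seg, (ToyRank)r);
  return n;
}
```

On a segment holding two exact objects and one weak object, `toy_walk_exact_only` visits two objects and `toy_walk_all_ranks` visits all three. Scanning once per rank is only one way to close the gap in the toy; it is not a proposal for the real fix.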

rptb1 commented 1 year ago

An important part of fixing this is better documentation of what rank means to SegScan.

(The implementation of poolWalk is already inconsistent with design.mps.seg.method.scan which states that it "scans all the grey objects". poolWalk does not greyen the objects and assumes the pools will scan them anyway.)

rptb1 commented 1 year ago

The job of the abstract scan function scan(object, traces, rank, fix) is to apply the abstract fix function to all references of a certain rank that are grey for the traces. SegScan is an implementation of that function for segment objects. Abstractly, a scanner reveals nothing about the internal representation of the object.
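
As a sketch of that abstract contract only, here it is in C. The types and names are hypothetical, and greyness is modelled per reference purely for illustration, which is not how SegScan represents it. The caller sees only which references get fixed, nothing about how the object is laid out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types for illustration; none of these are MPS names. */
typedef unsigned ToyTraceSet;   /* one bit per trace */
typedef enum {
  TOY_RANK_AMBIG, TOY_RANK_EXACT, TOY_RANK_FINAL, TOY_RANK_WEAK
} ToyRank;

typedef struct {
  void *target;       /* what the reference points at */
  ToyRank rank;       /* rank of this reference */
  ToyTraceSet grey;   /* traces for which this reference is grey */
} ToyRef;

typedef struct { ToyRef *refs; size_t count; } ToyObject;

typedef void (*ToyFix)(void **refIO, void *closure);

/* scan(object, traces, rank, fix): apply fix to every reference of the
 * given rank that is grey for any of the given traces.  Returns the
 * number of references fixed. */
static size_t toy_scan(ToyObject *object, ToyTraceSet traces, ToyRank rank,
                       ToyFix fix, void *closure)
{
  size_t i, fixed = 0;
  assert(object != NULL);
  assert(fix != NULL);
  for (i = 0; i < object->count; ++i) {
    ToyRef *ref = &object->refs[i];
    if (ref->rank == rank && (ref->grey & traces) != 0) {
      fix(&ref->target, closure);
      ++fixed;
    }
  }
  return fixed;
}

/* A trivial fix function that just counts how often it is called. */
static void toy_count_fix(void **refIO, void *closure)
{
  (void)refIO;
  ++*(size_t *)closure;
}
```

Note that nothing in this contract promises that every object is visited, which is why a walk built on top of it has to assume something extra.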

It so happens that for pools that support formatted objects, SegScan uses FormatScan to implement the abstract scan function for its formatted objects. To do so, the implementation visits the objects that contain grey references for the rank and calls FormatScan on each one. But even assuming that breaks rule.code.assume.

Clearly, we don't want to duplicate the code that visits the formatted objects in a segment.

So it seems to me that, for each formatted pool, we want to design a shared visitor function that can be called by that pool's implementation of SegScan and also by poolWalk. And also SegWalk.
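
A rough sketch of the shape I have in mind, as a toy model rather than a design. All names are hypothetical, and the per-object "scan" is a stand-in for the FormatScan call. The object iteration lives in one visitor function, and the scan and the walk are two clients of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; these names are illustrative, not MPS interfaces. */
typedef enum {
  TOY_RANK_AMBIG, TOY_RANK_EXACT, TOY_RANK_FINAL, TOY_RANK_WEAK
} ToyRank;

typedef struct { ToyRank rank; int scanned; } ToyObject;
typedef struct { ToyObject *objects; size_t count; } ToySeg;
typedef void (*ToyVisitor)(ToyObject *object, void *closure);

/* The one place that knows how objects are laid out in a segment. */
static void toy_seg_visit(ToySeg *seg, ToyVisitor visit, void *closure)
{
  size_t i;
  assert(seg != NULL);
  assert(visit != NULL);
  for (i = 0; i < seg->count; ++i)
    visit(&seg->objects[i], closure);
}

/* Scan client: filters by rank, then does the per-object scan. */
typedef struct { ToyRank rank; size_t scanned; } ToyScanClosure;

static void toy_scan_visitor(ToyObject *object, void *closure)
{
  ToyScanClosure *sc = (ToyScanClosure *)closure;
  if (object->rank == sc->rank) {
    object->scanned = 1;   /* stands in for the format scan call */
    ++sc->scanned;
  }
}

static size_t toy_seg_scan(ToySeg *seg, ToyRank rank)
{
  ToyScanClosure sc;
  sc.rank = rank;
  sc.scanned = 0;
  toy_seg_visit(seg, toy_scan_visitor, &sc);
  return sc.scanned;
}

/* Walk client: has no rank at all, so it cannot skip objects by rank. */
static void toy_walk_visitor(ToyObject *object, void *closure)
{
  (void)object;
  ++*(size_t *)closure;
}

static size_t toy_pool_walk(ToySeg *seg)
{
  size_t walked = 0;
  toy_seg_visit(seg, toy_walk_visitor, &walked);
  return walked;
}
```

In this toy the visitor costs an indirect call per object, so a real version would need measuring on the scanning path before anyone commits to it.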

We have to be careful about performance because scanning is step 1 of the critical path.