Open wks opened 2 years ago
The following types use inheritance.
- Plan: concrete plans extend
CommonPlan
extendsBasePlan
- Space: concrete spaces extend
CommonSpace
- ProcessEdgesWork: Any concrete ProcessEdgesWork always contains a
ProcessEdgesBase
- PageResources: Any concrete page resource contains a
ComonPageResource
These are using the same pattern (ProcessEdgesWork
is slightly different):
Common
component, and let each struct include the componentCommon
type, and will access it in default implementations.The difference for ProcessEdgesWork
is that is assume it can be dereferenced into ProcessEdgesBase
, so it can use self.xx
to access fields in the base. Other types use self.common()
to access the common component. These two patterns are mostly interchangeable.
We do not intend to use inheritance. Those types are just composed by the Common
component. If we refactor CommonPlan
to CommonSpaces
, refactor BasePlan
into BaseSpaces
along with other global states, is that more obvious that it is the composition pattern?
Methods for MMTK accidentally added to
BasePlan
BasePlan
is special. It comes from thePlan
abstract class in the JikesRVM version of MMTk. Because Java MMTk didn't have the global instance typeMMTK
, globally shared data fields are fields of the Plan class.Most fields in
BasePlan
are not really plan-specific. For example,
BasePlan::initialized
indicates if MMTk is initialised. No matter what plan we use, MMTk always needs to be initialised.BasePlan::options
is the options of the MMTK instance set by the command-line or environment variables. They are not plan-specific.BasePlan::gc_requester
is the synchroniser where mutator threads can notify GC threads to start GC. All plans need this.The only few fields that should conceptually belong to
BasePlan
are the three spaces, namelycode_space
,code_lo_space
andro_space
because a plan contains many spaces.Those shared fields should have been fields of
struct MMTK
.
Mostly correct. But plan should include not only plan-specific code, but also general code that is applicable to all the plans. Like the gc_requester
in your example, if all the plans use it, it should be a part of the plan. options
is shared by MMTK
and Plan
. Plan needs to access it so it is reasonable that the plan holds a reference to it.
Deref
According to Rust doc: https://doc.rust-lang.org/std/ops/trait.Deref.html
... On the other hand, the rules regarding
Deref
andDerefMut
were designed specifically to accommodate smart pointers. Because of this,Deref
should only be implemented for smart pointers to avoid confusion.In mmtk-core, the following types implement
Deref
and/orDerefMut
.Group 1:
InitializeOnce<T>
(toT
)Group 2:
UnsafeOptionsWrapper
(toOptions
)MMTKOption<T>
(toT
)Group 3:
GenNurseryProcessEdges
(toProcessEdgesBase
)SanityGCProcessEdges
(toProcessEdgesBase
)SFTProcessEdges
(toProcessEdgesBase
)FreeListPageResource
(toCommonFreeListPageResource
)- Type defined by
define_vm_metadata_spec
(toMetadataSpec
)Group 1, i.e.
InitializeOnce<T>
, is a proper use ofDeref
, becaueIntializeOnce<T>
is a kind of smart pointer. It is responsible for resource management.In Group 2,
UnsafeOptionsWrapper
andMMTKOption
work around static initialisation. See #541Group 3 contains types that attempt to implement inheritance of object-oriented programming in Rust. Those belong to the category discussed here.
Right. For ProcessEdgesBase
, as I mentioned above, it can be replaced with the 'common' pattern. For MetadataSpec
, if it is improper to use Deref
, we can just remove it. It already provides as_spec()
.
These are using the same pattern (
ProcessEdgesWork
is slightly different):
we put common fields/code to the
Common
component, and let each struct include the componentthe trait also assumes there is a
Common
type, and will access it in default implementations.
Yes. If you put it this way, it looks like it is still a valid use of "Common" types.
Maybe the delegate crate can make this pattern easier to implement.
We do not intend to use inheritance. Those types are just composed by the
Common
component. If we refactorCommonPlan
toCommonSpaces
, refactorBasePlan
intoBaseSpaces
along with other global states, is that more obvious that it is the composition pattern?
Yes and no. Yes because a concrete plan does customises the spaces it has. If we refactor Immix
into, for example, ImmixSpaces
which has a CommonSpaces
which has a BaseSpaces
, we make it clear that ImmixSpaces
only customises the spatial composition of the Immix GC algorithm. Many methods, besides trace_object
, can be automatically "derived" from the list of spaces using macro, for example:
fn prepare(&self) {
self.space1.prepare();
self.space2.prepare();
...
self.parent_spaces.prepare();
}
fn release(&self) {
self.space1.release();
self.space2.release();
...
self.parent_spaces.release();
}
fn used_pages(&self) {
let mut sum = 0;
sum += self.space1.used_pages();
sum += self.space2.used_pages();
...
sum += self.parent_spaces.used_pages();
sum
}
Of course we can still do it even without isolating XxxSpaces
.
The "no" part I mentioned above is because a GC algorithm also gives temporal advices to the process of GC. For example, MarkCompact adds a forwarding phase, while Immix behaves differently when defragmenting. So a concrete GC algorithm is not only a collection of Spaces, but also other customisations.
So maybe we can keep the "concrete plan has a CommonPlan has a BasePlan" structure.
Mostly correct. But plan should include not only plan-specific code, but also general code that is applicable to all the plans. Like the
gc_requester
in your example, if all the plans use it, it should be a part of the plan.options
is shared byMMTK
andPlan
. Plan needs to access it so it is reasonable that the plan holds a reference to it.
I don't agree with the statement "if all the plans use it, it should be a part of the plan". In OOP, probably yes, because an instance hold references to objects it uses. But in Rust, having a field in a struct means the struct owns that thing. Personally, I think a Plan
instance should only own things that are specific to a GC algorithm.
If all plans use X in method M, we can pass a reference of X to method M as a parameter. Remember that GCWorker
has a field mmtk: &'static MMTK<VM>
, and GCWork::do_work
has mmtk: &'static MMTK<VM>
as its last parameter. In the worst case, we can pass the &MMTK
reference to method M so it has access to everything in the MMTk instance.
The difference for
ProcessEdgesWork
is that is assume it can be dereferenced intoProcessEdgesBase
, so it can useself.xx
to access fields in the base. Other types useself.common()
to access the common component. These two patterns are mostly interchangeable. ... Right. ForProcessEdgesBase
, as I mentioned above, it can be replaced with the 'common' pattern. ForMetadataSpec
, if it is improper to useDeref
, we can just remove it. It already providesas_spec()
.
Yes.
@qinsoon After a second thought, I think the relationship between "GenImmix -> Gen -> CommonPlan -> BasePlan" is more like a "chain of responsibility". They are all designed to fullfill the role of a Plan, but
collection_required
to support generational GC for all generational plans, while non-generational plans delegate the decision to CommonPlan, which in turn delegate to BasePlan.Seeing from the original JikesRVM code, we can find that some fields/methods of Base plan are static, while other are not. For example
isEmergencyCollection
, isInitialized
, stacksPrepared
, gcInProgress
, handleUserCollectionRequest
...isCurrentGCNursery
, getPagesUsed
, willNeverMove
, ...Static methods cannot be overridden, and are not part of the chain of responsibility. They probably belong to a different global structure.
Personally, I think a Plan instance should only own things that are specific to a GC algorithm.
I agree with this and often have found it vexing that "global" state has to be added to BasePlan
instead of the MMTK
struct, which would be the natural place to put global state in my opinion. We don't have a global handle on the MMTK
object in mmtk-core, which makes it impossible to use the MMTK
struct for global state.
If we refactor
Immix
into, for example,ImmixSpaces
which has aCommonSpaces
which has aBaseSpaces
Immix
would have ImmixSpace
and CommonSpaces
. This would be more obvious that it is composition. Currently we have CommonPlan
rather than CommonSpaces
.
a GC algorithm also gives temporal advices to the process of GC. For example, MarkCompact adds a forwarding phase, while Immix behaves differently when defragmenting. So a concrete GC algorithm is not only a collection of Spaces, but also other customisations.
Plan is not a 'GC algorithm'. A policy is a GC algorithm. A plan just puts spaces of different policies together, and make them in a certain way. The MarkCompact
needs a forwarding phase, as MarkCompactSpace
requires it.
@k-sareen Actually, we always do. Both GCWorker
and GCController
has a &MMTK
field.
Then it comes to mutators. Mutator itself doesn't have a reference to MMTK
, but each of its allocators has a reference to &dyn Plan
. It is used in allocation slow paths to trigger GC.
The fun part is, when I look up all call sites of Allocator::get_plan
(returns &'static dyn Plan<VM = VM>
), I found that it either immediately calls base()
, like this:
fn alloc_slow_inline(&mut self, size: usize, align: usize, offset: isize) -> Address {
let tls = self.get_tls();
let plan = self.get_plan().base();
or call a method in trait Plan
that only accesses fields in BasePlan
, like this:
// Only update the allocation bytes if we haven't failed a previous allocation in this loop
if stress_test && self.get_plan().is_initialized() && !previous_result_zero {
let allocated_size =
and this:
emergency_collection = self.get_plan().is_emergency_collection();
It clear indicates that all that Allocator
is interested in are some global GC states, not anything related to the GC algorithm. So Allocator
should hold a reference to the shared global GC state, not &dyn Plan
.
Right, I was talking about using MMTK
in the mutator part of the code (such as in Allocator
). It's not possible since no API will provide the MMTK
object to the mutator.
TL;DR: Several types in MMTk core use inheritance in OOP. This is not idiomatic in Rust. In the long term, we should switch to a more idiomatic style, such as composition instead of inheritance.
In practice, we should tread lightly, and do refactoring in small steps. Not all such cases need to be eliminated, because there may not always be a better solution.
Examples:
The following types use inheritance.
CommonPlan
extendsBasePlan
CommonSpace
ProcessEdgesBase
ComonPageResource
Analysis
Inheritance vs composition
An instance of a concrete type contains an instance of an abstract type because the abstract type holds common information shared by multiple concrete types. For example, all spaces should have name.
Conversely, inheritance also means each subclass customises the abstract class in a certain way. For example, all
ProcessEdgesWork
have an incoming queue edges to be processed, and an outgoing queue of objects to be scanned, as shown inProcessEdgesBase
, but each concreteProcessEdgesWork
customises the way how to process each edge.If we view the "inheritance" relation as a form of "customisation", we may think that there is only one kind of ProcessEdgesWork, but each instance allow some behaviour to be customised. And there are many different ways of customisation besides OOP-style inheritence. One possibility is composition. If we consider
trace_object
as a responsibility ofPlan
, then all ProcessEdgesWork are the same. ProcessEdgesWork only calls into Plan to do the actual tracing.Both https://github.com/mmtk/mmtk-core/pull/575 and https://github.com/mmtk/mmtk-core/pull/570 attempt to eliminate plan-specific ProcessEdgesWork implementations. They are examples of replacing inheritance with composition.
Methods for MMTK accidentally added to
BasePlan
BasePlan
is special. It comes from thePlan
abstract class in the JikesRVM version of MMTk. Because Java MMTk didn't have the global instance typeMMTK
, globally shared data fields are fields of the Plan class.Most fields in
BasePlan
are not really plan-specific. For example,BasePlan::initialized
indicates if MMTk is initialised. No matter what plan we use, MMTk always needs to be initialised.BasePlan::options
is the options of the MMTK instance set by the command-line or environment variables. They are not plan-specific.BasePlan::gc_requester
is the synchroniser where mutator threads can notify GC threads to start GC. All plans need this.The only few fields that should conceptually belong to
BasePlan
are the three spaces, namelycode_space
,code_lo_space
andro_space
because a plan contains many spaces.Those shared fields should have been fields of
struct MMTK
.Deref
According to Rust doc: https://doc.rust-lang.org/std/ops/trait.Deref.html
In mmtk-core, the following types implement
Deref
and/orDerefMut
.Group 1:
InitializeOnce<T>
(toT
)Group 2:
UnsafeOptionsWrapper
(toOptions
)MMTKOption<T>
(toT
)Group 3:
GenNurseryProcessEdges
(toProcessEdgesBase
)SanityGCProcessEdges
(toProcessEdgesBase
)SFTProcessEdges
(toProcessEdgesBase
)FreeListPageResource
(toCommonFreeListPageResource
)define_vm_metadata_spec
(toMetadataSpec
)Group 1, i.e.
InitializeOnce<T>
, is a proper use ofDeref
, becaueIntializeOnce<T>
is a kind of smart pointer. It is responsible for resource management.In Group 2,
UnsafeOptionsWrapper
andMMTKOption
work around static initialisation. See https://github.com/mmtk/mmtk-core/issues/541Group 3 contains types that attempt to implement inheritance of object-oriented programming in Rust. Those belong to the category discussed here.