Closed qinsoon closed 7 months ago
/// Return upper bound of the nursery size (in number of bytes)
pub fn get_max_nursery_bytes(&self) -> usize {
self.nursery.max.unwrap_or_else(|| {
if !vm_layout().force_use_contiguous_spaces {
DEFAULT_MAX_NURSERY_32
} else {
DEFAULT_MAX_NURSERY
}
})
}
This check is removed. I am not sure about the consequences. Maybe I should change the default value for max nursery, depending on the VM layout.
I quickly tried h2o
, for which generational Immix plans were reported slower than full heap Immix in https://github.com/mmtk/mmtk-core/issues/594#issuecomment-1913851147. By changing the min nursery size, generational immix can perform better than Immix. I will do more thorough performance evaluation before making this PR ready for review.
The performance results for now (using 2.5x minheap from temurin-21-G1-n-5
):
Normalized to Immix/master plotty
Normalized to GenImmix/master plotty
Normalized to Immix/master plotty
Normalized to StickyImmix/master plotty
jython
and tomcat
shows slowdown for GenImmixh2
, jython
and lusearch
show slowdown for StickyImmixtomcat
. h2
.Hmm. {Gen,Sticky}Immix
are still really bad in comparison to Immix
for lusearch. That doesn't make any sense. Looking back at my old results, it seems like yeah, that was the case back then as well.
I fixed a few mistakes based on the previous evaluation. The following is the new results. I ran with 2.5x of the G1 heap size (temurin-21-G1-n-5
) with 5 iterations.
The results show that the PR is an improvement for both GenImmix and StickyImmix in most benchmarks, 8% improvement for GenImmix and 3% improvement for StickyImmix (using 0.25x heap size as the min nursery). However, both plans are still performing worse than Immix (2% and 1.5% slower, respectively). There are still other performance issues with our generational plans that should be looked into.
The results also show slowdown in tomcat
. The G1 min heap for tomcat
is 19M, and the min heap for MMTk is 42M (GenImmix) and 59M (Immix) in my measurement. So with 2.5x of the G1 min heap, we are almost running at the min heap for Immix plans. With a proportion of the heap size as the min nursery, we always trigger full heap GCs for tomcat
, while with a fixed 2M min nursery (as in master
), we may occasionally do nursery GCs. However, this is a separate issue about MMTk's min heap with OpenJDK.
LGTM. But we should make an issue for resolving the virtual memory issue. Also did you get the performance results in the end?
I am running the benchmarks. I will post the results once I got it.
The improvement for GenImmix is 9.5% and the improvement for Stickyimmix is 3.1% (compared to the 8% and 3% improvement in the previous evaluation). However, one issue is that h2
panics in the run, as it sets the heap size as 1700M, and we cannot guarantee another 1700M virtual memory for the 100% max nursery.
If the nursery size is set explicitly, I think panicking may be a good idea when we cannot satisfy the virtual memory for the nursery. But when we use the default nursery size, we would expect it run, rather than panicking.
Performance-wise, I think this PR achieved what we want.
... But when we use the default nursery size, we would expect it run, rather than panicking.
Yes. If the user doesn't specify a nursery size, we shouldn't blame the user for providing a bad value. One way to solve this is to provide another option "force_nursery_size" which defaults to false. When true, it will panic if the give nursery size cannot be satisfied; when false, MMTk core will adjust the size to the nearest possible value.
... But when we use the default nursery size, we would expect it run, rather than panicking.
Yes. If the user doesn't specify a nursery size, we shouldn't blame the user for providing a bad value. One way to solve this is to provide another option "force_nursery_size" which defaults to false. When true, it will panic if the give nursery size cannot be satisfied; when false, MMTk core will adjust the size to the nearest possible value.
I am thinking about adding NurserySize::Adaptive
and use that as the default value. It tries to use [25%, 100%) heap size as nursery, and when it is not possible, it will be adaptive and use whatever that works (e.g. use a smaller extent for nursery virtual memory, or use discontiguous nursery).
The performance looks good (better than master for most things) but still not much better than Immix which is odd. But this PR does not need to solve all the issues at once.
I agree with @wks and your Adaptive
idea.
We checked with Steve. The reason that nursery was using a contiguous space is that we can do range check for mature objects in barriers. However, this is not important for now. I changed the code to use discontiguous nursery. I am running another round of evaluation.
The results with a discontiguous nursery.
The results are pretty similar to the previous run (with contiguous nursery), but slower. The improvement is 8.7% for GenImmix and 2.4% for StickyImmix, compared with 9.5% and 3.1% in the previous run. There is a consistent 0.8%/0.7% difference.
The PR currently uses discontiguous nursery. But I can change it to contiguous nursery, and only use discontiguous nursery if we may run out of virtual address. Just let me know if that's desired.
@wks @k-sareen Can you review this PR again if you think it is necessary? I would like to get issues resolved for this PR, and get it merged.
This PR introduces different kinds of nursery size options, and by default, we use a proportion of the heap size as the min nursery. This PR should generally improve the generational plans' performance by triggering a full heap GC more promptly. This PR mitigates the issue identified in https://github.com/mmtk/mmtk-core/issues/594, but does not fully fix the problem.