Handle incremental GC with maxOldSpaceSize set and full

akgrant43 commented 2 months ago

Describe the request

Currently the VM prints an error and exits when all of the following are true:

VirtualMachine>>maxOldSpaceSize: is set
Old space is full
An incremental GC requires migrating objects from eden to old space.

For normal object allocation (Behavior>>basicNew:), if there isn't enough space after attempting a GC and growing old space, an OutOfMemory error is signalled, giving the image a chance to tidy up and retry.
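As a rough illustration of that recoverable path, here is a toy Python model. It is not VM code: `ToyHeap`, its methods, and all sizes are invented for the sketch, and only the order of steps (GC, grow up to the limit, then signal an error the caller can handle) is taken from the description above.

```python
class OutOfMemory(Exception):
    """Stand-in for the OutOfMemory error signalled to the image."""


class ToyHeap:
    """Toy model of old space only; names and sizes are hypothetical."""

    def __init__(self, old_size, max_old_size):
        self.old_size = old_size          # currently committed old space
        self.max_old_size = max_old_size  # models maxOldSpaceSize
        self.live = 0                     # bytes held by reachable objects
        self.garbage = 0                  # bytes held by unreachable objects

    def free(self):
        return self.old_size - self.live - self.garbage

    def gc(self):
        # Reclaim everything that is unreachable.
        self.garbage = 0

    def grow(self, needed):
        # Grow by at least `needed`, but never past the configured limit.
        self.old_size = min(self.old_size + needed, self.max_old_size)

    def release(self, nbytes):
        # The application drops references: live bytes become garbage.
        nbytes = min(nbytes, self.live)
        self.live -= nbytes
        self.garbage += nbytes

    def allocate(self, nbytes):
        if self.free() < nbytes:
            self.gc()
        if self.free() < nbytes:
            self.grow(nbytes - self.free())
        if self.free() < nbytes:
            # Recoverable: the caller gets a chance to tidy up and retry.
            raise OutOfMemory(nbytes)
        self.live += nbytes


heap = ToyHeap(old_size=64, max_old_size=128)
heap.allocate(100)          # succeeds after growing old space
try:
    heap.allocate(50)       # cannot fit, even at the limit
except OutOfMemory:
    heap.release(60)        # the image tidies up ...
    heap.allocate(50)       # ... and the retry succeeds after a GC
```

The point of the model is that the failure surfaces as an exception the application can catch, rather than terminating the process.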

As can be seen in SpurGenerationScavenger>>copyToOldSpace:bytes:format:, if the memory manager isn't able to grow old space and allocate the required space, an error is printed and the VM exits. Unfortunately, this gives the image (application) no chance to do a GC or take other action.

The proposed enhancement is to allow the scavenger to allocate additional old space beyond maxOldSpaceSize. A subsequent call to Behavior>>basicNew[:] would then fail, triggering the same flow as when hitting maxOldSpaceSize.

The amount of extra old space the scavenger may allocate could either be fixed, e.g. one or two times eden size, or enabled and configured (sized) with a separate VM parameter.
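The difference between the current and proposed behaviour can be sketched with a small Python model. Everything here is hypothetical: `ToyOldSpace`, `scavenger_headroom`, `VMAbort`, and the sizes are made up for illustration and do not correspond to actual VM names or parameters.

```python
class OutOfMemory(Exception):
    """Stand-in for the recoverable error signalled to the image."""


class VMAbort(Exception):
    """Stand-in for the VM printing an error and exiting."""


class ToyOldSpace:
    def __init__(self, max_old_size, scavenger_headroom=0):
        self.max_old_size = max_old_size              # models maxOldSpaceSize
        self.scavenger_headroom = scavenger_headroom  # hypothetical new setting
        self.used = 0

    def tenure(self, nbytes):
        """Scavenger copies survivors from eden into old space."""
        limit = self.max_old_size + self.scavenger_headroom
        if self.used + nbytes > limit:
            raise VMAbort("cannot tenure %d bytes" % nbytes)
        self.used += nbytes

    def allocate(self, nbytes):
        """Ordinary allocation; never allowed past max_old_size."""
        if self.used + nbytes > self.max_old_size:
            raise OutOfMemory(nbytes)
        self.used += nbytes


def run(headroom):
    space = ToyOldSpace(max_old_size=100, scavenger_headroom=headroom)
    space.allocate(100)      # old space is now exactly full
    try:
        space.tenure(16)     # incremental GC needs to migrate survivors
        space.allocate(1)    # next ordinary allocation
    except VMAbort:
        return "abort"
    except OutOfMemory:
        return "recoverable"
    return "ok"


current = run(headroom=0)    # scavenger has no room: the VM exits
proposed = run(headroom=32)  # scavenger succeeds; next allocation fails
```

With no headroom the scavenger hits the hard limit and the model aborts. With headroom the migration succeeds and the following ordinary allocation fails in the recoverable way, which is the flow the request asks for.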

Expected behavior

The VM successfully migrates objects from eden to old space and gives the image a chance to tidy up (free up objects if necessary and do a GC, or notify the user and exit gracefully).

Expected development cost

A first guess is a few days' work (for someone already familiar with the relevant VM code):

(possibly) add a VM parameter for how much maxOldSpaceSize may be exceeded by the scavenger.
Extend growOldSpaceByAtLeast: with a flag indicating that the scavenger is requesting the memory and that maxOldSpaceSize may be exceeded.
(possibly) add a flag to the memory manager so that the next object allocation request (Behavior>>basicNew[:]) fails.
Automated tests.

Version information:

OS: All platforms (demonstrated on Linux and Windows)
Version: Windows server 2010, NixOS 24.05.
Pharo Version: 11.

fedemennite commented 2 months ago

Hi, thanks for figuring this bug out. This is a major issue for us as it is preventing us from upgrading to Pharo 11. So in our view it's more a bug than an enhancement request. Thanks, Fede

fedemennite commented 1 month ago

This has likely been fixed after an ESUG session between Pablo and Alistair. Fix is integrated according to Pablo. @akgrant43 can you kindly confirm?

akgrant43 commented 1 month ago

The fix we made at ESUG only improved the messages printed before the VM aborts. The fix proposed above is yet to be done.

pharo-project / pharo

Handle incremental GC with maxOldSpaceSize set and full #16900