Closed aalekseyev closed 2 years ago
Thank you for the clear description and the fix, @aalekseyev!
I'm going to merge this now, since it's obviously correct, but I'd be interested to hear more from @tiash about how bad the slowness was, if he'd like to add more detail.
This PR makes `make_function_pointer` faster by avoiding a full GC cycle on every invocation.

The documentation of `caml_alloc_custom` (section 9.2 of the manual) says:

So for `used=1`, `max=1` we make a full GC cycle on every call (or within a small constant factor of that). This is clearly excessive and can be very slow.
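To make the cost concrete, here is a toy model in C of how the `used`/`max` ratio translates into forced GC cycles. It is only a sketch: it assumes, as described above, that each allocation contributes `used/max` of the budget that triggers one full cycle. The function `simulated_full_cycles` is a hypothetical name for illustration and is not part of the OCaml runtime or of ctypes.

```c
#include <assert.h>

/* Toy model, NOT the OCaml runtime's code.
   Assumption: every custom-block allocation adds used/max to a budget,
   and one full major GC cycle is forced each time the budget reaches 1.
   Integer arithmetic is used to avoid floating-point rounding. */
static unsigned long simulated_full_cycles(unsigned long used,
                                           unsigned long max,
                                           unsigned long n_allocs)
{
    unsigned long budget = 0;  /* accumulated "used" units */
    unsigned long cycles = 0;  /* number of forced full cycles */

    assert(max > 0);
    if (used > max) {
        used = max;            /* one allocation forces at most one cycle */
    }

    for (unsigned long i = 0; i < n_allocs; i++) {
        budget += used;
        if (budget >= max) {   /* budget exhausted: count a full cycle */
            cycles++;
            budget -= max;
        }
    }
    return cycles;
}
```

In this model, `simulated_full_cycles(1, 1, 1000)` returns 1000 (one full cycle per allocation), `simulated_full_cycles(1, 100, 1000)` returns 10, and `simulated_full_cycles(0, 1, 1000)` returns 0, since with `used=0` the custom blocks add no extra GC pressure beyond their own heap footprint.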
I think `caml_alloc_custom_mem` can be used to specify the object size more precisely, but I don't know enough about the code to make that change. Additionally, I'm expecting 0-size to be at most a constant-factor error, anyway, because the custom block itself still counts toward the heap usage, so allocating these in a loop will eventually run a GC.

We at Jane Street have used the patched version of the code for a long time and it works well, so the code is tested by production use.
cc @tiash in case you'd like to add any background on what motivated the change (e.g. how bad the slowness was).