GC Pinning - Githubissues

Aidan63 commented 10 months ago

Makes the MemType enum and GetMemType function accessible from the hx namespace as mentioned in #1082.

hughsando commented 10 months ago

I think there are some complications with using the GC memtype routines outside of a collection (ok in __Mark). For example, the large list might change - it is locked inside a collect call, but not in general. On the plus side, if you know it is a valid pointer inside a haxe object, then you can rely on the header bits, and it should not be memUnManaged except in case of Array.mAlloc=-1 .

With array data, it is possible that the the pointer is a foreign pointer (Set with NativeArray.setUnmanagedData). In this case, the arrays mAlloc will be -1, and no memory management on the pointer is needed - this should be checked before examining the header.

You can check for 'const-ness' with bool IsConstAlloc(const void *inData). In this case, the data does not need any management at all (eg, string data) and will not be deleted.

largeness can be determined from a valid header from the size ("size in allocated blocks"), which will be zero in this case

 int size = flags & 0xffff;
 // Size will be 0 for large allocs -> no need to mark block

You may need to add some code to prevent large allocations from getting destroyed in a realloc process, see InternalRealloc and InternalReleaseMem. The theory is that only 1 array will hold the byte data, so it can be cleaned when the array is resized. This may need some thought, but one possibility could be set the least-significant for the large array size to 1 in the case where we do not want immediate reallocation.

If you are storing GC block pointers in a haxe object, some additional work must be done the in the generational case, HX_OBJ_WB should be called. This is for the case where the object holding the pointer is "old" and the pointer is generationally new. This call means your __Mark will get called when doing a minor generational collection.

Aidan63 commented 10 months ago

Thanks for the tips, I see what you mean about the large list locking.

Perhaps it makes sense to come at this from a different angle then, if arrays had a pin method which returned some sort of handle it could keep track if a pin was requested and not attempt that realloc optimisation if it was. That pin method could do all the checking to see if mBase is unmanaged, lives on the LOH, or in a block and act accordingly. The handle could keep a rooted marking object which calls hx::MarkAlloc in its __Mark to keep that object alive until the handle is released.

Aidan63 commented 10 months ago

OK, I've completely re-jigged this merge. MemType changes have been reverted and it adds pinning spport. Pinning and unpinning checks the header to see if it lives on the LOH and BlockDataInfo now keeps track of the number of allocs in its lines and sets mPinned accordingly. Array now has a pin function which will return a handle keeping the array storage at time of pinning alive and un-moving until the handle is deleted. Hopefully this is less dodgy.

hughsando commented 10 months ago

Probably better to use the atomic inc/dec ops for mPinned manipulation if pinning might happen from any haxe thread. It would be better if possible to not increase the size of the Array class since this will be allocated a lot, and pinning will happen much less often. I was thinking a std::unordered map of pinned pointers might do the trick. These could be marked like the conservative marks - pinning the blocks at mark time. If you reuse the mLargeListLock for the pinned set you could solve the realloc problem by checking the pinned set inside FreeLarge for almost no cost. Marking the list like conservative pointers also solve the generational gc problem. You still might need your pinned structure to allow unpinning of exactly the same pointer that was pinned (since the array might realloc in the meantime). You could possibly just use a cpp.Star pointer in a normal haxe structure to do this though.

Aidan63 commented 10 months ago

I thought of the atomic operations just as I was going to bed, I've changed things around now though so that mPinnedAllocs isn't needed any more. There is now an sgPinSet of void* pointers which is protected by the mLargeListLock of the global allocator. Just before the roots are traversed this set is iterated over and any non object not on the LOH has its block pinned. FreeLarge now checks the pin set after aquiring mLargeListLock and exits early if the pointer to be released is in the set. One thing I have noticed is the shrinking case of arrays, it doesn't re-allocate but just updates the length value. However, it also memsets the now extra bytes to 0 which is a problem when wanting to pass that data around.

Another thing I'm unsure of, when marking the pins do I need to do the whole process of checking the pointer type and then calling either mark object, mark string, etc, as is done in the conservative marking function, or is it just to just call hx::MarkAlloc as I'm doing now.

hughsando commented 10 months ago

This seems good. Changing the length is the same as changing, say, the first element - more of a documentation issue if it does not crash. The only alternative is to copy, since there is not real way to do copy-on-write without incurring overhead. The onus is on the user to ensure that the unmanaged pointers remain valid, so no problems there.

Aidan63 commented 10 months ago

Yeah, thats fair. The only way to solve it without the extra copy would probably be to introduce an extra BytesData type instead of that being a typedef to an array. Since this is primarially going to be used with haxe bytes that would stop the user peeking at the underlying implementation and resizing it. But thats a lot of effort (and a breaking change) for something which the user should know is asking for trouble (haxe bytes don't expose any way in the formal API for resizing).

Aidan63 commented 9 months ago

Probably should have opened this as a draft to begin with. I'm not planning on adding anything else so if this looks good its probably good to merge.

HaxeFoundation / hxcpp

GC Pinning #1085