theseion / Fuel

Fuel, the Smalltalk object serializer
https://theseion.github.io/Fuel
MIT License
27 stars 13 forks source link

Cannot meaningfully serialize ephemerons #288

Closed theseion closed 10 months ago

theseion commented 10 months ago

@marcusdenker I've discovered (thanks to @jecisc) that my implementation for ephemeron serialization doesn't work. It works mostly but can fail due to a race condition. The issue is that ephemerons remove themselves from their container, which means that the object graph changes during serialization, and this can happen at any time. With weak objects we were able to work around this because the only change was that an object was replaced with nil. With ephemerons, however, the size of collections can change, which makes the problem much harder.

I currently see three options for ephemerons that are prone to finalization (ephemerons with strong references to the key are no problem, obviously):

  1. discard all ephemerons during serialization; I would probably go the easy way and replace them with nil, otherwise I would have to invent logic for all collections to ignore ephemerons, such that the serialized collections would be smaller and did not contain any ephemerons
  2. create strong pointers to all ephemeron keys in the analysis phase. There's still a potential race condition here, when the GC kicks in right when Fuel is processing the references of a collection.
  3. check for all collections whether they contain ephemerons and if they do, use a specialized cluster to serialize the collection; this cluster could then take care of ensuring that strong pointers exist for all elements and that the collection can no longer change

Option 3 is probably the best solution (the specialised cluster could potentially still dispatch to other, optimized clusters for special collections). But it will be quite some work to get it done. Which is why I favour option 1. It is relatively simple to implement and there's no big semantic difference to serialization of weak objects, except that weak objects could potentially be serialized where ephemerons would always be replaced with nil.

If I implement option 1, implementing option 3 later is still possible.

What's your opinion? Is there a use case that makes it not only desirable but necessary to have ephemerons in the materialized graph?

theseion commented 10 months ago

I just realised that option 1 suffers from the same issue as option 2: there's still a possibility that the GC kicks in between collecting the references of a collection during the analysis phase and the serialization phase, in which case the collection would shrink.

So currently, I only see option 3 (obviously, I could iterate over all ephemerons before starting and create strong pointers, but I don't think that's even worth discussing, mainly because it would apply to every single serialization run).

theseion commented 10 months ago

I found a way to implement support and opened a PR for Pharo (https://github.com/pharo-project/pharo/pull/15670). That should fix the random failures on the CI.

tinchodias commented 10 months ago

@theseion Hi Max: thanks for opening the issue to discuss about it. Why do ephemerons can't take option 1 as old weak objects? is it that the container can be serialized in an inconsistent state?

With ephemerons I've been kind of stubborn or too lazy to read the paper, and feel I don't grasp well them... I need to run some scripts in some Workspaces to get how they work in practice.

But my first paragraph comes from the folloeing assumption in my mind (may be wrong): in the case of weak objects, the any object that referenced (or "contained") a weak object had to check for nil always, since it might been GCed in any time without notification. In the case of ephemerons this is different as they have a "mouning" block that trigger some clean up action in the referencer/container, so fue fuel is not as easy as substituting them by nil. Do you know if I am wrong?

theseion commented 10 months ago

Yes, that's exactly right. An ephemeron has a reference to its container and will remove itself from its container during finalization. While this is really cool for users (no nil checks), it's difficult for Fuel because the container can change during serialization. One effect can be that, during serialization, you have ephemerons in the references of a dictionary but the dictionary is already empty, which will cause the root object of the materialization to be different because not all references will have been read from the graph.

tinchodias commented 10 months ago

What do you think about this? :

Fuel has always assumed that the object/graph the user passed as argument to serialize doesn't change. It's a kind of precondition of the API and design.

Let's forget of ephemerons for a second. There is no guaranty from Fuel that the serialization will be fine if another Process interrupts Fuel and changes the graph. Now, the case of ephemerons is kind of included in the previous sentence: Fuel serialization is interrupted and the graph is changed.

But okay this is different... ephemerons are a core feature, and can deserve an exceptional treatment.

I wonder if there is a way to have your option 2, but ensure no ephemeron mourning is done during execution. Maybe valueUninterruptably? with something like that the user could avoid the problem (not a Fuel issue).

theseion commented 10 months ago

I was thinking along similar lines, e.g., iterating over all ephemerons and creating strong pointers before starting serialization. But that would have to happen for every serialization, so that's not an option. I'm afraid to use valueUninterruptably, it depends on VM internals. If the VM decides to run the finalisation process, there's nothing we can do. I wouldn't want to base a core mechanic on that.

Object pinning might be a way, not sure if the VM supports that yet. But even that wouldn't work because at the time we visit the ephemeron, its container has already been analysed and might have changed in the mean time.

TBH, the need for the graph to be stable has cause me a lot of work in the past. It would be awesome if we could take a snapshot of the graph before starting serialization (which is pretty much what ImageSegment does, BTW). But I don't see a simple way to do that.

tinchodias commented 10 months ago

I didn't take into account pinObject, don't know about that. I'll try to talk with Pablo about this issue.

BTW, I'm curious about your approaches to tackle unstable graph serialization.

theseion commented 10 months ago

I don't really have "approaches" 😄. If you have points in time where you can guarantee a stable graph, then that's solvable. If not, then it boils down to things like forking the image, stopping all interfering processes in the forked image until the graph becomes stable, then serialize.

I'm curious what Pablo will say. The must have been thinking about these sorts of problems.