theseion / Fuel

Fuel, the Smalltalk object serializer
https://theseion.github.io/Fuel
MIT License
26 stars 12 forks source link

FullBlockClosures in Pharo #246

Closed seandenigris closed 2 years ago

seandenigris commented 4 years ago

Now that Pharo has FullBlockClosures, maybe blocks can be handled better in Fuel? It would be nice to minimize the frequent "can't materialize sorted collection" user problem.

At least, maybe:

FullBlockClosure class >> #cleanOuterContext
outerContext := nil.

Perhaps even better, there may not be a reason to keep the outerContext in even more circumstances than non-full-closures. Besides maybe ^-returns, the receiver is stored by the block, and non-block-local variables may be stored also in the CompiledBlock. A problem, however, seems to be that non-block temps may or may not be stored depending on whether there is an outerContext (i.e. they are stored if created in a workspace, but not if returned from a method).

tinchodias commented 4 years ago

Hello @seandenigris, I'm a bit out of sync with Fuel and the details of FullBlockClosure, but... Are you aware of #245? Do you propose something different?

seandenigris commented 4 years ago

I'd like this test to pass:

testCleanBlockClosureChangeDifferentBytecodes
    "Don't raise an error when materializing a clean closure whose (irrelevant) method has changed bytecodes."

    | aClass aClosure |
    aClass := self newClass
        duringTestCompileSilently: 'methodWithClosure  ^ [ 42 ]';
        yourself.
    aClosure := aClass new perform: #methodWithClosure.
    self serialize: aClosure.
    aClass duringTestCompileSilently: 'methodWithClosure  ^ 42'.
    self shouldnt: [ self materialized ] raise: FLMethodChanged

I merged #245 into Pharo-9.0.0+build.545.sha.8ec08511e3d55476e160795ffa9a582ab81dde59 (64 Bit) and it still fails.

I should say that I'm not 100% clear whether this should always be the behavior, or be an option. It makes sense for my use cases, but maybe someone might need to access the outerContext of a materialized block?'

Maybe @tesonep or @marianopeck can weigh in?

theseion commented 3 years ago

Hi @seandenigris

It's been far too long but 3.0.3 solves the 64 bit issues and adds support for FullBlockClosure (and Sista). Thanks for your efforts, they helped @tesonep to lay the ground work for this.

So thanks a lot for your help!

seandenigris commented 3 years ago

Thanks for the bump, @theseion!

The test case I proposed in an earlier comment (quoted below) still fails in P9 with Fuel 3.0.3. My question still stands (and it may be a result of my own ignorance!): of what value is it for Fuel to complain and fail to load a block due to changed method byte codes, in the case where the block in question doesn't seem to be dependent on those byte codes? The penalty is severe - data can be unloadable with no hook to save you...

I'd like this test to pass:

testCleanBlockClosureChangeDifferentBytecodes
  "Don't raise an error when materializing a clean closure whose (irrelevant) method has changed bytecodes."

  | aClass aClosure |
  aClass := self newClass
      duringTestCompileSilently: 'methodWithClosure  ^ [ 42 ]';
      yourself.
  aClosure := aClass new perform: #methodWithClosure.
  self serialize: aClosure.
  aClass duringTestCompileSilently: 'methodWithClosure  ^ 42'.
  self shouldnt: [ self materialized ] raise: FLMethodChanged
theseion commented 3 years ago

That's weird. I've looked at that test and it's green. But apparently, your example isn't identical. I'll take a look. Thanks!

theseion commented 3 years ago

So... there are a few things to talk about here. First: the test actually has the last check reversed and expects FLMethodChange to be raised, that's what I didn't see. Sorry for closing this issue prematurely.

Since 3.0.3 materialization errors are resumable, so you can (if you know what you are doing), ignore that error in your code.

Now to why FLMethodChanged is raised: the default behavior for methods and closures is to not serialize them. We try to determine whether a method or closure is installed and, if it is, serialize only information needed to perform a lookup (see FLGlobalCompiledMethodCluster for reference). Only when such an object is not installed, i.e., it is a #DoIt or a copy of a method or similar, do we serialize the byte codes of the object. In your example only a lookup is performed. Because of this behavior, it is impossible to correctly look up the closure in your test case and that must be treated as an error.

I'm assuming that you were going for a test case in which the closure is actually serialized. In that case you (accidentally :)) have found a bug.

testBlockClosureChangeSameBytecodes
    "Tolerate materializing a closure whose method has changed but not the bytecodes."

    | aClass aClosure materializedClosure |
    aClass := self newClass
        duringTestCompileSilently: 'methodWithClosure  ^ [ 41 ]';
        yourself.
    aClosure := aClass new perform: #methodWithClosure.
    self serialize: aClosure.
    aClass duringTestCompileSilently: 'methodWithClosure  ^ [ 42 ]'.
    self deny: aClosure method isInstalled.
    materializedClosure := self materialized.
    self assert: materializedClosure value equals: 42

This test does serialize the closure and does not raise FLMethodChanged, which is good (this is because the closure byte codes are no longer part of the method with Sista). However, with a slight modification the test does fail:

testBlockClosureChangeSameBytecodes
    "Tolerate materializing a closure whose method has changed but not the bytecodes."

    | aClass aClosure materializedClosure |
    aClass := self newClass
        duringTestCompileSilently: 'methodWithClosure  ^ [ 41 ]';
        yourself.
    aClosure := aClass new perform: #methodWithClosure.
    self serialize: aClosure.
    aClass duringTestCompileSilently: 'methodWithClosure  ^ 42 ' "<--------- modified".
    self deny: aClosure method isInstalled.
    materializedClosure := self materialized.
    self assert: materializedClosure value equals: 42 "<--------- primitive failure"

In this case it's ok to resume FLMethodChanged (the block has been materialized, not looked up) but the test will raise an exception when the closure is evaluated because of an oversight of mine: FullBlockClosure can use its internal CompiledBlock but it only does so if no outerContext is set. To fix this error I will have to discard the outerContext if the byte codes of the method have changed.

That leaves the question: should FLMethodChanged be raised in this last test? I think it should. While the code is perfectly usable the user should be notified that what the source shows may not be what will be executed.

I just realized that it may make sense to distinguish between a warning and an error in these cases. In your original example there is no way to evaluate the closure and I thank that should be an error. In my last example, however, I think it should just be a warning. I think I'll introduce an error with a name like FLLookupFailedError and make that non-resumable.

Does that answer you question? What do you think of my proposal?

@tinchodias any thoughts?

seandenigris commented 3 years ago

Thank you for the thoughtful reply, @theseion!

Forgive my ignorance, but I cannot identify the difference between my example and your two that lead to:

That said, yes, I think it makes perfect sense to throw an error when materialization is not possible, a warning when it's possible but with a caveat, and no notification if it's materialized exactly as one would expect.

theseion commented 3 years ago

You're right, I made a mistake when copying the code...

In my first example FLMethodChanged will not be raised because the method bytecodes haven't changed, so we can assume that the closure at the same literal is the same that we are materializing, and we don't care whether the closure is actually different. So that would actually be an additional case :).

The method I had meant to put there looks something like this (slightly modified to make my point):

testBlockClosureMaterializesClassVariablesCorrectly
    | class closure method |
    class := (self
        newSubclassOf: Object
        instanceVariableNames: ''
        classVariableNames: 'ClassVariableForTesting')
            duringTestCompileSilently: 'methodWithClosure  ^ [ ClassVariableForTesting ]';
            yourself.
    "Make sure we don't use global clusters here, which would simply perform a lookup"
    method := class methodNamed: #methodWithClosure.
    closure := method
        valueWithReceiver: class new
        arguments: #().
    class removeSelectorSilently: #methodWithClosure.
    self deny: method isInstalled.
    self deny: closure compiledBlock isInstalled.

    self serialize: closure.

    self
                 shouldnt: [ self materialized value ]
                 raise: Exception.
        self assert: self materialized value isNil

In this case the closure and method are being fully serialized and materialized so there is no lookup and I can easily evaluate the closure.

In your example, ignoring FLMethodChanged is fine and will work. Consider this change to your example however:

self shouldnt: [ self materialized value ] raise: FLMethodChanged

Now you are saying that you want to evaluate the closure. This is not possible and will raise PrimitiveFailed. Since the closure can't be looked up it can't be evaluated. So, in general, a lookup failure must be treated as an error, but, it may be fine to resume FLMethodChanged if you know that you won't evaluate such a closure.

I'm actually no longer sure that there is a case for throwing away outerContext, that was base on the wrong code I copied.

Did that help? It's really hard to put all of this together in a somewhat concise way...

seandenigris commented 3 years ago

Thanks for bearing with me. This area is outside my scope but I’m learning a bit :)

IIUC:

Scenario 1

testBlockClosureChangeSameBytecodes
    "Tolerate materializing a closure whose method has changed but not the bytecodes."

    | aClass aClosure materializedClosure |
    aClass := self newClass
        duringTestCompileSilently: 'methodWithClosure  ^ [ 41 ]';
        yourself.
    aClosure := aClass new perform: #methodWithClosure.
    self serialize: aClosure.
    aClass duringTestCompileSilently: 'methodWithClosure  ^ [ 42 ]'.
    self deny: aClosure method isInstalled.
    materializedClosure := self materialized.
    self assert: materializedClosure value equals: 42

Scenario 2

testBlockClosureChangeSameBytecodes
    "Tolerate materializing a closure whose method has changed but not the bytecodes."

    | aClass aClosure materializedClosure |
    aClass := self newClass
        duringTestCompileSilently: 'methodWithClosure  ^ [ 41 ]';
        yourself.
    aClosure := aClass new perform: #methodWithClosure.
    self serialize: aClosure.
    aClass duringTestCompileSilently: 'methodWithClosure  ^ 42 ' "<--------- modified".
    self deny: aClosure method isInstalled.
    materializedClosure := self materialized.
    self assert: materializedClosure value equals: 42 "<--------- primitive failure"

Scenario 3

testBlockClosureMaterializesClassVariablesCorrectly
    | class closure method |
    class := (self
        newSubclassOf: Object
        instanceVariableNames: ''
        classVariableNames: 'ClassVariableForTesting')
            duringTestCompileSilently: 'methodWithClosure  ^ [ ClassVariableForTesting ]';
            yourself.
    "Make sure we don't use global clusters here, which would simply perform a lookup"
    method := class methodNamed: #methodWithClosure.
    closure := method
        valueWithReceiver: class new
        arguments: #().
    class removeSelectorSilently: #methodWithClosure.
    self deny: method isInstalled.
    self deny: closure compiledBlock isInstalled.

    self serialize: closure.

    self
                 shouldnt: [ self materialized value ]
                 raise: Exception.
        self assert: self materialized value isNil

p.s. I wasn't sure what:

"Make sure we don't use global clusters here, which would simply perform a lookup"
    method := class methodNamed: #methodWithClosure.

was about because changing to

closure := class new methodWithClosure.

seemed to have the same behavior.

Analysis

If the above is correct, it seems that as long as the outerContext method is changed before serialization - whether removed, bytecodes changed, or literals changed, the block is successfully serialized and rematerialized, presumably because it is actually saved.

In the 3 cases where the outerContext is changed after serialization, I would expect:

  1. A resumable warning that the block behavior might have changed in an unintended way 2 & 3. A resumable error that the block is unusable. Resumable so that one can materialize and then try to fix the block, which is important because IIRC there is no hook to do this during the materialization.
theseion commented 3 years ago

Thanks for bearing with me. This area is outside my scope but I’m learning a bit :)

IIUC:

Scenario 1

testBlockClosureChangeSameBytecodes
  "Tolerate materializing a closure whose method has changed but not the bytecodes."

  | aClass aClosure materializedClosure |
  aClass := self newClass
      duringTestCompileSilently: 'methodWithClosure  ^ [ 41 ]';
      yourself.
  aClosure := aClass new perform: #methodWithClosure.
  self serialize: aClosure.
  aClass duringTestCompileSilently: 'methodWithClosure  ^ [ 42 ]'.
  self deny: aClosure method isInstalled.
  materializedClosure := self materialized.
  self assert: materializedClosure value equals: 42
  • Fuel materializes without warning or error because the method bytecodes haven’t changed, but the block will no longer necessarily do what it did before it was serialized. Namely, it used to return 41 but now returns 42.
  • However, if the test is changed so that the method is changed before serialization, the block is serialized exactly as it was i.e. still returns 41

Yes.

Scenario 2

testBlockClosureChangeSameBytecodes
  "Tolerate materializing a closure whose method has changed but not the bytecodes."

  | aClass aClosure materializedClosure |
  aClass := self newClass
      duringTestCompileSilently: 'methodWithClosure  ^ [ 41 ]';
      yourself.
  aClosure := aClass new perform: #methodWithClosure.
  self serialize: aClosure.
  aClass duringTestCompileSilently: 'methodWithClosure  ^ 42 ' "<--------- modified".
  self deny: aClosure method isInstalled.
  materializedClosure := self materialized.
  self assert: materializedClosure value equals: 42 "<--------- primitive failure"
  • Fuel materializes with a method-changed notification because the method bytecodes have changed. One can proceed, but the block is invalid because its bytecodes no longer exist (i.e. were not saved and are no longer in the method). #value causes a primitive error
  • However, if the method is changed before serialization, the block is serialized exactly as it was and no notification is made

Yes.

Scenario 3

testBlockClosureMaterializesClassVariablesCorrectly
  | class closure method |
  class := (self
      newSubclassOf: Object
      instanceVariableNames: ''
      classVariableNames: 'ClassVariableForTesting')
          duringTestCompileSilently: 'methodWithClosure  ^ [ ClassVariableForTesting ]';
          yourself.
  "Make sure we don't use global clusters here, which would simply perform a lookup"
  method := class methodNamed: #methodWithClosure.
  closure := method
      valueWithReceiver: class new
      arguments: #().
  class removeSelectorSilently: #methodWithClosure.
  self deny: method isInstalled.
  self deny: closure compiledBlock isInstalled.

  self serialize: closure.

  self
                 shouldnt: [ self materialized value ]
                 raise: Exception.
        self assert: self materialized value isNil
  • works fine because the method was removed before serialization, so the block serialized its bytecodes.
  • If the test is changed so that the method is removed after serialization, the change error is signaled, and if resumed, attempting to evaluate the block gives a cryptic error (“Instance of MorphicUIManager did not understand #bytecodesHash”)

Also yes :)

p.s. I wasn't sure what:

"Make sure we don't use global clusters here, which would simply perform a lookup"
  method := class methodNamed: #methodWithClosure.

was about because changing to

closure := class new methodWithClosure.

seemed to have the same behavior.

Correct. I just wanted a reference to the method, and since I had that, I used it. But I could have achieved the same thing by sending the message.

Analysis

If the above is correct, it seems that as long as the outerContext method is changed before serialization - whether removed, bytecodes changed, or literals changed, the block is successfully serialized and rematerialized, presumably because it is actually saved.

Yes.

In the 3 cases where the outerContext is changed after serialization, I would expect:

  1. A resumable warning that the block behavior might have changed in an unintended way

That makes sense. For this we need to compare the byte codes of the CompiledBlock.

2 & 3. A resumable error that the block is unusable. Resumable so that one can materialize and then try to fix the block, which is important because IIRC there is no hook to do this during the materialization.

That is what we have now (the error names aside).

I'm working on bringing the metalevel package into core right now, because I need class serialization for anonymous classes. With that change it will be possible to configure whether you want to serialize a method or not, overriding the default. That may also help with these kinds of issues.

Thanks a lot for your feedback!

seandenigris commented 3 years ago

it will be possible to configure whether you want to serialize a method or not, overriding the default. That may also help with these kinds of issues.

Ah! Cool. I contemplated an annotation like <fuelShouldSerialize: true> but wasn’t sure if I was getting too complicated or if that would fit with Fuel’s design. Configuration seems like the way to go because, as @svenvc has said:

BlockClosures are way too general and powerful to be serialised”

I would just add “...in a general way” because there are contexts where it is safe and appropriate.

While we’re on the subject of safety, regarding Block serialization in STON, Sven did also say:

using the compiler can be dangerous. It also adds a dependency on source code and the compiler.

I assume that’s not relevant here because we are saving bytecodes and not stringifying, right?

Thanks a lot for your feedback!

Of course. Thanks for your expertise and hard work :)

theseion commented 3 years ago

Ah! Cool. I contemplated an annotation like but wasn’t sure if I was getting too complicated or if that would fit with Fuel’s design.

That would be a possibility and you could use pragmas to collect the methods you want to serialize and pass them to the configuration. I wouldn't introduce it into Fuel though because it adds a very hard dependency. I can't start messing with those pragmas after introducing them.

BlockClosures are way too general and powerful to be serialised”

I would just add “...in a general way” because there are contexts where it is safe and appropriate.

I also agree. I have been thinking about serializing methods as source code though, which would remove a large part of the dependency on the internals.

using the compiler can be dangerous. It also adds a dependency on source code and the compiler.

I assume that’s not relevant here because we are saving bytecodes and not stringifying, right?

That's true. But it does at a dependency on the object layout and private API. Strings are way simpler to handle and make everything less complex (in my opinion, different compilers must be similar enough to compile things that are similar enough; e.g. Opal and V3Encoder are too different compilers but apart from special cases, the produce things that are similar enough).

svenvc commented 3 years ago

A first step would be to handle clean blocks (see #isClean and CleanBlockClosure). This would already cover things like [ :x | x + 1 ] or [ :x :y | x <= y ] a quite common case.

But the compiler risk remains: how are you going to filter out Smalltalk quitPrimitive or any other dangerous stuff ?

theseion commented 3 years ago

@svenvc are you referring to Fuel? We can serialize all methods and closures. There's a step, where we clean them too.

You're right of course, it's unsafe. But I'm not suggesting to use Fuel to share state with others To me it's more a development tool, or something used internally. And it's complicated enough that you need to know what you are doing, if you have to configure it :)

svenvc commented 3 years ago

Ah yes, of course, I forgot about that, of course Fuel can already do that. Thanks.

seandenigris commented 2 years ago

2 & 3. A resumable error that the block is unusable. Resumable so that one can materialize and then try to fix the block, which is important because IIRC there is no hook to do this during the materialization.

That is what we have now (the error names aside).

I'm working on bringing the metalevel package into core right now, because I need class serialization for anonymous classes. With that change it will be possible to configure whether you want to serialize a method or not, overriding the default. That may also help with these kinds of issues.

Hi @theseion, just checking in on this. Not super urgent, but has been a pesky long-term TODO dangling :)

theseion commented 2 years ago

I haven't addressed this specifically but it should already work (aside from proper exceptions). FLMethodChanged is now resumable, so you can wrap materialization in an exception handler. In addition, you could use FLSerializer>>fullySerializeMethod: to make sure you materialize the original version of the method.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will remain open but will probably not come into focus. If you still think this should receive some attention, leave a comment. Thank you for your contributions.

theseion commented 2 years ago

bump

theseion commented 2 years ago

bump