beanshell / beanshell

Beanshell scripting language
Apache License 2.0
815 stars 183 forks source link

Fix for #655 and #659 #672

Closed opeongo closed 1 year ago

opeongo commented 1 year ago

Here is a patch that works for me for fixing #655 and also #659.

There isn't a lot in this patch, really the significant bits are only two relatively small changes. That plus wrapping try-finally around a few method calls, reformatting the code to not make my eyes bleed (well, you might not like my style any better), fixing a data race issue, and adjusting the tests to the newly expected results.

I expect that these changes will be controversial and probably not acceptable, but I am putting this out here to show you what is working for me. If leads to a productive discussion that's great. If you don't like this approach that's ok too, at least I have shown you something that makes a couple of problems go away.

Fixing scripted object/variable confusion (#659)

The first change I have already discussed and that is in the Name class to sure that invokeMethod does not set overrideChild flag to true. As I discussed elsewhere my analysis leads me to believe (I'm hedging here because I am open to be proven wrong) that method invocations always require that a new namespace be allocated with the declaring namespace as the parent, and this gets swapped on to the top of the callstack. The body of the scripted object will be run using this new namespace. When the methods returns this to the caller it will be backed by this fresh namespace. For that reason the overrideChild flag must always be false. I do not think that this is too controversial.

This change will alter the name resolution semantics for closures back to what they were in BeanShell 2.*. These semantics got changed with de15ce30c. In previous beanshell versions scripted objects resolved names from with the scripted object first and then through the declaring namespace parent. The scope of name resolution is then fixed for all all objects created by the method. de15ce30c changed this so that name resolution was via the top of current call stack. There are two problems here:

  1. since names are set in the namespace on the top of the callstack they are not isolated. One scripted object will clobber the values of the previous object.
  2. since the namespace at the top of the current callstack is used for name and value resolution this means that the scripted objects will potentially lookup different variables and values when called from contexts.

There is nothing illegal about the second point, but it is a significant change from beanshell 2.* and I think it is worth a discussion before breaking backward compatibility. My opinion is that a) I liked the original way much better because it encapsulates values better; and b) I don't like breaking compatibility. This new way opens up too many possibilities for variables to be overridden in unexpected and surprising ways. It's like leaving a loaded gun on the table. If you need to use different values in different contexts then pass them in as parameters, easy-peasy. Don't just automagically absorb them from enclosing namespaces, this new way is too surprising and difficult to reason about. It allows third party objects to change how your objects work by setting local variables that will override the closures. If beanshell were a popular scripting language this would be seen as a security hole not a feature.

So this change is basically a one-liner, plus changing the test cases to reflect that variable resolution is via the declaring namespace, not the top of the current callstack

Disappearing variables (#655)

This problem is that in a long running data processing scripts variables would mysteriously become undefined, when for they previous x00,000 iterations they had worked correctly (read more about this in #655). My analysis of the problem is described below, and the solution I am using is basically a three-liner. My solution works for me, and I think my analysis is correct but I am going to hedge here again and say I am open to being proven wrong.

Class BSHBlock is using the ReferenceCache to keep hold of and recycle NameSpace objects. The current top namespace is the key that is used to access the subordinate namespace's in the cache. Instead of allocating a new NameSpace object as required, and then allowing it to be disposed of by the GC when it isn't used anymore, the ReferenceCache will create a new NameSpace when there is a miss in the cache, and afterwards will recycle the previously initialized object. Every time the NameSpace object is finished being used in BSHBlock some of the internal state of the NameSpace is reinitialized. The cache uses WeakReferences so cache entries can be reaped by the GC, but presumably the hot ones will stay around to a while. This method depends on GC performance patterns that have no guarantees.

I suppose the ReferenceCache is being used to improve upon the allocation/gc costs of using new every time. Do I understand this right?

If I am correct in how I understand this, then the ReferenceCache will return the same subordinate namespace object for a given namespace. If three scripted object are manufactured within the same block, would the cache not return the same subordinate namespace for use in each of them? This would be OK if each block execution were independent, but scripted objects might return this, and if recycling namespaces with the cache then wouldn't all three this object be backed by the same NameSpace object? Please tell me I am wrong about this. Anyway the empirical evidence is that this change made the disappearing variable problem go away.

As I said, my fix is a simple three-liner. Just instantiate a new NameSpace to use instead of getting a recycled one from the cache.

Other changes

  1. Make the processing of enum constants thread safe. Previously a static List was used that potentially could have collisions from multiple threads at the same time.
  2. Format and and add comments to BSHBlock. I wanted to make sure that I understood what was going on, so I spent a bit of time on staring at the code and my OCD took over.
  3. Add a few try-finally wrappers to make sure that added namespaces are removed off the stack if an exception happens.
codecov[bot] commented 1 year ago

Codecov Report

Merging #672 (f2d61b1) into master (f1659c1) will decrease coverage by 0.00%. The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master     #672      +/-   ##
============================================
- Coverage     71.11%   71.10%   -0.01%     
+ Complexity     2773     2771       -2     
============================================
  Files           106      106              
  Lines          9069     9060       -9     
  Branches       1778     1765      -13     
============================================
- Hits           6449     6442       -7     
- Misses         2205     2206       +1     
+ Partials        415      412       -3     
Flag Coverage Δ
unittests 71.10% <100.00%> (-0.01%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/main/java/bsh/BSHAllocationExpression.java 78.07% <100.00%> (ø)
src/main/java/bsh/BSHBlock.java 100.00% <100.00%> (ø)
src/main/java/bsh/BshMethod.java 72.98% <100.00%> (+0.31%) :arrow_up:
src/main/java/bsh/Name.java 86.52% <100.00%> (+0.53%) :arrow_up:
src/main/java/bsh/BlockNameSpace.java 68.18% <0.00%> (-18.19%) :arrow_down:
src/main/java/bsh/BSHTryStatement.java 92.00% <0.00%> (-2.67%) :arrow_down:
src/main/java/bsh/Interpreter.java 77.22% <0.00%> (-1.31%) :arrow_down:
src/main/java/bsh/BSHType.java 80.59% <0.00%> (-1.10%) :arrow_down:
src/main/java/bsh/NameSpace.java 85.49% <0.00%> (-0.87%) :arrow_down:
...n/java/bsh/classpath/DiscreteFilesClassLoader.java 86.66% <0.00%> (-0.84%) :arrow_down:
... and 7 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

nickl- commented 1 year ago

Don't just automagically absorb them from enclosing namespaces, this new way is too surprising and difficult to reason about. It allows third party objects to change how your objects work by setting local variables that will override the closures. If beanshell were a popular scripting language this would be seen as a security hole not a feature.

Not following here, perhaps a code example?

nickl- commented 1 year ago

So this change is basically a one-liner, plus changing the test cases to reflect that variable resolution is via the declaring namespace, not the top of the current callstack

We can use Assumes or similar to mark known problem tests to be skipped instead of breaking the test results to fit the solution.

nickl- commented 1 year ago

I can completely agree with scripted_object.bsh

test1(String c) {
   return this;
}

container() {
   ylds = new ArrayList();
   return this;
}

Object c = null;
{
   {
      c = container();
   }
   c.ylds.add(test1("hello"));
}
assert("hello".equals(c.ylds.get(0).c));
Object d = null;
{
   {
      d = container();
   }
   d.ylds.add(test1("goodbye"));
}
assert(c.ylds.size() == 1);
assert("hello".equals(c.ylds.get(0).c));
assert(d.ylds.size() == 1);
assert("goodbye".equals(d.ylds.get(0).c));

But will this also still be true?

test1(String c) {
   return this;
}

container() {
   ylds = new ArrayList();
   return this;
}

Object c = null;
{
   {
      Object c = container();
      c.ylds.add(test1("hello"));
      assert("hello".equals(c.ylds.get(0).c));
   }
   assert(c == null);
}

Otherwise something is seriously broken.

nickl- commented 1 year ago

So I am trying to sleep but all I manage to do is dream about namespaces...all over again. =)

I remember why we needed the cache.

Try this (10,000 - 10,000,000):

bsh % _start = System.nanoTime(); for (i=0;i++ < 10000;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
--> $4 = 87744210186405L :long
 3ms
--> void
bsh % _start = System.nanoTime(); for (i=0;i++ < 100000;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
--> $5 = 87762536468325L :long
 14ms
--> void
bsh % _start = System.nanoTime(); for (i=0;i++ < 1000000;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
--> $6 = 87768869490347L :long
 164ms
--> void
bsh % _start = System.nanoTime(); for (i=0;i++ < 10000000;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
--> $7 = 87776434451242L :long
 1421ms
--> void

Still nothing to brag about, this is more like it:

jshell> long  _start = System.nanoTime(); for (long i=0;i++ < 10000000L;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
_start ==> 88223537891332
328ms

jshell> long  _start = System.nanoTime(); for (long i=0;i++ < 100000000L;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
_start ==> 88229813804431
328ms

jshell> long  _start = System.nanoTime(); for (long i=0;i++ < 1000000000L;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
_start ==> 88235785781660
540ms

jshell> long  _start = System.nanoTime(); for (long i=0;i++ < 10000000000L;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
_start ==> 88242406316349
2515ms
nickl- commented 1 year ago

So the cache serves a purpose, and there should be no reason that we can't recreate the namespace (especially if we are doing it everytime) when the cache has expired.

The whole idea behind reference cache was that it should be able to recreate the item when requested and found to have expired. It works for the class cache, I have tested it. I have a suspicion the blocknamespace cache was done in a rush and not implemented properly. There is no reason it shouldn't work and the effort is worth it considering the benefits.

nickl- commented 1 year ago

Try this (10,000 - 10,000,000):

Oh yes and this is with 128Gibs of RAM, wait till GC starts interfering... but you'll agree.

opeongo commented 1 year ago

Don't just automagically absorb them from enclosing namespaces, this new way is too surprising and difficult to reason about. It allows third party objects to change how your objects work by setting local variables that will override the closures. If beanshell were a popular scripting language this would be seen as a security hole not a feature.

Not following here, perhaps a code example?

Sure, let me try.

Here is a java example:

public class ClosureExample {

    final int x = 4;

    class Internal {
        void doit() {
            System.out.println("doing x="+x);
        }
    }

    Internal getInternal() {
        return new Internal();
    }

    public static void main(String[] args) {

        ClosureExample cl = new ClosureExample();

        Internal doer = cl.getInternal();
        int x = 5;
        doer.doit();
        System.err.println("The value that doer sees for x="+cl.x);
    }
}

The printed results as expected are always 4.

Now for a beanshell example using a scripted object:

{
    {
        final int x = 4;
        internal() {
            doit() {
                System.out.println("doing x="+x);
            }
            return this;
        }
        global.doer = internal();
    }
}

x = 5;
doer.doit();
System.err.println("The value that doer see for x="+doer.x);

When using beanshell 3.0 the printed results from the beanshell example are always 5. In Beanshell 2.* the printed results are 4.

Now I admit it is a bit of an apples to oranges comparison because the scripted object approach is just so relaxed and drops all of the ceremony. I could spend more time and construct a more persuasive example but hopefully you can see the issue here.

And I agree that there isn't a legal opinion at stake here. There are multiple options. The variable lookups for scripted objects can be through the declaring namespace, like java does it, and like how beanshell 2.* does it. Or they could be through the namespace at the top of the call stack, like how you have changed it for beanshell 3.0.

But regardless of the opinion on how it should be done this is a change from what was working in the previous beanshell version. And it is different from how java does it, and that is surprising. So I think that this change is going to catch people unaware and will perhaps cause a bit of pain for those who were relying on the original behaviour.

So I have given you my analysis and my reasons for why changing this is a bad idea. You are the one who has changed it, so now it is your turn to give me your motivation to change scripted object variable lookup semantics. What inspired you to make this change? Did anyone ask for this change? What were the factors that you considered when you decided to make this change? Why do you feel this change is a good idea? Have you received any objections to this change, and what has been your response to those objections?

opeongo commented 1 year ago

And I should mention another reason that I don't like the new lookup semantics, which I have already mentioned elsewhere several times. Having variable lookups through the current namespace makes it hard to reason about how the program is going to work, because you have to be aware of every context the the object will be run in. Accidentally reusing the same variable names somewhere up the the namespace chain can cause subtle and hard to find bugs. On the other hand, variable lookups through the declaring name space always work the same way every time, no surprises.

To be kind I guess I could extend my imagination to say that the benefit of looking up variables in the current names space is that you can change behaviour within the scripted object just by changing a local variable, so it saves keystrokes. But I see this as a pretty low value feature. You can already make these programmatic behaviour changes in a safer and more easily reasoned about way by having setters or using parameters on the scripted object methods. The kind of accidental mutability that is being exposed here is just a bad idea. It seems like the mantra these days is "mutability bad", and this change just throws it all open.

So in the balance: pros

cons

opeongo commented 1 year ago

In the above I have stated that you made these changes, but I did not mean that as an accusation. It is just that the git history records that you were the one who committed the changes. If you are not the author of the changes can you let me know what is going on? Perhaps that will help to make a bit more sense of this.

opeongo commented 1 year ago

So this change is basically a one-liner, plus changing the test cases to reflect that variable resolution is via the declaring namespace, not the top of the current callstack

We can use Assumes or similar to mark known problem tests to be skipped instead of breaking the test results to fit the solution.

I did not break the the test results to fit my solution. I posted this PR to help explain the situation as I see it, and for discussion purposes. I gave my reasoning for why I think the results should be as they are, and I stand by that.

opeongo commented 1 year ago

I remember why we needed the cache.

Try this (10,000 - 10,000,000):

jshell> long  _start = System.nanoTime(); for (long i=0;i++ < 10000000000L;) {} print(((System.nanoTime() - _start) / 1000000)+"ms");
_start ==> 88242406316349
2515ms

Ok, so this is about a specific micro benchmark optimization. And that is a pretty good speed up so worth taking a look at. But the particular optimization you have chosen is not applicable everywhere.

What your cache is doing is avoiding the alloc/gc costs and recycling an existing namespace. And in some cases this works (like above), and some cases it fails (like my example). What is different is that in my case when using scripted objects the this reference leaks out and is used elsewhere, so in that case a cached object is not appropriate. So if you are going to use this caching system then you will need to do an escape analysis to determine when a cached namespace object is safe to use and when it is essential that a fresh namespace object is allocated instead.

Secondly, I think maybe (hedging because I could be proven wrong as always) that organizing the cache as a map using weak references to the caller namespace as the keys is unnecessary and is more trouble than worth. The namespaces in the cache are all reinitialized to empty so what does it matter to to try to match them up to their callers? Just keep a collection of recyclabled namespace objects in a synchronized stack. And the stack contains hard references. When escape analysis shows that a recycled namespace can be used then pop one off the stack, use it, and push it back when done with it. Maybe if the stack has gotten too big (say mare than 50 recycled namespace objects?), like after a deeply nested excursions on multiple threads, then just let go of the returned namespace object rather than pushing it. The stack size upper threshold is something that might be self-tunable (keep track of max stack depth within the past few seconds to see what the pattern is. All sorts of things to try). But anyway, using a stack instead of a map would still be sufficient for you to get the performance that you were seeing on your micro benchmark. And it would avoid all of the trouble and complexity of the ReferenceCache.

Anyway, just some thoughts on how to keep the performance you are getting from namespace recycling and also not break scripted objects etc.

nickl- commented 1 year ago

But regardless of the opinion on how it should be done this is a change from what was working in the previous beanshell version. And it is different from how java does it, and that is surprising. So I think that this change is going to catch people unaware and will perhaps cause a bit of pain for those who were relying on the original behaviour.

I really don't care about how it was working in a beta, only how it should be working and how JAVA works. We have identified #659 there is something wrong with methods returning this which needs to me fixed. Already agreed upon so there is no need to keep motivating the fact, it only confuses the issue.

It is broken and will be fixed, finished and done.

nickl- commented 1 year ago

I did not break the the test results to fit my solution. I posted this PR to help explain the situation as I see it, and for discussion purposes. I gave my reasoning for why I think the results should be as they are, and I stand by that.

So you say that the inherited method should return the parent value and not the value at that scope, is this how JAVA works?

As changed in this PR:

https://github.com/beanshell/beanshell/blob/a288337a9ae6ba8329420ca9117fb791f120889f/src/test/java/bsh/Namespace_Chaining_Test.java#L80-L97

More particularly that the current tests are wrong:

https://github.com/beanshell/beanshell/blob/325c0d8887df906439b09735b84559cc13168972/src/test/java/bsh/Namespace_Chaining_Test.java#L70-L90

We'll get back to this...

nickl- commented 1 year ago

Ok, so this is about a specific micro benchmark optimization. And that is a pretty good speed up so worth taking a look at. But the particular optimization you have chosen is not applicable everywhere.

I disagree, instead this is quite a generous comparison with an empty blockspace, which we'll hardly ever have in real life, anything more will only compound the situation.

Yes there is work to be done on the cache and we can look at alternative approaches but the performance benefit outweighs the effort needed to resolve this.

nickl- commented 1 year ago

Otherwise something is seriously broken.

I see you did not comment on any of these tests so I take it they are broken now, can we agree?

nickl- commented 1 year ago

We'll get back to this...

This is how JAVA works, which leads me to think the problem is elsewhere, in the way namespace handles fields.

jshell> class A {
...>     int bar = 42;
...>     int foo() {
...>         return bar;
...>     }
...> }
|  replaced class A

jshell> class B extends A {
...>     int bar = 4711;
...> }
|  replaced class B

jshell> class C extends B {
...>     int bar = 5;
...> }
|  created class C

jshell> (new C()).foo();
$66 ==> 42

jshell> ((B)new C()).foo();
$67 ==> 42

jshell> ((A)new C()).foo();
$68 ==> 42

Additional test to pass:


jshell> class B extends A {
...>     int bar = 4711;
...>     int foo() {
...>         return bar;
...>     }
...> }
|  replaced class B

jshell> (new C()).foo();
$70 ==> 4711

jshell> ((B)new C()).foo();
$71 ==> 4711

jshell> ((A)new C()).foo();
$72 ==> 4711
opeongo commented 1 year ago

Otherwise something is seriously broken.

I see you did not comment on any of these tests so I take it they are broken now, can we agree?

Yes of course this works, it was never in question. I have updated the test case to include this case.

You could have easily checked this yourself by pulling this branch, it would only have taken a few minutes to see.

opeongo commented 1 year ago

So you say that the inherited method should return the parent value and not the value at that scope, is this how JAVA works?

I don't think I ever said that, but if I did I spoke wrongly. I don't disagree with anything that you are saying about inheritance, But you are talking about something quite different that what I am talking about. The issue I am talking about has nothing to do with inheritance. In fact I don't recall ever mentioning inheritance. I am talking about closures. Maybe the some of the confusion was because my previous example wasn't a good one. I will try again with a another example.

What I have been trying to say is that scripted objects are closures. You can read about them here. When you look up variables in a closure you start in the namespace of the current block (of course) and then if a variable is not resolved you then continue the search in the declaring namespace, just like java does.

The problem with 3.0-master is that in a scripted object if the variable is not found in the current block then the search continues in the namespace at the top of the callstack, in other words the local variables at the runtime invocation location and so on up through the parent namespace chain. This is not the same as what happens in Java.

I have reworked my example to show this a bit clearer. First the java part:

public class ClosureExample {

    public interface Internal {
        void doit();
    }

    public static void main(String[] args) {

        Internal doer;

        {
            int x = 4;
            doer = new Internal() {
                    public void doit() {
                        System.out.println("doing x="+x);
                    }
                };
        }
        doer.doit();
        int x = 5;
        doer.doit();
    }
}

and the result of running this is:

beanshell>java ClosureExample
doing x=4
doing x=4

So obviously the locally declared variable x that has a value of 5 is not in the scope of the Internal object, so it has no effect. The x variable with a value of 4 is in the declaring namespace where the object was instantiated is the value that is captured by closure contained in doer.

And have a look at the equivalent scripted object example.

doer = null;
 {
     int x = 4;
     internal() {
         doit() {
             System.out.println("doing x="+x);  // Line 6
         }
         return this;
     }
     doer = internal();
 }

doer.doit();  // Line 14
x = 5;
doer.doit();

This BeanShell scripted object is mostly identical in function to the Java case. The internal object is instantiated in a declaring namespace that contains a variable named x having a value of 4.

When this is run under beanshell 2.*, or 3.0 plus the patches in this PR the result is like java:

beanshell>java -cp target\bsh-3.0.0-SNAPSHOT.jar bsh.Interpreter ClosureExample.bsh
doing x=4
doing x=4

This is because the namespace for this closure starts with the local variables, and then continues to the declaring namespace. The variable named x was not found in the namespace for the internal object, so the search continues in the declaring namespace, and the value of 4 is found.

When this script is run using the code currently in 3.0 master the result is:

Evaluation Error: bsh.EvalError: Sourced file: beanshell\ClosureExample.bsh : illegal use of undefined variable, class, or 'void' literal : at Line: 6 : in file: beanshell\ClosureExample.bsh : ) ;

Called from method: doit : at Line: 14 : in file: beanshell\ClosureExample.bsh : doer .doit ( )

This happens because the doit method fails to find a variable named x starting in the local variables to the method. So it continues to the namespace at the top of the callstack which is the place where doit was called from. There is no variable named x in that context yet, so there is an error.

If the statement at Line 14 is commented out then at Line 15 a variable named x is set to a value of 5 in the local namespace. Rerunning the script yields the following result:

doing x=5

As you can see, this value is coming from the local variables at the invocation site. This is not what java does.

I don't know how else to explain this issue. Other to say again that it is not about inheritance. It is about the namespace search chain for resolving variables in closures.

nickl- commented 1 year ago

Still working on this as time allows...

nickl- commented 1 year ago

I don't know how else to explain this issue. Other to say again that it is not about inheritance. It is about the namespace search chain for resolving variables in closures

I don't think it has anything to do with lexical scope, unless it also means something else. The lexicon or code defines the call stack which is visible from where bindings can be found. I have difficulty trying to understand you as lexical scope remains the same, with or without late binding (or whatever we are calling it)

Perhaps what you are trying to distinguish, and I am grasping at straws so forgive me if I am wrong, is the difference between static and dynamic scope. Java is compiled (as with all compiled languages) and requires that everything is known at the point of declaration, once compiled it is hard coded and in place already attached to the bindings it has discovered, long before execution.

Scripting languages are dynamic they only get parsed at the point of declaration for syntactic correctness, no attempt is made to find bindings and in fact you can declare those afterwards and the block will not be any wiser. In beanshell we strip the page of methods and their code blocks ends up in a map where we can look them up by name.

Only when evaluated do we start looking for bindings, and we use the lexical scope through the call stack and namespace hierarchy (which are lexically defined and populated) to achieve this. We cannot force the scope statically anymore, this will break lexically because the declaration scope might not be visible anymore, nor should we want to because this is a benefit of being a dynamic language only interpreted when executed.

I know you are trying to break it to prove that there is a problem, but also consider this logically. Why would you code something like this for example:

doer.doit(); 
x = 5;
doer.doit();

If you want x = 5 to be ignored? It makes no sense, or am I missing something? Instead I would argue that we should make every effort to incorporate that into the dynamic execution, and if you didn't want x = 5 because you are relying on another reference in the lexical scope then you won't be polluting the scope and risk it being obfuscated, by giving x = 5 a more descriptive and unique name to explain its intent and purpose.

So to summarize: JAVA doesn't use closures but classes, and classes gives us a lexical boundary and practical grouping so we can make every effort to dynamically execute the code as JAVA does statically. However JAVA doesn't have closures so it cannot provide an appropriately defined stipulation for us to follow. In this regard we have long left the station or we will soon be pressed to justify the type system to name one, which has a more defined conformity to abide by in JAVA, but we won't. Because it is about the best of both worlds, productivity and simplicity, being allowed to bend the rules when you want to and enabled when you simply have to because there is no other choice.

I am not unreasonable and if something breaks that shouldn't then we will fix it, as we do all the time. But if a developer wants to only use c for all their variable names then there is going to be limit to how much we care.