beanshell / beanshell

Beanshell scripting language
Apache License 2.0
815 stars 183 forks source link

Namespaces can't be properly chained #676

Closed nickl- closed 1 year ago

nickl- commented 1 year ago

Posted on Aug 21, 2012 by Happy Ox

It turns out that BeanShell2 doesn't allow me to create arbitrary namespace chains. I'm using multiple Interpreter instances, but I can't for instance construct namespace chain with 3 nodes. It seems I'm only able to create a local/function and a global namespace and link them together, but I need to create three chained namespaces: A global namespace, a "per image" namespace and the "function" namespace. The API seems to allow arbitrarily chained namespaces (I can write the code I need) but when investigating the resulting namespace chain, it has been rechained to two levels only. Are you able to fix this? It would open up the ability to significantly accelerate album builds in jAlbum as I will then be able to execute multiple scripts simultaneously.

I've been using the most current version of BeanShell2 on Mac OSX (Java 1.6)

Comment 1 Posted on Aug 31, 2012 by Happy Ox

Can I have feedback on this please. Comment 2 Posted on Aug 31, 2012 by Grumpy Monkey

Could you provide your test/example code? Comment 3 Posted on Sep 1, 2012 by Happy Ox

Sure. The attached beanshell script will demonstrate the error. It first sets up a convenience function that creates "child" interpreters that have a namespace who's parent is the namespace of the passed in argument.

The example then sets up two interpreters, a "root" and a "child" interpreter.

Then a global "bar=42;" is evaluated with the root interpreter and a local "bar=4711;" is evaluated with the child interpreter

Then, a foo() function is declared within using the root interpreter. This function simply prints bar and also prints the name of the current and parent namespace

Finally, foo() is called both via the root interpreter and via the child interpreter. When called via the root interpreter, foo() correctly prints 42, but when called via the child interpreter it should print 4711, but actually prints 42. The following debug printout explains the mystery, the function's namespace has been parented to the root instead of being parented with the namespace where it was called (child's namespace).

I simply wish for the namespaces of functions to be parented with the namespace where they are called from instead of being hard coded to be parented to the root's namespace.

With this change, I will be able to have multithreaded script execution and speed up our jAlbum application considerably, where each thread has its own interpreter and local scope for image specific variables and the ability to call functions, originally declared via the root interpreter, but with the ability to refer to image specific variables in the local scope. I hope what I'm saying makes sense. In my sample code we should end up with a 3 level deep namespace chain: The root namespace, the image specific namespace and the method/function namespace. Compare to the "method", "class" and "global" namespace levels when doing OO in BeanShell - it's not a unique concept at all and BeanShell's namespaces allow for arbitrary depths already so I think this just requires a minimal change to the code that sets up the namespace for functions

Regards /David Attachments beanshell method parent namespace error reproduction.bsh.txt 1.14KB

Comment 4 Posted on Sep 7, 2012 by Happy Ox

Any comments on this? Comment 5 Posted on Sep 9, 2012 by Grumpy Monkey

I'm not soure your expectation is correct. Do you see a regression from original beanshell? If so i didn't spot the difference..

If it's not a regression: Could you explain your use-case in more detail? Comment 6 Posted on Sep 10, 2012 by Happy Ox

No, BeanShell2 behaves just like classic BeanShell. I just wish to see a change in how the parent namespace of methods is picked so it uses the namespace of the caller instead of the namespace where the method is declared. That would make BeanShell a truly dynamic late-binding interpreted language. Now it's a mix of late and early binding (namespaces are early-bound while variables are late-bound) Comment 7 Posted on Sep 10, 2012 by Grumpy Monkey

I can't spot the "mix". The method you define sees the variables of the namespace it's defined in (and the variables of parent namespaces of course). If you want global defined methods operating on locally defined variables you should pass these variables as arguments.

Perhaps you could find an alternative solution to your problem when looking at the source of provided commands which fiddle with namespaces (i.e. "bind.bsh", "eval.bsh", or "extend.bsh")?

Anyway, altering the way binding works would break existing beanshell code. So I change the status to 'invalid' for now. If there are any new arguments I'm happy to change this, so don't be discouraged adding further comments. Comment 8 Posted on Sep 10, 2012 by Happy Ox

We have 10 years of existing 3:rd party BeanShell code in jAlbum skins that are made by our open source community. Many of these skins refer to image specific variables within functions (i.e. not passed as arguments). These skins would break if we try to implement multi threaded script processing using multiple interpreter objects due to the current design of BeanShell (which is the classic design for a statically bound language). It works today with one interpreter as we simulate image specific namespaces by adding and removing image specific variables to the one single global namespace as we iterate through images in one thread. Apart from being a rather ugly approach, it also limits us to using one interpreter object as it would create a mess to add and remove variables from multiple threads in one single interpreter's namespace.

If we started with a clean slate, the design could be made different. One could then declare support functions that need to refer to image specific variables in a certain way, but we don't have the luxury to rewrite skins produced over a 10 year span just to enjoy multi threaded script processing. However, with this proposed change to BeanShell that would make it fully dynamic, we can keep compatibility with legacy code AND enjoy multi threaded script processing.

(Passing local variables as parameters can be a hazzle as these support functions usually refer to A LOT of image specific variables - usually camera specific EXIF data)

Originally posted on code.google.com issue 74

nickl- commented 1 year ago

History of unit test:

Unit test originally added at code.google.com as Namespace_chaining.java

Later renamed to Namespace_Chaining_Test.java at https://github.com/pejobo/beanshell2/commit/9e6ed9e

Refactored using asserts instead of print output and marked as a known issue.

Finally resolved with commit message Fix override child namespace for namespace chaining adding additional asserts.

The fix introduced an issue with properly scoping This returned from function. see #659

Fixes so far has only attempted to revert the change resulting in breaking this use case and unit tests, resolved by munging the test parameters to fit the re-implementation.

I believe we can fix #659 without having to break this use case and that the impact, albeit unfortunate and unforeseen, does not inherently render these use cases incompatible. We can have our cake and eat it...

nickl- commented 1 year ago

Whether this is a valid use case and if it should be supported by BeanShell.

From original maintainer:

The method you define sees the variables of the namespace it's defined in (and the variables of parent namespaces of course).

Attempting to find a JAVA example I originally likened this to inheritance and hiding of fields:

jshell> class A {
...>     int bar = 42;
...>     int foo() {
...>         return bar;
...>     }
...> }
|  created class A

jshell> class B extends A {
...>     int bar = 4711;
...> }
|  created class B

jshell> class C extends B {
...>     int bar = 5;
...> }
|  created class C

jshell> (new C()).bar;
$78 ==> 5

jshell> ((B)new C()).bar;
$79 ==> 4711

jshell> ((A)new C()).bar;
$80 ==> 42

jshell> (new C()).foo();
$66 ==> 42

jshell> ((B)new C()).foo();
$67 ==> 42

jshell> ((A)new C()).foo();
$68 ==> 42

This is strange behaviour to say the least and no wonder the JAVA tutorial on inheritance does not recommend this.

You can declare a field in the subclass with the same name as the one in the superclass, thus hiding it (not recommended).

The JAVA language specification does not entirely explain this behaviour.

If the class declares a field with a certain name, then the declaration of that field is said to hide any and all accessible declarations of fields with the same name in superclasses, and superinterfaces of the class.

Perhaps it is explained by the following statement from the same section in the specification:

A class inherits from its direct superclass and direct superinterfaces all the non-private fields of the superclass and superinterfaces that are both accessible to code in the class and not hidden by a declaration in the class.

Which doesn't quite explain how the following works:

jshell> class B extends A {
...>     int bar = 4711;
...>     int foo() {
...>         return bar;
...>     }
...> }
|  replaced class B

jshell> (new C()).bar;
$78 ==> 5

jshell> ((B)new C()).bar;
$79 ==> 4711

jshell> ((A)new C()).bar;
$80 ==> 42

jshell> (new C()).foo();
$70 ==> 4711

jshell> ((B)new C()).foo();
$71 ==> 4711

jshell> ((A)new C()).foo();
$72 ==> 4711

Either way, the hidden fields JAVA compatibility implementation remains an open bug in BeanShell #358

nickl- commented 1 year ago

Whether this is a valid use case and if it should be supported by BeanShell. (continued)

Attempting to find a JAVA example I originally likened this to inheritance and hiding of fields:

But this is not class inheritance so the comparison is not equal and there is really no comparison in java:

If the following is true:

int bar = 42;
{
   {
      int bar = 4711;
      assert(bar == 4711);
   }
   assert(bar == 42);
}

Then it begs to reason that the same would be applicable to:

int bar = 42;
int getBar() { return bar; }
{
   {
      int bar = 4711;
      assert(getBar() == 4711);
   }
   assert(getBar() == 42);
}

From original maintainer:

Anyway, altering the way binding works would break existing beanshell code.

This has a very specific intention, re-declaring a variable not just assigning a value to an existing variable, and in the same way that the first example already works, I can see no reason why it should not work the same for methods. If it breaks any existing code then what exactly was the intention for re-declaring a variable with the same name in the first place? I fail to see the use case...

davidekholm commented 1 year ago

We'd still appreciate if namespace chaining was implemented. It is a sort of late binding approach where the variables visible in the caller's scope (not local variables, but not top-level global variables either) are also made visible to the called method.

opeongo commented 1 year ago

Your experiments showed that Java resolves variables to the ones that were found when the method was compiled:

jshell> (new C()).foo();
$66 ==> 42

jshell> ((B)new C()).foo();
$67 ==> 42

jshell> ((A)new C()).foo();
$68 ==> 42

and

jshell> (new C()).foo();
$70 ==> 4711

jshell> ((B)new C()).foo();
$71 ==> 4711

jshell> ((A)new C()).foo();
$72 ==> 4711

This surprised me at first glance. I thought that bar() would return the local value of foo. But then after thinking about it some more it makes perfect sense for Java.

When B was compiled the variable in the bar in the B namespace was bound within the foo function. When C was compiled another foo was added and hid the previous definition. B did not change as a result of compiling C, it still kept it's binding. When you call foo() on C the binding of bar does not get reevaluated. This kind of rebinding would be expensive, and where would this rewiring take place? B doesn't know about C, and C cannot rewire B.

Secondly, casting will let you access different hidden fields, but it will not allow you to access overridden methods. The author of B has chosen to override bar(), and there is no way to get an end run around this. No matter how you cast the object, methods will be selected by starting with the specialized object and moving up the inheritance chain through supers until the method is found.

Both of these features seem like essential Java for not breaking encapsulation. So after some thought, yeah, expected behaviour.

opeongo commented 1 year ago

Fixes so far has only attempted to revert the change resulting in breaking this use case and unit tests, resolved by munging the test parameters to fit the re-implementation.

Please.

I cannot believe that you are stating this after you yourself just showed the evidence that changes that were made was to bring it in line with what Java does. If you want do do variable resolution this different way then fine, but don't call whatever you are creating Java compatible. And maybe pick a different name for the project than BeanShell.

Late binding is not Java. Why do you want to go there?

nickl- commented 1 year ago

Secondly, casting will let you access different hidden fields, but it will not allow you to access overridden methods. The author of B has chosen to override bar(), and there is no way to get an end run around this. No matter how you cast the object, methods will be selected by starting with the specialized object and moving up the inheritance chain through supers until the method is found.

Was following your explanation on the compiled class, which made perfect sense, then you lost me here at "there is no way".

Had a quick look at C# and it does very much the same but will respect you casting to a more specific implementation.

> public class A
. {
.     public int bar = 42;
.     public int foo()
.     {
.         return bar;
.     }
. }
> public class B : A
. {
.     public int bar = 4711;
.     public int foo()
.     {
.         return bar;
.     } 
. }
> public class C : B
. {
.     public int bar = 5;
. }
> (new C()).bar
5
> ((B)new C()).bar
4711
> ((A)new C()).bar
42
> (new C()).foo()
4711
> ((B)new C()).foo()
4711
> ((A)new C()).foo()
42

In any way, like it or not, hidden fields are still broken, and needs attention see #358

nickl- commented 1 year ago

We'd still appreciate if namespace chaining was implemented. It is a sort of late binding approach where the variables visible in the caller's scope (not local variables, but not top-level global variables either) are also made visible to the called method.

Thank you for your valuable feedback @davidekholm

It is not really late binding either but I am not going to split hairs, as long as we all understand what is to be expected. This is more a feature of dynamic scope, in that variables can be pushed onto the stack after the method's declaration. As a dynamic scripting language we have the ability to evaluate the available scope and make the most of it. Something we are powerless to in the static regimes.

nickl- commented 1 year ago

Late binding is not Java. Why do you want to go there?

Ah lets see, how do I love thee?...let me count the ways:

  1. This is for closures not for classes
  2. JAVA knows nothing about closures either so we have a little wiggle room
  3. There are existing unit tests passing
  4. Community requested feature
  5. Has tenure already with more than 10 years on the books
  6. It is already implemented for Preparsed Scripts
  7. Considered a feature of dynamic languages
  8. It is intuitive and what you expect it to produce
  9. Will not interfere with existing code, if anyone does write hidden variables not accessible by methods... I rest
  10. Not a new feature already working, given there are unforeseen side effects to iron out
  11. Working feature that needs to be removed, do we remove it for preparsed scripts too?
  12. Will break passing unit tests if removed - munging passing tests is diabolical
  13. It has an intentional purpose clearly understood in code by humans
  14. It is the opposite philosophy to "there is no way" yes there is a way
  15. It lets you accomplish what you need without getting in your way
  16. Has no drawbacks, will work the other way too, just don't redeclare variables
  17. Useful especially because JAVA can't do it
  18. Would be considered missing if not a BeanShell feature
  19. Because it is freaken awesome
  20. Because we can
  21. Why wouldn't we?
opeongo commented 1 year ago

Was following your explanation on the compiled class, which made perfect sense, then you lost me here at "there is no way".

My understanding for Java is that if a class overrides a method from a super class then there is no way that a user of an object of that class to directly access the overridden methods. Just like you have shown previously,

((A)new C()).bar()

will execute B.bar(). I meant that there is no way for a user of the java object new C() to access A.bar().

Of course, the author of class B can access the super class method. But not even the author of class C can access A.bar() directly.

Had a quick look at C# and it does very much the same but will respect you casting to a more specific implementation.

I don't know C# so I cannot really comment, other than to say this is surprising to me:

> (new C()).foo()
4711
> ((B)new C()).foo()
4711
> ((A)new C()).foo()
42

It appears that this is a different design choice than java. What if as a library author I have chosen to put some restrictions in B.foo() to guard against misuse. Can anyone just avoid the restrictions and call the base method? Seems insecure.

Can you imagine being able to bypass the delete() method on a B object by calling ((A)obj).delete()?

public class A {
  public boolean delete(File file) {
    return file.delete();
  }
}

public class B extends A {
  @Override
  public boolean delete(File file) {
    if (file.getPath().startsWith("/etc/"))
     return false;
    return super.delete(file);
  }
}

Maybe I could put access restrictions on A, but what if that came from a different library that I have no control over?

nickl- commented 1 year ago

Can you imagine being able to bypass the delete() method on a B object by calling ((A)obj).delete()?

It is not clear if C# lets you call the method on A or if it only exposes field bar on A. Will need more testing.

nickl- commented 1 year ago

I don't know C# so I cannot really comment, other than to say this is surprising to me:

Been doing a lot of C# the last 2 years, it has really grown up and has some very useful syntactical sugar. Several things I am missing in JAVA already which hasn't really been keeping up with the times. Some things are really past due already, like null coalescing for example, even PHP has it implemented.

if (null != bar) { bar.isNotNull(); }

Is so 2021 =)

nickl- commented 1 year ago

It is not clear if C# lets you call the method on A or if it only exposes field bar on A. Will need more testing.

It is actually calling method on A via the cast:

> public class A
. {
.     public int bar = 42;
.     public int foo()
.     {
.         return bar * -1;
.     }
. }
> public class B : A
. {
.     public int bar = 4711;
.     public int foo()
.     {
.         return bar;
.     }
. }
> public class C : B
. {
.     public int bar = 5;
. }
> (new C()).foo()
4711
> ((A)new C()).foo()
-42
> 
opeongo commented 1 year ago

Ah lets see, how do I love thee?...let me count the ways:

Ok, getting a bit silly here, but I guess I will respond.

1. This is for closures not for classes

Agreed

2. JAVA knows nothing about closures either so we have a little wiggle room

Java knows something about closures. I have been referring to anonymous classes as closures (specifically lexically scoped closures), as do others. The interesting comment that I read about comparing anonymous classes and closures was : "Identical, but mostly different". So it depends on what you want to use them for I suppose.

3. There are existing unit tests passing
5. Has tenure already with more than 10 years on the books
6. It is already implemented for Preparsed Scripts
10. Not a new feature already working, given there are unforeseen side effects to iron out
11. Working feature that needs to be removed, do we remove it for preparsed scripts too?
12. Will break passing unit tests if removed - munging passing tests is diabolical

Let me provide a choice quote about that "I really don't care about how it was working in a beta, only how it should be working and how JAVA works." A lot of stuff was added to BeanShell in the post-PatN days that wasn't worth keeping. The fact that this "feature" has been around for a while means nothing, it should be decided on the merits.

4. Community requested feature

This is a dumb ask. Tell them to go use a language that fully supports closures, there are plenty out there. Or if they are going to use BeanShell then play to it's strengths.

7. Considered a feature of dynamic languages

So? What next, goto statements because procedural languages have them? Use whitespace as syntax like python?

9. Will not interfere with existing code, if anyone does write hidden variables not accessible by methods... I rest

677 interferes with my existing code so I notice it. And I am not using hidden variables, so you have that part wrong. It is that anonymous classes (and scripted objects) are lexically scoped. When you change the namespace chaining the way you do the lexically scoped variable lookups do not work any more. So that breaks compatibility with Java code.

8. It is intuitive and what you expect it to produce
13. It has an intentional purpose clearly understood in code by humans
14. It is the opposite philosophy to "there is no way" yes there is a way
15. It lets you accomplish what you need without getting in your way

I don't know what you are talking about here. I think I need to see a use case. Why is this so essential for a language that promotes itself as a scripting language for java? But it still doesn't matter, this feature breaks compatibility with java so it should not be included. Please revert.

16. Has no drawbacks, will work the other way too, just don't redeclare variables

The big drawback that I see is that it breaks compatibility with Java anonymous classes, which I happen to use a lot of so it breaks a lot of my legacy code. BeanShell has always been about compatibility with Java as best as it can do. If you make this change I think you should also change the name of the project to NickShell or something that does not confuse it with what BeanShell was about.

Trying to impose upon people to not use code that has hidden variables is a pretty big ask, that's just silly to suggest. What are you going to do, put a warning label on it? How are you going to explain this compatibility breaking feature? But hidden variables are not even the problem, you have it wrong. We have been arguing this for weeks and you still don't get it. The issue is that anonymous classes are lexically scoped, and changing the namespace chaining the way you are to make then locally scoped breaks this.

17. Useful especially because JAVA can't do it
18. Would be considered missing if not a BeanShell feature
19. Because it is freaken awesome
20. Because we can

I know that you are joking around here, but these are not persuasive points. One of the responsibilities of a language designer, at least for a real language and not a toy, is to soberly reflect on the impact and cost of features, and to know when to say no. Just because something can be done does not make it a good reason to do it. We have been arguing about this problem for over four years (#508). It is a bad design for a language that is supposed to be java compatible.

21. Why wouldn't we?

Because it breaks compatibility with Java, and that should be a good enough reason. @nickl- If you are going to respond to anything in this comment, please respond to this point. Are you saying that you are aware that this will break java compatibility, but you are willing to do this because "locally scoped closure freaken awesome"?

nickl- commented 1 year ago

Did not expect a reply to that, but most importantly please keep it civil there is no need to get personal or insulting.

Because it breaks compatibility with Java, and that should be a good enough reason. @nickl- If you are going to respond to anything in this comment, please respond to this point. Are you saying that you are aware that this will break java compatibility, but you are willing to do this because "locally scoped closure freaken awesome"?

Classes are now specifically excluded with namespace.getParent().isClass this should be true for anonymous classes too.

677 interferes with my existing code so I notice it.

This is not helpful, need examples and specifics. All the unit tests you have provided now passes. There might still be more kinks to sort out I won't make any claims that there aren't still bugs. However as for java compatibility breaking I would like to see proof of that please. If we have reproducible faults they can be addressed.

Nothing has changed the goals remain the same, I was wrong to compare this to class inheritance because it is not. Not a loss as we can now simply exclude classes. The hidden fields are not even working for classes anyway #358 and has nothing to do with this patch. Apologies if my mistake caused confusion, but after further investigation into the history and implications, which is why this thread was opened. (will open it again) I came to the conclusion that closures are not class methods and not bound by the same restrictions. I am not convinced that you have considered everything raised here because I have made this distinction already.

But this is not class inheritance so the comparison is not equal and there is really no comparison in java:

Which was also the first point in my list of reasons:

  1. This is for closures not for classes

Which is probably the only point you agreed with.

What I don't understand is why we are still comparing this to class inheritance. Unless I misunderstand your argument...

opeongo commented 1 year ago

I just figured out that my review comments that I wrote yesterday before replying to this comment were not visible since I had not clicked on the 'submit review' button. Those comments will provide some extra context and may help the rest to make sense.

I did provide an example, see here. I also provided an example in the review comments And just now I am providing new unit tests in 45ab8604.

My argument is

anonymous classes [in java] are lexically scoped, and changing the namespace chaining the way you are to make then locally scoped breaks this.

I don't have anything else to add on this. Me repeating this over and over again is not working.

nickl- commented 1 year ago

lexically scoped, vs locally scoped doesn't make any sense to me, the two are not mutually exclusive.

Nor does the implementation break anything, you can still access the original scope lexically in fact we specifically test that the declaring namespace is visible. Unless concrete and working examples are provided that proves this breaks java compatibility as you claim, this issue can be closed as: works as implemented.

nickl- commented 1 year ago

679 proves that this does not break JAVA compatibility, because classes are ignored.

It also doesn't apply to the scripted object equivalent:

doer = null;
 {
     int x = 4;
     internal() {
         doit() {
             System.out.println("doing x="+x);  // Line 6
         }
         return this;
     }
     doer = internal();
 }

doer.doit();  // Line 14
x = 5;
doer.doit();

Since the variable change x = 5 has no lexical scope to the original declaration int x = 4; nor does the method doit() have lexical scope to the new variable x. Late binding is only applicable if the current namespace is a child of declaring namespace, ensuring that both the variable, and method is visible or has lexical scope.

Closed: works as expected.