beanshell / beanshell

Beanshell scripting language
Apache License 2.0

bsh profiler and restricted bsh to java translator. #719

Open cri-s opened 1 year ago

cri-s commented 1 year ago

I want to add the possibility to time/profile bsh, to convert bsh to strict-bsh, and to convert strict-bsh to Java. From the previous thread, bsh-java #501:

Hi, I want to implement the following, on Java 8 for now. Convert BeanShell to strict BeanShell after 100% of the code has run, adding type information based on the already interpreted code. Convert strict BeanShell to Java code. It's clear that not all code can be converted, for example code that uses a lot of eval, but other kinds of code should work. Do you want this functionality included, and if yes, which codebase should I use, bsh-2 or bsh-3? Currently the production machines run bsh-2 and use a lot of bsh for personalisation.

Internally, the code should make a hash of the source code, build a map of the possible code blocks that could be skipped, and write a file with that information. On every run it adds the missing code blocks, or code blocks seen with different types, together with additional timing info. Call this file, for example, .bsp. If 100% of the code blocks are covered, or a new type is added after 100% coverage, an additional file, .bsc, is written. From this, and optionally the source, strict mode could be derived, or Java code could be generated directly for compilation. The timing info is added so that this functionality is useful for performance profiling too. Tell me yes/no, or anything else.
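A minimal sketch of the bookkeeping described above, assuming a script has already been split into identifiable code blocks. The class `ScriptProfile`, its methods, and the block ids are hypothetical names chosen for illustration; nothing here is existing BeanShell API, and writing the .bsp/.bsc files is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical per-script profile: which blocks ran, with which types, and for how long. */
final class ScriptProfile {
    /** Observations collected for one code block. */
    static final class BlockStats {
        final Set<String> observedTypes = new TreeSet<>();
        long runs;
        long totalNanos;
    }

    /** Hash of the script source, so a stale profile can be detected. */
    final String sourceHash;
    private final Map<String, BlockStats> blocks = new LinkedHashMap<>();

    ScriptProfile(String source, Iterable<String> blockIds) {
        this.sourceHash = sha256(source);
        for (String id : blockIds) {
            blocks.put(id, new BlockStats());
        }
    }

    /** Records one execution of a block: the type signature seen and the time taken. */
    void record(String blockId, String typeSignature, long nanos) {
        BlockStats stats = blocks.get(blockId);
        if (stats == null) {
            throw new IllegalArgumentException("unknown block: " + blockId);
        }
        stats.observedTypes.add(typeSignature);
        stats.runs++;
        stats.totalNanos += nanos;
    }

    /** True once every known block has run at least once (the proposed .bsc condition). */
    boolean isComplete() {
        for (BlockStats stats : blocks.values()) {
            if (stats.runs == 0) {
                return false;
            }
        }
        return true;
    }

    BlockStats stats(String blockId) {
        return blocks.get(blockId);
    }

    static String sha256(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every JVM
        }
    }
}
```

A block that accumulates more than one entry in `observedTypes` is one whose variables changed type between runs, which is exactly the case a strict or Java translation would have to reject or handle specially.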

nickl- commented 1 year ago

Hey @cri-s thank you for moving the conversation to its own thread.

I want to add the possibility to time/profile bsh, to convert bsh to strict-bsh, and to convert strict-bsh to Java.

We have the strict-java feature, which I assume is the same thing you are talking about.

https://github.com/beanshell/beanshell/blob/ccfb83ba9e4f7e8013285b1d57edd3a2c1c344e7/src/main/java/bsh/Interpreter.java#L1122-L1143

Basically the idea is that if we switch this flag, all the BeanShell conveniences get disabled and only pure JAVA is allowed. It will still be interpreted while running under BeanShell, so the same mechanics described under #501 still apply, and so does our motto, "Works like JAVA does, doesn't break like JAVA does", but we can certainly make things a lot more difficult =)

We have continuously attempted to incorporate strict java checks throughout the code base, but this has not been tested at all. There are a few existing issues related to strict java which were marked "out of scope for bsh 3.0" because we are trying to focus on the existing roadmap to get to a stable release. Even though this is not currently our core focus, you are more than welcome to bash out some unit tests so that we can get a better understanding of where things really stand.

...what codebase should i use, bsh-2 or bsh-3. Actually the productive running machines use bsh-2 and uses a lot of bsh for personalisation.

BeanShell 2 has only ever been a beta release, and a lot of work over the years has gone into getting things ready for a stable release, which will be BeanShell 3.0. The two code bases are now so far removed from each other, after the Apache incubator attempt and the merging of the two forks, that there is very little point in revisiting the old code. As is the nature of software, the more you tighten up the code, the more catastrophic failures occur, exposing further dormant bugs never seen before. I won't sugar coat it or lie to you: things can still go very wrong before they go right. But we are committed to making it work, and if we can keep the momentum going with more able hands like yours getting involved, there is nothing stopping us from bringing a stellar product to all the BeanShell fans. tl;dr: yes, forget about 2.b and start porting your scripts to BeanShell 3.0.

Internally, the code should make a hash of the source code, build a map of the possible code blocks that could be skipped, and write a file with that information. On every run it adds the missing code blocks, or code blocks seen with different types, together with additional timing info. Call this file, for example, .bsp. If 100% of the code blocks are covered, or a new type is added after 100% coverage, an additional file, .bsc, is written. From this, and optionally the source, strict mode could be derived, or Java code could be generated directly for compilation. The timing info is added so that this functionality is useful for performance profiling too. Tell me yes/no, or anything else.

I think you will find that a lot of the building blocks are already there for what you have in mind. The goal of BeanShell, after all, is to be able to interpret any JAVA source, and if compiled it works too. What I would suggest is to start digging through the code, and please put your discoveries into unit tests to gauge your progress. As I said before, strict java has not been tested at all, so there is an open field for you to explore. Once you have a better idea of what we have and what is missing, let's sit down and make a plan.

Feel free to shout if you have any questions...

cri-s commented 1 year ago

I have checked the source code a bit and have 3 questions.

1) Can Javacc be used as is, or must the generated code be tweaked? As I don't know Javacc, I am asking whether the template was tweaked or whether the resulting code must be tweaked, as written in some comments (from memory).

2) Would it be possible to add two keywords for strict usage that would simplify some things, basically val and var from the Lombok project? val is a final variable that must be assigned immediately. var is a variable declared without a type, but once assigned its type stays fixed and cannot be changed. I think var was introduced in Java 10, with var for lambda parameters following in Java 11. In principle that would simplify what I intend and maybe eliminate the profiling stage. After this change, with strict mode using var/val, the difference between bsh and strict bsh is that variables in strict mode cannot change type. Maybe call this relaxed mode, as Lombok must be included to compile it, at least before Java 10. Variable boxing/unboxing still needs to be addressed. val could be translated directly.

3) Would it be possible to add a cast to functions, such as (private)foo(bar), protected, public..., to select an attribute that the function must have, otherwise an error is raised? As a special case, if a cast to (private) is used, accessibility doesn't need to be set to true, as the user explicitly requests access to that variable/function.


nickl- commented 1 year ago

I have checked the source code a bit and have 3 questions.

Did you manage to enable strictJava? This in itself will be evidence to shed light on your questions. The proof is in the pudding....

1) Can Javacc be used as is, or must the generated code be tweaked? As I don't know Javacc, I am asking whether the template was tweaked...

What are you asking exactly? Are you interested in the generated code or the grammar (perhaps what you refer to as the template; the grammar is the language specification)? Because I don't understand what you are asking, it will be difficult to give you the answer you want. So instead I will just give you more information about javacc and hope that helps. This will not be in depth, but rather some broad strokes to help colour the picture.

Javacc can be broken into 3 parts:

  1. the grammar or language definition

  2. the generated parser

  3. the parsed node tree (or AST)

Taking each in turn:

  1. The BeanShell grammar originally came from one of the java grammars from the javacc project, but as you can see their "templates" can at best define JAVA 1.5 features.

Has the grammar been changed? Well, BeanShell 3.0 can parse most JAVA 8 syntax, and you can write code in BeanShell (to mention just a few: expressions like 1 + 1;, loosely typed variables outside a class, and functions/closures) that won't even parse in JAVA, let alone compile. Yes, the grammar has changed, but the same grammar can still parse standard JAVA as well.

  2. At its most basic, javacc is a parser generator: based on a grammar file, it generates source code for a parser capable of parsing syntax that conforms to the grammar; otherwise a parser exception occurs. Yes, there are issues with the generated parser, and it would be novel if upstream had any intention of fixing things, but no, we don't make changes to the generated source code; that is the whole reason for using a parser generator. Basically everything under target/generated-sources/ remains untouched.

  3. When the parser generated by javacc parses BeanShell scripts it produces an AST, a hierarchy of classes representing the constructs defined in the grammar. These are the classes whose names are prefixed with BSH in all caps; they implement Node and extend SimpleNode. Technically these classes were originally generated by javacc, but what it generates is nothing more than a skeleton. These nodes are under source control because the code is custom developed and does not get generated by javacc.

BeanShell added an eval method to the Node contract, which is the entry point for interpretation; this no longer has anything to do with javacc. From the parser's perspective the AST represents nothing more than the text extracted from the parsed script. While we may have instances of an if statement, while loop, or method declaration, they are so in name only and serve to communicate the structure of what was defined before and after in a hierarchy. The magic which implements these constructs as language executions may start in the eval method on a specific node, but it requires every other line of code and source file in the repository to get accomplished.

So implementing strict java per se doesn't really have anything to do with the parsing, as this is just text, and we still need to be able to parse the non-strict sources, so making changes here based on a strict flag is pointless. What needs to happen instead is to consciously refrain from implementing the non-strict language, and these conditions may apply to the node evaluations but could also be required in other places in the source. This becomes self-evident when you search for strictJava to find the current implementations.
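To illustrate the idea of an AST whose nodes carry their own eval entry point, here is a toy sketch. `MiniNode`, `Literal`, `Variable`, `Add`, and `Assign` are hypothetical stand-ins invented for this example; they are not the real BSH node classes, and the real eval contract takes different arguments.

```java
import java.util.Map;

/** Hypothetical stand-in for a node contract: structure plus an eval entry point. */
interface MiniNode {
    Object eval(Map<String, Object> scope);
}

/** A constant value taken straight from the parsed text. */
final class Literal implements MiniNode {
    private final Object value;

    Literal(Object value) {
        this.value = value;
    }

    public Object eval(Map<String, Object> scope) {
        return value;
    }
}

/** A variable reference: only a name until eval looks it up. */
final class Variable implements MiniNode {
    private final String name;

    Variable(String name) {
        this.name = name;
    }

    public Object eval(Map<String, Object> scope) {
        if (!scope.containsKey(name)) {
            throw new IllegalStateException("undefined variable: " + name);
        }
        return scope.get(name);
    }
}

/** Integer addition; the "+" only means something because eval implements it. */
final class Add implements MiniNode {
    private final MiniNode left;
    private final MiniNode right;

    Add(MiniNode left, MiniNode right) {
        this.left = left;
        this.right = right;
    }

    public Object eval(Map<String, Object> scope) {
        int l = (Integer) left.eval(scope);
        int r = (Integer) right.eval(scope);
        return l + r;
    }
}

/** Loosely typed assignment: stores whatever value the right-hand side produces. */
final class Assign implements MiniNode {
    private final String name;
    private final MiniNode value;

    Assign(String name, MiniNode value) {
        this.name = name;
        this.value = value;
    }

    public Object eval(Map<String, Object> scope) {
        Object result = value.eval(scope);
        scope.put(name, result);
        return result;
    }
}
```

Building `new Assign("x", new Literal(1))` creates structure only; nothing is stored in the scope until `eval` is called on it, which mirrors the point that the parser hands over text-shaped structure and the interpreter supplies all behaviour.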

2) Would it be possible to add two keywords for strict usage that would simplify some things, basically val and var from the Lombok project?

We want strict BeanShell, not strict JAVA, am I right? It is important to keep a clear distinction in mind: while BeanShell is written in JAVA, the bean scripts are not. If you declare a variable in a script it is just text until we interpret what that text means, manually create some instance, and add it to our stack. Internally all variables are Object; there is no other way, this is what JAVA dictates. That is why we can be loose about types: unless we implement a type constraint, there is none.

Let's take another example: when you make a call to one of several overloaded methods in a script, it is not java that chooses which method to call; java doesn't know anything about the method call at all. Internally we have to find the list of overloaded methods, probably through reflection, then we need to find the most appropriate method by comparing the passed-in value types to the declared parameter types, and hopefully come to the same conclusion JAVA would have. Nothing happens automatically; we have to physically implement every language feature or it simply doesn't work.
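A rough sketch of what choosing an overload by hand can look like, using only standard reflection. `OverloadResolver` is a hypothetical helper written for this example and is not how BeanShell resolves methods; it deliberately ignores primitive widening (an int argument for a long parameter), varargs, and ambiguity errors, all of which a real implementation has to deal with.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: picks an overload by comparing argument and parameter types. */
final class OverloadResolver {
    private OverloadResolver() {
    }

    /** Returns the applicable public method with the most specific parameters, or null. */
    static Method resolve(Class<?> owner, String name, Object... args) {
        Method best = null;
        for (Method candidate : owner.getMethods()) {
            if (!candidate.getName().equals(name)) {
                continue;
            }
            Class<?>[] params = candidate.getParameterTypes();
            if (!applicable(params, args)) {
                continue;
            }
            if (best == null || moreSpecific(params, best.getParameterTypes())) {
                best = candidate;
            }
        }
        return best;
    }

    /** A method applies when every argument is an instance of the (boxed) parameter type. */
    private static boolean applicable(Class<?>[] params, Object[] args) {
        if (params.length != args.length) {
            return false;
        }
        for (int i = 0; i < params.length; i++) {
            if (args[i] == null) {
                if (params[i].isPrimitive()) {
                    return false; // null cannot be passed for a primitive parameter
                }
            } else if (!box(params[i]).isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    /** True when every parameter of a could be passed where b's parameter is expected. */
    private static boolean moreSpecific(Class<?>[] a, Class<?>[] b) {
        for (int i = 0; i < a.length; i++) {
            if (!box(b[i]).isAssignableFrom(box(a[i]))) {
                return false;
            }
        }
        return true;
    }

    /** Maps a primitive type to its wrapper so it can be compared against boxed values. */
    private static Class<?> box(Class<?> type) {
        if (!type.isPrimitive()) return type;
        if (type == int.class) return Integer.class;
        if (type == long.class) return Long.class;
        if (type == double.class) return Double.class;
        if (type == boolean.class) return Boolean.class;
        if (type == char.class) return Character.class;
        if (type == float.class) return Float.class;
        if (type == short.class) return Short.class;
        if (type == byte.class) return Byte.class;
        return Void.class;
    }
}
```

For example, `OverloadResolver.resolve(String.class, "valueOf", 42)` selects `valueOf(int)` rather than `valueOf(Object)`, because both apply to a boxed Integer but the int overload is the more specific of the two.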

So Lombok, or some other annotation-generated scaffold, isn't going to help us, because it only applies to JAVA and that is not what we need. Adding the Lombok annotations to scripts will have absolutely no effect; it will just be uninterpreted text and may not even parse with the current grammar. It would require a whole lot of elbow grease at runtime to try and dynamically apply their annotations without completely reinventing the wheel. I'm not even sure JAVA will accommodate us, which is probably why annotations are not implemented in BeanShell yet.

No sir, there are no shortcuts: just like we have to find the correct overloaded method ourselves, any other JAVA language feature you want we also have to implement ourselves. As for this specific use case, it is probably much easier than you thought, and a simple if comparison during variable assignment will do the trick. Start playing with strictJava; you might just find that this already works as expected.
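As a sketch of that "if comparison during variable assignment", the following shows a variable store in which a name keeps the type of its first non-null value. `StrictScope` is a hypothetical class for illustration only; it does not reflect how BeanShell stores variables or what strictJava currently enforces.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical variable store: a name keeps the type of its first non-null value. */
final class StrictScope {
    private final Map<String, Object> values = new HashMap<>();
    private final Map<String, Class<?>> types = new HashMap<>();

    /** Assigns a value, rejecting it if the variable's type was already fixed differently. */
    void assign(String name, Object value) {
        Class<?> fixed = types.get(name);
        if (fixed == null) {
            if (value != null) {
                types.put(name, value.getClass()); // the first real value fixes the type
            }
        } else if (value != null && !fixed.isInstance(value)) {
            throw new ClassCastException(name + " is " + fixed.getName()
                    + ", cannot assign " + value.getClass().getName());
        }
        values.put(name, value);
    }

    Object get(String name) {
        return values.get(name);
    }
}
```

With this store, assigning 1 and then 2 to the same name succeeds, while a later assignment of a String to that name throws ClassCastException and leaves the previous value in place. Boxing is sidestepped here because every value is already an Object, which is the same reason the interpreter can be loose about types in the first place.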

3) add a cast to functions

I have to apologize, because with number 3 you completely lost me. Getting any joy from setting accessibility last worked in JAVA 9, when all it did was complain but still allow it. Personally I am more inclined to implement everything as public internally; it makes access through reflection a breeze. The saved hassle far outweighs having to do a simple modifier check to raise an artificial no-access error with little effort.

Probably completely off topic with this last reply, so let's leave it there. I hope I managed to give you a better understanding of the problem area. Feel free to put the conversation back on track by clarifying what you aim to accomplish, please.