Ideas for cooperation between JavaParser and Spoon contributors

pvojtechovsky commented 6 years ago

I like utopic ideas ... Would not it be nice to not waste our free time for implementing things twice ? ... and to share the effort on similar things somehow?

pvojtechovsky commented 6 years ago

I had a look at JavaParser source code. I often had to check, which project I am looking at, is it really JP? Or am I by mistake in Spoon?

... Déjà vu ...

JavaParser has very similar structures, problems, algorithms ... it is nearly same ... well nearly ... like "cat" and "dog" :-))) ... it would be really a big challenge to try a kittenpuppy.

... these metaprogramming technologies we are playing with are somehow similar to genetics, they are also dreaming about kittenpuppy ... :-)

pvojtechovsky commented 6 years ago

Federico wrote: Who knows, perhaps one day we could find some way of collaborating by sharing ideas, use cases or even code.

@ftomassetti, this issue might be an input point for deeper cooperation ... if we found something interesting for both sides.

Actually I have this first idea. From Spoon point of view the JavaParser is parser and resolver same like Eclipse JDK compiler, whose parser and resolver we are using. Spoon doesn't need compiler. JDK compiler is powerful and gives us what we need for free. On other side it is quite ugly, more heavy then we need, buggy at places, which are relevant for us, but not relevant for compiling (e.g. source positions) and it is not documented :-( May be Spoon might use JavaParser's syntactic AST to generate Spoon's semantic AST. Spoon model building might be faster then and clients of JavaParser would get access to advanced spoon features, which are based on the semantic AST. I know also several reasons why not to do it, ... so we will see whether this idea grow or die. I personally do not care much, because performance is not critical in my use case. But if the result would be joining of the Spoon and JavaParser community and future sharing of effort then I would gladly support that process.

msteinbeck commented 6 years ago

@pvojtechovsky Do you think incremental AST generation also benefits from this approach? I heavily rely on this feature as I'm using this to analyze the evolution of large size software systems. Currently, incremental AST generation utilizes a compiler's classpath to recompile only changed files. If we integrate JavaParser, we may no longer harness this feature.

pvojtechovsky commented 6 years ago

Yes, there are several reasons why JDK compiler will be always better then JavaParser, but there are also reasons why JavaParser will be always better then JDK compiler ;-)

incremental AST generation

why do you need it? Because building of model is slow with JDK compiler. But if it would be fast with JavaParser, then do you still need incremental AST generation?

msteinbeck commented 6 years ago

why do you need it? Because building of model is slow with JDK compiler. But if it would be fast with JavaParser, then do you still need incremental AST generation?

We implemented our own version of an incremental launcher which uses the information of the underlying version control system to update only changed files. Updating the model of, for instance, elasticsearch takes 500 ms on average. Creating an AST with JP takes several seconds for each revision. Usually, we analyze about 4000--5000 revisions of system. Every second counts :). By the way, we will publish this extension in a few days.

pvojtechovsky commented 6 years ago

Very interesting! Thanks! Good luck ;-)

stefanleh commented 6 years ago

A really nice idea. Ive used JavaParser in the beginning of my current project but switched then to Spoon as i needed more reliable resolution of method calls. This part is not so strong in JavaParser. But to me the API for searching an manipulating the tree feels more comfortable. Maybe there can be a best-of-both solution?

pvojtechovsky commented 6 years ago

But to me the API for searching an manipulating the tree feels more comfortable.

@stefanleh Hi Stefan! If I understand it well, the JavaParser has some nicer API in that area comparing to Spoon? Could you create a Spoon issue with a title "suggestion for better API" and put there some code snippets which shows how we might improve APIs in spoon?

ftomassetti commented 6 years ago

I plan to look more into Spoon but for now my understanding is this: Spoon is a tool that wraps the Eclipse compiler and add nice API on top of that for code analysis and code refactoring.

JP instead has its own symbol resolution (i.e., a big chunk of a compiler) but lacks the nice, high-level API on top of it.

JP can be used with or without symbol resolution. Spoon is always using symbol resolution, as far as I understand.

At this stage I am not able to compare the parsing components of Spoon and JP.

From my point of view it would be great if we could share some components. I see three of them:

parsing
symbol resolution
high level API for code analysis and manipulation

By sharing some of these components we could have more heads and hands on the same code base and we could save some energies to spend in the areas that would remain specific to our projects.

I have no idea if this is feasible :) I am just writing down my thoughts. Of course it would be good to hear what our mighty leader @matozoid thinks about this and what ideas he has

ftomassetti commented 6 years ago

A really nice idea. Ive used JavaParser in the beginning of my current project but switched then to Spoon as i needed more reliable resolution of method calls. This part is not so strong in JavaParser.

That part is a source of headaches because it is very, very complex to write. It is way more complex that I thought it would. To give you an idea you could take a look at section 6.5 of the JLS. That part just explain how to figure out what a name represent: is it a package, a type, a variable, a field, etc. It is not about resolving it, just figuring out what sort of name it could possibly be. To implement symbol resolution we need type inference (e.g., for calculating types of parameter values or lambda parameters). You can take a look at Ch. 18 pf the JLS. To get that right is an incredible challenge. I think there are advantages to have each own symbol resolution system (e.g., we can make it incremental, it can work on partial files or incorrect ASTs, etc.) but it is a major undertaking. If we could share the effort we could build something more robust and better maintained.

But to me the API for searching an manipulating the tree feels more comfortable. Maybe there can be a best-of-both solution?

We spend a lot of energy in maintaining the parser itself and updating it to new releases of Java (very frequent, nowadays) so maybe we could offer that. We have also lexical preservation and comments processing built on the AST, which were the two major requests we received an the AST. We also support heterogeneous and homogeneus ways of processing the AST. For example, we have methods to traverse all the nodes, get all the children, etc. but we have also visitors. We have also observers on nodes and subtrees. Anything that was ever asked to process the AST is supported, mature and battle tested. So perhaps that could be something we could share. Also, the pure parser part has its own module in the project.

pvojtechovsky commented 6 years ago

Spoon is always using symbol resolution, as far as I understand.

yes, Eclipse compiler does it.

From my point of view it would be great if we could share some components. I see three of them: 1) parsing 2) symbol resolution

these two components are purely in Eclipse compiler. Spoon does nothing in that area ... if I don't count "parsing" (it is usually no parser but stupid cycle which searches for something) of little fragments at places where we know Eclipse compiler delivers wrong source positions, so we need to found correct one.

3) high level API for code analysis and manipulation

yes, yes, yes. I would love it!!!

Greetings to your mighty leader @matozoid ;-)

That part is a source of headaches because it is very, very complex to write. ... If we could share the effort we could build something more robust and better maintained.

Spoon cannot help here, because all that is done by Eclipse compiler. We just use the results in form of Eclipse compiler AST, where everything is resolved.

Spoon is a tool that wraps the Eclipse compiler and add nice API on top of that for code analysis and code refactoring.

The architecture of Spoon is little bit different. Spoon doesn't wraps Eclipse compiler. Spoon uses Eclipse compiler to let it make Eclipse compiler's AST. Then Spoon transforms this AST to Spoon AST. Then Spoon forgets all Eclipse stuff and works with pure Spoon AST. There is no little small link back to any Eclipse object.

It means that Spoon is quite independent on Eclipse. If somebody ever brings something better or faster then Eclipse compiler what makes usable AST, then Spoon can relatively easily (weeks of work, because java AST is big structure) change few classes and use AST of that new tool to build Spoon AST. After such change there is no need to change anything else and API of Spoon will stay unchanged too.

That brings me to the idea how to share some effort. 1) Spoon community might publish it's Java semantic AST model as small independent library. With 0 dependencies. 2) Spoon community might publish the algorithms based on that model. Again with nearly 0 dependencies

We also support heterogeneous and homogeneus ways of processing the AST. For example, we have methods to traverse all the nodes, get all the children, etc. but we have also visitors. We have also observers on nodes and subtrees. Anything that was ever asked to process the AST is supported, mature and battle tested. So perhaps that could be something we could share.

It would be nice to apply your experience here and to improve such published Java semantic AST model ... Existing Spoon AST model seems to be good for me, but I am sure it can be improved of course :-)

Later ... Java semantic AST model might evolve into some "industry standard" and even more communities might share their effort on java meta-programming algorithms. There is really a lot of things (Data flow models, Control flow models, ... ), which can be done and we have no power for that :-(

Then everybody (e.g. JavaParser community) can try to transform their syntactic AST to Spoon semantic AST. And then everybody can analyze/refactor/pretty print/... that AST using existing algorithms.

WDYT?

ftomassetti commented 6 years ago

It would be interesting sharing some of the higher level API, however to completely change our "semantic AST" to be exactly as the one from Spoon would probably be as costly as rewriting the higher level API ourselves :)

Maybe it would be possible to start looking at some interfaces, like for classes and methods and see how things go.

But I like very much the idea of exposing some sort of semantic APIs and let others build stuff on top of it (refactoring utils, data flow and control flow analysis, code generation/transformation, etc)

pvojtechovsky commented 6 years ago

however to completely change our "semantic AST" to be exactly as the one from Spoon

I understand. It is probably too late to cooperate.

But I like very much the idea of exposing some sort of semantic APIs

May be once ... Spoon publishes a light model with all algorithms #2747 and then there is higher chance for future reuse.

Maybe it would be possible to start looking at some interfaces,

Spoon model has interfaces ... but may be more heavy then one might expect for pure model.

ftomassetti commented 6 years ago

however to completely change our "semantic AST" to be exactly as the one from Spoon

I understand. It is probably too late to cooperate.

Well, if your position is "you should use our stuff as it is" it would have never worked :)

But I like very much the idea of exposing some sort of semantic APIs

May be once ... Spoon publishes a light model with all algorithms #2747 and then there is higher chance for future reuse.

Interesting, but again, it would not much be a collaboration as one more user for you

Maybe it would be possible to start looking at some interfaces,

Spoon model has interfaces ... but may be more heavy then one might expect for pure model.

pvojtechovsky commented 6 years ago

Well, if your position is "you should use our stuff as it is" it would have never worked :)

both sides has own clients and both sides have to be backward compatible ... at least little bit. So rename of core model types and methods is not acceptable from this point of view.

But if there are nearly 1:1 mapping between our AST models, then it is possible to write a Spoon algorithm which transforms algorithms based on types and names of one side to types and names of other side. That would be way how to share algorithms...

But mapping is probably not 1:1 so such transformation would become very complex task which is hard using current technology. I think that after some years that will be possible, but not now.

however to completely change our "semantic AST" to be exactly as the one from Spoon would probably be as costly as rewriting the higher level API ourselves :)

So if it is nearly same effort then it makes sense for you to do it ;-), because then there will be only one API which can be used by bigger community to write cool algorithms.

pvojtechovsky commented 6 years ago

Well, if your position is ...

my position is that I am open to do a lot to share the effort on Java metaprogramming algorithms. But it is really difficult if both sides (Spoon x JP) already has own incompatible standards.

Looking for some interfaces is a theoretic way ... but nobody wants to have extra interfaces and duplicate methods in models, which are complicated enough - by nature of problem.

I actually see only these possible steps toward each other.

Spoon publishes it's AST as small 0 dependency JAR
we can simplify that AST as much as possible, so it is clear, nice, intuitive and usable for Java metaprogramming generally.
may be in the meantime we apply also some renaming to types and methods and thanks to Semantic patching we can "keep compatibility" ... I mean to support an automated way how to transform code of Spoon clients to be again compatible with new Spoon model.
Then I think it will make sense for others (may be even for JavaParser) to use such Java AST model.

But that is long way ...

Does anybody has another idea ... even wild ideas are welcome ... they might lead to a doable solutions in this "brainstorming".

monperrus commented 5 years ago

Closing this issue, we can still continue to use it for the conversation between the awesome teams of JavaParser and Spoon.

INRIA / spoon

Ideas for cooperation between JavaParser and Spoon contributors #2740