Modularization/Splitting of Spoon

INRIA / spoon

Spoon is a metaprogramming library to analyze and transform Java source code. :spoon: is made with :heart:, :beers: and :sparkles:. It parses source files to build a well-designed AST with powerful analysis and transformation API.

http://spoon.gforge.inria.fr/

Modularization/Splitting of Spoon #2747

Closed pvojtechovsky closed 5 years ago

pvojtechovsky commented 5 years ago

There are several reasons why it might make sense to split Spoon into several smaller pieces:

R1) There are clients who are not using Maven, the decompiler, the Eclipse JDT compiler, ... and who complain about too many dependencies.

R2) There might be communities who are thinking about developing a Java analyzer, and they might appreciate a light (0 dependency) model of a Java Semantic AST.

R3) They might then also be interested:

in printing such an AST
in traversing, matching, refactoring, querying, ...

All of these could be quite light, highly usable libraries with few dependencies.

For example, a transpiler from C to Java does not need the Eclipse compiler, but it needs the Semantic AST + PrettyPrinter.

Maybe the Spoon community would grow faster with such smaller products/modules, because there would be more use cases.
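To make the "light model plus printer" idea concrete, here is a minimal sketch in plain Java with no external dependencies. All names in it (`TinyAst`, `Node`, `Visitor`, `Printer`, and so on) are hypothetical and are not Spoon's API; it only illustrates that a model, a visitor and a pretty printer can live together without any compiler library on the classpath.

```java
/** Hypothetical, dependency-free AST sketch; none of these types are part of Spoon's API. */
final class TinyAst {
    private TinyAst() {}

    /** Visitor over the node kinds; printing is one client, analysis could be another. */
    interface Visitor<R> {
        R visitLiteral(Literal literal);
        R visitBinary(Binary binary);
        R visitMethod(Method method);
    }

    /** A node only knows how to accept a visitor, not how it was built. */
    interface Node {
        <R> R accept(Visitor<R> visitor);
    }

    static final class Literal implements Node {
        final int value;
        Literal(int value) { this.value = value; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitLiteral(this); }
    }

    static final class Binary implements Node {
        final Node left;
        final String operator;
        final Node right;
        Binary(Node left, String operator, Node right) {
            this.left = left;
            this.operator = operator;
            this.right = right;
        }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitBinary(this); }
    }

    /** A parameterless int method whose body is a single return statement. */
    static final class Method implements Node {
        final String name;
        final Node returned;
        Method(String name, Node returned) {
            this.name = name;
            this.returned = returned;
        }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitMethod(this); }
    }

    /** Pretty printer that depends only on the model above. */
    static final class Printer implements Visitor<String> {
        @Override public String visitLiteral(Literal literal) {
            return Integer.toString(literal.value);
        }
        @Override public String visitBinary(Binary binary) {
            return "(" + binary.left.accept(this) + " " + binary.operator + " "
                    + binary.right.accept(this) + ")";
        }
        @Override public String visitMethod(Method method) {
            return "int " + method.name + "() { return " + method.returned.accept(this) + "; }";
        }
    }

    /** Convenience entry point: print any node with the default printer. */
    static String print(Node node) {
        return node.accept(new Printer());
    }
}
```

A transpiler front end would only need to construct such nodes and call `TinyAst.print(...)`; how the nodes were produced (a C parser, reflection, a Java compiler) is invisible to the model.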

WDYT?

For example, I am not using the Eclipse modeling tools because they had (I do not know the current state) very heavy dependencies. I would need a light modeler. ... Even for maintenance of the Spoon meta ... meta model, a third-party light modeling tool would be nice to have ...

nharrand commented 5 years ago

Do you mean separated repositories, or just publishing multiple smaller artifacts?

pvojtechovsky commented 5 years ago

It is a technical detail from my point of view.

Actually, I lean toward publishing multiple smaller artifacts, because it seems to be easier. But if there were a good reason to separate repositories ...

monperrus commented 5 years ago

There are three levels of modularity:

repositories
code modules
deployed units

I'm in favor of keeping one unique repository, because I consider a repo as a unit of community and conversation, not as a unit of code.

In Maven, the general rule is that there is a one-to-one relation between code module and deployed unit (one Maven module = one jar), however this one-to-one relation can be questioned, either in the context of Maven, or outside Maven.

The questions to answer first are: what is the maximum size of a jar we can accept? Why? Have we reached that limit already?

nharrand commented 5 years ago

> It is a technical detail from my point of view.

I agree but:

> I'm in favor of keeping one unique repository, because I consider a repo as a unit of community and conversation, not as a unit of code.

Yes that was my main concern too.

pvojtechovsky commented 5 years ago

> The questions to answer first are: what is the maximum size of a jar we can accept? Why? Have we reached that limit already?

My primary reason for splitting Spoon is not the size of the JAR. It is about a message like

The Spoon lib is not a monolith which is tightly bound to heavy Eclipse compiler etc.

but

The Spoon model and algorithms can be successfully used without any extra dependencies

The Spoon model can be used on its own

... so it can become more visible that it contains nice, small, reusable components which can be integrated into other tools too...

I think that the Spoon architecture is quite near to that message.

... but as long as we do not try to compile the AST and related algorithms without the Eclipse compiler libs, etc., we (and others too) cannot be sure about that...

Actually nice Factory links Environment, which is quite Spoon specific ... but it isn't a show stopper.

monperrus commented 5 years ago

> so it can become more visible that it contains nice, small, reusable components

"Nice", "small" and "reusable" are subjective adjectives and subject to endless discussion. A maximum JAR size in MB is objective, and I'd be happy to hear about any other objective criteria we can put in CONTRIBUTING.md to document the decision.

(that said, I'm pretty much in favor of moving the experimental code, such as decompilation, into a submodule)

pvojtechovsky commented 5 years ago

Example: If the JavaParser community decides (they cannot, but it is an example) to use the Semantic AST model and related algorithms in their project, they have to include a dependency on Spoon. That will pull in many unwanted transitive dependencies, which have to be manually excluded on their side. Then they have to find a correct set of interfaces/implementations which still work for them without these extra dependencies. That is not minor work: it can be done only by an experienced Spooner, it has to be repeated for each such community, and it should be checked after each release of Spoon. The client community also cannot be sure that such a solution will work in the next versions of Spoon, because we do not know which parts of Spoon must stay independent, and we can easily bring a new unwanted dependency there.

And because Spoon doesn't deliver such small independent artifacts, there are probably no (or only a few) communities who are using it this way, because of the high effort and the risk that such an approach fails in the future.

It is my guess/feeling. I can be wrong of course.

stefanleh commented 5 years ago

Being able to decide which parts of Spoon to use and depend on is a benefit in any case. Of course modularity can be exaggerated, but splitting into logical and coherent units makes sense. Reusability also often leads to a bigger community, as Pavel also stated in his initial post.

monperrus commented 5 years ago

I fully agree with @stefanleh.

Modularization must be lagom, neither monolith nor exceedingly small.

pvojtechovsky commented 5 years ago

So which parts might we think about?

Here is a list of very small modules:

M1) Spoon AST (including Factory, change listener, Visitors, CloningScanner, ReplaceScanner, Query engine, pretty printer...). It needs only a few external dependencies.
M2) All algorithms working with the AST only (like Pattern, Refactor, Sniper printer, ...). It needs only a few external dependencies.
M3) Maven launcher
M4) Building of the AST by the Eclipse JDT compiler
M5) Building of the AST from runtime. Maybe this must be part of M1 ... because of many internal dependencies?
M6) Building of the AST of methods from runtime using a decompiler

What else?

I do not think we should make a JAR for each module. Instead, we should combine related modules into JARs that make sense. Something like 4 Spoon artifacts:

A) M1 + M2 + M5 - Semantic AST model + algorithms. Light code - only a few dependencies
B) A + M4 - minimal Spoon - dependency on the Eclipse compiler
C) B + M3 - Maven support - dependency on Maven
D) B + M6 - bytecode decompiler - dependency on decompiler(s)
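The composition above can be checked mechanically. The following is a small sketch, assuming only the A to D definitions from this comment, that expands each proposed artifact into the modules it would bundle. The class name `ArtifactPlan` and its methods are made up for illustration and have nothing to do with Spoon's build.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: expands the proposed artifacts A-D into modules M1-M6. */
final class ArtifactPlan {
    private ArtifactPlan() {}

    /** Each artifact lists the modules or previously defined artifacts it is built from. */
    static final Map<String, List<String>> PARTS = new LinkedHashMap<>();
    static {
        PARTS.put("A", List.of("M1", "M2", "M5"));
        PARTS.put("B", List.of("A", "M4"));
        PARTS.put("C", List.of("B", "M3"));
        PARTS.put("D", List.of("B", "M6"));
    }

    /** Returns the sorted set of modules that end up in the given artifact. */
    static Set<String> modulesOf(String artifact) {
        Set<String> modules = new TreeSet<>();
        for (String part : PARTS.get(artifact)) {
            if (PARTS.containsKey(part)) {
                modules.addAll(modulesOf(part)); // nested artifact: expand recursively
            } else {
                modules.add(part);               // plain module
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        for (String artifact : PARTS.keySet()) {
            System.out.println(artifact + " = " + modulesOf(artifact));
        }
    }
}
```

Expanding the definitions this way shows that C and D both contain everything in B, and that only A is free of M4, the Eclipse compiler module.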

WDYT?

monperrus commented 5 years ago

In the long term, we may arrive at this solution. But I would go gradually, otherwise it will be a nightmare for pull-requests and for clients to update their dependencies correctly.

So gradually, one by one, we could start with:

spoon-core (the name in Maven Central) = A+B+C
spoon-decompiler = D

After our beloved baby PRs, this is a ... baby modularization step

pvojtechovsky commented 5 years ago

> But I would go gradually, otherwise it will be a nightmare for pull-requests

Yes! I would delete my Spoon workspace and delete my GitHub account ;-) if somebody merged a PR which moves files into a different project ... all my WIP PRs would be hard to merge then. So we should wait for a good time for that move.

monperrus commented 5 years ago

Now that we have spoon-core + two submodules, we can consider that the modularization architecture is in place.