eclipse / epsilon

Epsilon is a family of Java-based scripting languages for automating common model-based software engineering tasks, such as code generation, model-to-model transformation and model validation, that work out of the box with EMF (including Xtext and Sirius), UML (including Cameo/MagicDraw), Simulink, XML and other types of models.
https://eclipse.org/epsilon
Eclipse Public License 2.0
53 stars 11 forks source link

EGL: partitioning performance improvements #77

Closed leomylonas closed 4 months ago

leomylonas commented 4 months ago

We found an exponential increase in the time it takes to generate files based on the number of protected regions. This PR rewrites the CommentBlockPartitioner to be much more performant.

Some benchmarks below

image

kolovos commented 4 months ago

Thank you very much for this pull request and for sharing the results of your benchmarking. We will review it within the next few days.

leomylonas commented 4 months ago

Thanks @kolovos. As far as I can tell, all of the tests are passing. Happy to discuss the changes in more detail with you if you want.

agarciadom commented 4 months ago

I confirm that all the tests passed for me as well, when run locally through our build-and-test.sh script. For a moment, I thought we didn't have a test for when we have a protected region end and another start on the same line, but we do cover it in TestCommentBlockPartitioner (testPartitionTwoProtectedRegions and testPartitionComplex) :-).

I'll continue reading through the code, but so far it looks good. It's a clear case of reducing the size of the problem by changing from a multiline regex to just matching on a line-by-line basis with some stateful processing.

Dimitris: it looks like we never allowed a protected region to start and end on the same line. The original code already required the comment that started the protected region to end the line.

agarciadom commented 4 months ago

The code looks good! I have made some tweaks to avoid magic constants in group numbers, used Java 7 try-with-resources instead of a manual close(), and made the code slightly safer against nulls. I think it's ready to be merged.