Closed leomylonas closed 4 months ago
Thank you very much for this pull request and for sharing the results of your benchmarking. We will review it within the next few days.
Thanks @kolovos. As far as I can tell, all of the tests are passing. Happy to discuss the changes in more detail with you if you want.
I confirm that all the tests passed for me as well, when run locally through our build-and-test.sh
script. For a moment, I thought we didn't have a test for when we have a protected region end and another start on the same line, but we do cover it in TestCommentBlockPartitioner
(testPartitionTwoProtectedRegions
and testPartitionComplex
) :-).
I'll continue reading through the code, but so far it looks good. It's a clear case of reducing the size of the problem by changing from a multiline regex to just matching on a line-by-line basis with some stateful processing.
Dimitris: it looks like we never allowed a protected region to start and end on the same line. The original code already required the comment that started the protected region to end the line.
The code looks good! I have made some tweaks to avoid magic constants in group numbers, used Java 7 try-with-resources instead of a manual close(), and made the code slightly safer against nulls. I think it's ready to be merged.
We found an exponential increase in the time it takes to generate files based on the number of protected regions. This PR rewrites the
CommentBlockPartitioner
to be much more performant.Some benchmarks below