Optimise Scanner performance

Liquidsoul commented 5 years ago

This is a rebase of #207

✅ What I did:

squash of the commits
integrate the upstream changes about range reporting to fix the tests during the rebase process
remove Swift 3 from the build matrix

❌What I did not do:

benchmark if there is an actual performance improvement

Liquidsoul commented 5 years ago

@yonaskolb PR rebased and entry added to the changelog

yonaskolb commented 5 years ago

So I did some benchmarking using SwagGen as a test harness and using XCTest to measure just the stencil file generation

Results in seconds:

0.11.0 - 0.24
207 - 0.15
0.12.0 - 5.9
228 (this pr) - 4.5

So while this PR improves things slightly it seems something was added after the original #207 that slowed down things even more.

After some more investigation it seems #167 caused the slowdown before: 0.25 after: 6.0

yonaskolb commented 5 years ago

@ilyapuchka as the author of #167 do you have any other insights here?

yonaskolb commented 5 years ago

Interestingly just timed #225 and it was 0.37s

ilyapuchka commented 5 years ago

@yonaskolb I suppose something around here might have caused performance issues https://github.com/stencilproject/Stencil/pull/167/files#diff-91493e167a3f6240a7cc916d9504d439R25

ilyapuchka commented 5 years ago

I noticed recently that parsing things like func some() {{% some tag %} fails due to usage of {{%. I see that there are some changes to the parsing of starting tokens here and wonder if this is still the case with this change and, if so, if it can be addressed @Liquidsoul

Liquidsoul commented 5 years ago

I've tried to tokenize func some() {{% if %} using this updated Lexer and the existing one. Here is the results:

the new lexer does not handle this well and crashes even.
the old lexer fails to parse the if tag but does not stop the execution.

So the new lexer does solve the issue and introduces a new problem. Instead of failing to capture the tag as in the old one the new lexer crashes. I see that as a blocking regression.

The fact that we did not have a testing context showing the improvement in #207 makes it hard to see this as a working optimisation. And since the original author does not seem to use Stencil anymore maybe we should just drop this entirely, see if this reported again and just address the existing performance issue.

What do you think?

ilyapuchka commented 5 years ago

I agree, we should better address performance regression introduced by #167

ilyapuchka commented 5 years ago

@yonaskolb can you create a PR with your performance tests?

djbe commented 5 years ago

228 has been merged so this should be next.
228 introduced a performance loss in this commit: 93ccc56
Investigate if String/Substring is the performance issue, and how to solve this.

djbe commented 5 years ago

@yonaskolb You've mentioned a few times that you timed the tests. Are you just running them locally and checking how long yourself (with time diff or something), or do you have an actual test case we could run on CI?

yonaskolb commented 5 years ago

I've just been timing locally using an XCTest in SwagGen while pointing to specific Stencil commit. I don't know of a way to run performance tests on CI as to fail on performance regressions you need to set a baseline and that won't be stable across different machines

djbe commented 5 years ago

I don't think we need a test that "fails", more a test that might report the change, either in CI, or here in the PR. For example, SwiftLint reports performance changes in the PR: https://github.com/realm/SwiftLint/pull/2396.

djbe commented 5 years ago

Now that whole process would be a separate PR, but just for this commit, so we can test locally, a simple test that runs a few times, and logs the average time would be nice.

AliSoftware commented 5 years ago

We definitely need to configure Danger on those repos at some point ❤️

ilyapuchka commented 5 years ago

@Liquidsoul I created #235 for mentioned issue

djbe commented 5 years ago

So, going through the comments again:

167 (error log improvements) ~~added a massive slowdown, we should try to figure out why and solve this in a separate PR.~~ Solved with #230.
This PR seems to have a net positive effect, similar to the old #207.
Both this and the old #207 PR add a bug with {{%, this needs to be solved before merging this.
We need to decide what {{% ... should do, and add a test for it (see #235).
We need some form of performance tests to keep track of things.

Liquidsoul commented 5 years ago

@djbe the {{% error already exists in the current code base, this is not something that's introduced by this change. However, as I said here, this PR introduces a crash instead of a parsing failure when there is a syntax error in the stencil file. So as I said, if we want to implement this optimisation, we may wish to rewrite this to ensure that no regressions are introduced.

djbe commented 5 years ago

I get what you mean, but a crash is an order of magnitude worse. And like I said, it’s not really an error unless you define what the syntax should be for such edge cases.

djbe commented 5 years ago

Right, so I completely missed #230, which fixed the performance issues from #167 (error improvements).

I've rebased again, and rewritten the PR a bit to squeeze as much performance as I can. I've also fixed the crash, so parsing will continue when it encounters {{%. One thing of note is that the range = ... calculations (added with the error improvements) also impact performance a bit, so I've tried minimising those. Funnily enough, switching to Substrings where possible has a negative impact on performance 🤔.

Using @yonaskolb's test method (SwaGen) this goes from (an average of) 0.656s to (an average of) 0.419s.

yonaskolb commented 5 years ago

The new test will have to be updated to the Spectre 9.0 format

djbe commented 5 years ago

@yonaskolb What do you mean the new format? The only difference that I saw in that PR, was just an XCTestCase class added in each file, which this PR already has since it's been rebased on master.

ilyapuchka commented 5 years ago

Can we add a performance test?

djbe commented 5 years ago

I don't know, you can only really test it with really (really) large templates, or a whole bunch of templates. Maybe leave that for a separate PR?

ilyapuchka commented 5 years ago

we could just add whatever project you are using right now to measure improvement and use its templates in a test?

djbe commented 5 years ago

I'm using SwaGen (what @yonaskolb was using), it's a whole project, with a whole bunch of templates, and the test isn't perfect as it's a full render, not just the lexer.

djbe commented 5 years ago

I've added a performance test based on the template in #224, repeated a few times 😆.

There's no reporting yet, we'll leave that for a different PR (Danger).

djbe commented 5 years ago

@yonaskolb @ilyapuchka any final thoughts? I'd be ideal to have this in for 0.13.

stencilproject / Stencil

Optimise Scanner performance #226

207 - 0.15

228 (this pr) - 4.5

228 has been merged so this should be next.

228 introduced a performance loss in this commit: 93ccc56

167 (error log improvements) added a massive slowdown, we should try to figure out why and solve this in a separate PR. Solved with #230.