hvesalai / emacs-scala-mode

The definitive scala-mode for emacs
http://ensime.org
GNU General Public License v3.0

Fix exponential backoff in multiline string regexp #157

Open abgruszecki opened 4 years ago

abgruszecki commented 4 years ago

Hello there!

We've found an issue where Emacs freezes when, with scala-mode enabled, you scroll to line https://github.com/lampepfl/dotty/blob/master/compiler/src/dotty/tools/dotc/typer/Namer.scala#L1199. If you put the point at the beginning of the trailing """ and call scala-syntax:looking-at-simplePattern-beginning, Emacs will freeze. I've traced this down to the scala-syntax:multiLineStringLiteral-re regexp.

The middle of the replaced regexp was a greedy match, so matching (or failing to match) a very long literal took exponential time.

After making the * lazy, Emacs no longer freezes.

I tried to make the change as unintrusive as possible. scala-syntax:multiLineStringLiteral-start-re could probably be changed as well. Let me know if I should continue to dig into this; I'd be willing to do it if I had some help.

CLAassistant commented 4 years ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

hvesalai commented 4 years ago

Thank you for the PR. Yes, please modify scala-syntax:multiLineStringLiteral-start-re as well; it doesn't have to be greedy.

abgruszecki commented 4 years ago

Just checked: making that regex lazy breaks highlighting. Sample: https://github.com/lampepfl/dotty/blob/master/compiler/test/dotty/tools/vulpix/ParallelTesting.scala#L95. Everything from that line onwards is highlighted as a string. Does this ring a bell? I'll look into it later.

hvesalai commented 4 years ago

Ah, I was wrong in my analysis. The start re actually matches the whole string, not just the start.

\\(\"?\"?[^\"]\\)* This says: repeat any number of times, each time skipping over anything that is not """ (so "" followed by something else is OK).
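
To make the behaviour of that fragment concrete, here is a minimal sketch in Python rather than Emacs Lisp. It assumes the fragment translates directly to Python's `re` syntax and looks at it in isolation, without the rest of the real scala-mode regexp; the names `BODY` and `consumed` are made up for this illustration.

```python
import re

# Python rendering of the Emacs fragment \\(\"?\"?[^\"]\\)*
# Assumption: only this fragment is modelled, not the full scala-mode regexp.
# Each iteration takes up to two quotes followed by one non-quote character.
BODY = re.compile(r'(?:"?"?[^"])*')


def consumed(text):
    """Return the prefix of `text` that the body pattern consumes."""
    return BODY.match(text).group(0)


# One or two quotes followed by another character are skipped over.
print(repr(consumed('a "b" c')))
print(repr(consumed('a ""b')))

# Three quotes in a row cannot be consumed, so the match stops before them.
print(repr(consumed('ab""" tail')))
```

Single and double quotes inside the literal are consumed, and the repetition only stops when it reaches three quotes in a row (or a quote with nothing after it).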

hvesalai commented 4 years ago

Now, if you make that reluctant, then it will stop before the end of the multi-line string, and thus the next """ (which is supposed to end the string) is seen as a start of the next string.
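
A small Python sketch of this effect, under the assumption that the start regexp is an opening """ followed by the body fragment quoted above (the actual definition in scala-mode may differ). `GREEDY`, `LAZY`, `source` and `match_end` are hypothetical names used only here.

```python
import re

# Assumed shape of the start regexp: opening """ followed by the body.
GREEDY = re.compile(r'"""(?:"?"?[^"])*')
# Same pattern with the repetition made lazy (reluctant).
LAZY = re.compile(r'"""(?:"?"?[^"])*?')

source = 'val s = """hello "world" """ + x'
start = source.index('"""')  # position of the opening delimiter


def match_end(pattern, text, pos):
    """Return the index where `pattern`, anchored at `pos`, stops matching."""
    return pattern.match(text, pos).end()


greedy_end = match_end(GREEDY, source, start)
lazy_end = match_end(LAZY, source, start)

# Greedy: runs through the whole body and stops right before the closing """.
print(repr(source[start:greedy_end]))
# Lazy: nothing after the repetition forces it to continue, so it matches
# only the opening delimiter and leaves the body and closing """ unmatched.
print(repr(source[start:lazy_end]))
```

With nothing after the lazy repetition to force it forward, the lazy variant is satisfied with zero iterations, so the closing """ is left over and can be picked up as the start of a new string.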