Open srenault opened 3 years ago
@sjrd you remember anything about that?
The "possessive quantifier" is necessary in that case on the JVM:
scala> val longish = "X" * 3000
val longish: String = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
scala> val r = raw"""(X|Y)*+""".r
val r: scala.util.matching.Regex = (X|Y)*+
scala> r.findAllMatchIn(longish).map(m => m.group(0).length()).toList
val res4: List[Int] = List(3000, 0)
scala> val r = raw"""(X|Y)*""".r
val r: scala.util.matching.Regex = (X|Y)*
scala> r.findAllMatchIn(longish)
java.lang.StackOverflowError
at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4832)
Your stack may vary.
Your regex library may different on other platforms; the linked PR is for supporting JS. ("Java regular expressions support possessive quantifiers, but JavaScript does not," says an undergrad web page.)
https://next.sonarqube.com/sonarqube/coding_rules?open=java%3AS5998&rule_key=java%3AS5998
https://docs.oracle.com/javase/tutorial/essential/regex/quant.html
A good interview question would be, "So are you greedy, reluctant, or possessive?"
Has anyone tried to determine whether it's possible here for us to both be JS-friendly, and avoid consuming so much stack on the JVM?
reproduction steps
using Scala 2.12 or 2.13
problem
In the process of upgrading scala from 2.11 to 2.13, we are now hitting a StackOverflow error when using the scala parser combinator with the exact same code. After investigating that issue, it turns out this is related to a change made on the stringLiteral parser regex.
In scala 2.11, the regex was
("\""+"""([^"\p{Cntrl}\\]|\\[\\'"bfnrt]|\\u[a-fA-F0-9]{4})*+"""+"\"").r
and now it's("\""+"""([^"\x00-\x1F\x7F\\]|\\[\\'"bfnrt]|\\u[a-fA-F0-9]{4})*"""+"\"").r
.The commit that changed the regex is scala/scala-parser-combinators@98737a2#diff-bce6263cece0f67933d12d2f709294696de6c553a335a8123819f0f97aa6587bL52.
The PR description states:
We also removed an invalid (and useless) + at the end of the regexp.
I'm not sure to get why the
+
is invalid. Can someone give me more details?