I understand that Go's regexp packages uses finite automaton. Working on uap-rust, I learned that uap-core's extensive use of large bounded repetition (.{x,y} with a large y) causes a state explosion in FA engines (which makes sense), you can see some of it at https://github.com/rust-lang/regex/discussions/1206#discussioncomment-9976284 as I asked the rust-regex author for assistance (I was investigating the large memory use of my rust-regex-based implementation compared to a C++ re2 implementation).
Assuming Go's regexp is inspired by or derived from re2, it should be much less sensible than rust-regexp, and should not have the performance traps / cliffs of jitted DFA, but I would still expect that the bounded repetitions significantly increase memory use for no value.
Experimentally, in uap-rust I found that converting repetitions with an upper bound of 100 or more (so 3+ digits) had no functional implication, with all of those having been added to mitigate catastrophic backtracking. Below 100 things are more risky, I found at least one regex which uses a functional repetition with an upper bound of 50.
I understand that Go's regexp packages uses finite automaton. Working on uap-rust, I learned that uap-core's extensive use of large bounded repetition (
.{x,y}
with a large y) causes a state explosion in FA engines (which makes sense), you can see some of it at https://github.com/rust-lang/regex/discussions/1206#discussioncomment-9976284 as I asked the rust-regex author for assistance (I was investigating the large memory use of my rust-regex-based implementation compared to a C++ re2 implementation).Assuming Go's regexp is inspired by or derived from re2, it should be much less sensible than rust-regexp, and should not have the performance traps / cliffs of jitted DFA, but I would still expect that the bounded repetitions significantly increase memory use for no value.
Experimentally, in uap-rust I found that converting repetitions with an upper bound of 100 or more (so 3+ digits) had no functional implication, with all of those having been added to mitigate catastrophic backtracking. Below 100 things are more risky, I found at least one regex which uses a functional repetition with an upper bound of 50.