This would involve updating the static parser to give more detailed feedback and inference for regex literals embedded in strings.
The general goals would be:
1. If a RegExp would throw an error when instantiated via new RegExp, we should try to have a corresponding type error describing the problem.
2. If a RegExp implies anything about the contents of a matching string that can be expressed using template literals, we should infer that type.
I think there may be limitations on (2) in particular (lookaheads etc. could be complicated?), but we can go for the low-hanging fruit, e.g. literal strings embedded in the regex.
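For reference, the runtime behavior that goal (1) would mirror can be observed directly. The helper below is a hypothetical name used only for illustration and is not part of any library. It relies only on the standard behavior that new RegExp throws a SyntaxError for an invalid pattern.

```typescript
// Hypothetical helper: returns the message thrown by `new RegExp(source)`,
// or undefined when the source is a valid pattern.
function regexSyntaxError(source: string): string | undefined {
  try {
    new RegExp(source)
    return undefined
  } catch (e) {
    // Invalid patterns throw a SyntaxError; anything else is stringified as-is.
    return e instanceof SyntaxError ? e.message : String(e)
  }
}

// The opening parenthesis is never closed, so construction throws.
const unterminated = regexSyntaxError("(unterminated_group")

// A well-formed pattern yields no error.
const wellFormed = regexSyntaxError("^foo.*bar$")
```

A static version would ideally surface the same class of problem as a type error, without needing to reproduce the engine's exact message.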
Here are some basic cases that should be covered:
// should be a parse error
type("/(unterminated_group/")
// should be inferred as `${string}foo${string}`
type("/foo/")
// should be inferred as `foo${string}`
type("/^foo/")
// should be inferred as `${string}foo`
type("/foo$/")
// should be inferred as `foo${string}bar`
type("/^foo.*bar$/")
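The basic cases above could be approximated with plain template literal inference. This is only a rough sketch under stated assumptions, not how the static parser works: all type names are hypothetical, flags after the closing slash are not handled, only ^, $ and .* are understood, and any fragment containing another metacharacter (including an escaped \$) is conservatively widened to string.

```typescript
// Hypothetical type names throughout; none of these exist in the library.

// Characters with special meaning in a regex. A fragment containing any of
// them is widened to `string` rather than treated as literal text.
type Meta =
  | "\\" | "[" | "]" | "(" | ")" | "{" | "}"
  | "|" | "?" | "+" | "*" | "." | "^" | "$"

type Literal<s extends string> =
  s extends `${string}${Meta}${string}` ? string : s

// Split on `.*` and join the literal pieces with `${string}`.
type InferBody<src extends string> =
  src extends `${infer head}.*${infer tail}`
    ? `${Literal<head>}${string}${InferBody<tail>}`
    : Literal<src>

// A trailing `$` anchors the end; otherwise anything may follow.
type InferEnd<src extends string> =
  src extends `${infer body}$` ? InferBody<body> : `${InferBody<src>}${string}`

// A leading `^` anchors the start; otherwise anything may precede.
type InferSource<src extends string> =
  src extends `^${infer rest}` ? InferEnd<rest> : `${string}${InferEnd<src>}`

// Strip the surrounding slashes from a "/.../" definition.
type InferRegex<def extends string> =
  def extends `/${infer src}/` ? InferSource<src> : never

// Mutual assignability check used to compare inferred and expected types.
type Same<a, b> = [a] extends [b] ? ([b] extends [a] ? true : false) : false

// Each entry only type-checks as `true` if the inference matches.
const inferenceChecks: [
  Same<InferRegex<"/foo/">, `${string}foo${string}`>,
  Same<InferRegex<"/^foo/">, `foo${string}`>,
  Same<InferRegex<"/foo$/">, `${string}foo`>,
  Same<InferRegex<"/^foo.*bar$/">, `foo${string}bar`>,
  Same<InferRegex<"/^a\\d?b$/">, string>
] = [true, true, true, true, true]
```

Widening unknown syntax to string keeps the sketch sound for the cases it does not understand, at the cost of precision.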
Cases like these could also work. However, they will quickly create types that are not performant to use, so if we end up supporting them, we should ideally have an internal way to bail out and return string once the expression gets too complex:
// should be inferred as `${string}${0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9}${string}`
type("/\\d/")
// should be inferred as `a${0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ""}b`
type("/^a\\d?b$/")
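One possible shape for such a bail-out is a tuple used as a budget: each expanded \d consumes one entry, and once the budget is empty the whole result falls back to string. This is an illustrative sketch only. ExpandDigits, its budget parameter, and the default budget of three are assumptions, and the input is taken to be the pattern body with slashes and anchors already stripped.

```typescript
// Hypothetical type names; the budget size of 3 is an arbitrary assumption.
type Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

// `acc` accumulates the expanded prefix. `budget` shrinks by one entry per
// expanded `\d`; when it is empty the result is widened to `string`.
type ExpandDigits<
  src extends string,
  budget extends unknown[] = [1, 1, 1],
  acc extends string = ""
> =
  src extends `${infer head}\\d${infer tail}`
    ? budget extends [unknown, ...infer remaining]
      ? tail extends `?${infer rest}`
        ? ExpandDigits<rest, remaining, `${acc}${head}${Digit | ""}`>
        : ExpandDigits<tail, remaining, `${acc}${head}${Digit}`>
      : string
    : `${acc}${src}`

// Mutual assignability check used to compare expanded and expected types.
type Same<a, b> = [a] extends [b] ? ([b] extends [a] ? true : false) : false

// Each entry only type-checks as `true` if the expansion matches.
const digitChecks: [
  Same<ExpandDigits<"a\\d?b">, `a${Digit | ""}b`>,
  Same<ExpandDigits<"\\d">, `${Digit}`>,
  // Four digit classes exceed the budget of three, so the result is `string`.
  Same<ExpandDigits<"\\d\\d\\d\\d">, string>
] = [true, true, true]
```

Returning string for the whole expression, rather than for just the remaining suffix, keeps the fallback cheap: the large union built so far is discarded instead of being carried into the final type.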
This inference wouldn't work with standard raw regex literals like:
// can't narrow because TS just types it as RegExp
type(/^foo.*/)
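The difference is visible in the types TypeScript assigns to each form. In this small illustration the variable and type names are made up. The only facts relied on are that a regex literal has type RegExp, whose source property is declared as string, while a const string keeps its literal type.

```typescript
// A regex literal is typed as RegExp, so its pattern text is lost statically.
const fromLiteral = /^foo.*/
type LiteralSource = typeof fromLiteral.source // string

// A const string keeps its literal type, so the pattern is visible to types.
const fromString = "/^foo.*/"
type StringDef = typeof fromString // "/^foo.*/"

// Mutual assignability check used to confirm both observations.
type Same<a, b> = [a] extends [b] ? ([b] extends [a] ? true : false) : false

const literalChecks: [
  Same<LiteralSource, string>,
  Same<StringDef, "/^foo.*/">
] = [true, true]
```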