microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
101k stars 12.48k forks

Template strings: Negated match #49867

Open jamiebuilds opened 2 years ago

jamiebuilds commented 2 years ago

Suggestion

📃 Motivating Example

Strings, in particular key names, sometimes affect types based on the format of the string, most commonly with prefixes. One such example is the recent W3C Design Tokens spec which uses a $ prefix to reserve key names like $description and $type.

Right now it's possible to create a positive match such as:

type CssCustomPropertyName = `--${string}`

But there's no way to create a negative match. Or in regex terms:

(?!--)   # not `--`

The goal here wouldn't be to recreate all the functionality of regex/parsing, it would be to handle the stringy-typing sometimes seen in JavaScript (including within official web specifications).

⭐ Suggestion

Note: This is one option, I'm not particularly tied to it and could suggest alternatives if this is not workable.

Expose, in a limited capacity, the not operator (#29317) so that it can be used to filter strings.

type Name = string & not `--${string}`

let a: Name = "--x" // err!
let b: Name = "--" // err!
let c: Name = "x" // ok
let d: Name = "x--" // ok
let e: Name = "-x" // ok

There is an existing PR adding a not operator, but it has long been stalled over questions about expected behavior. Perhaps just this one piece could be added now and slowly expanded later, if desired.
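For comparison, here is a minimal sketch of how the five cases above can be approximated today. The names `NotDashed`, `plainName`, and `isPlainName` are hypothetical and not part of TypeScript or this proposal. The conditional type only rejects a literal at a generic call site; it cannot be written as a standalone type annotation, which is the gap the suggestion is about.

```typescript
// Hypothetical helper: resolves to `never` when the literal S starts with "--".
type NotDashed<S extends string> = S extends `--${string}` ? never : S;

// Hypothetical function: S is inferred from the argument, so a literal that
// starts with "--" is checked against `never` and rejected at compile time.
function plainName<S extends string>(name: NotDashed<S>): string {
  return name;
}

// Runtime counterpart for values that are only known to be `string`.
function isPlainName(value: string): boolean {
  return !value.startsWith("--");
}

const accepted = plainName("x--"); // ok
// plainName("--x");               // compile-time error
```

A value typed as plain `string` still passes `plainName`, so the runtime check remains necessary for non-literal input.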

💻 Use Cases

The W3C Design Tokens spec does not allow token/group names starting with a $ or containing any of the characters .{}.

Using negated string matches, you could correctly type this:

type Name = string & not `$${string}`

interface Group {
    $description?: string;
    $type?: string;
    [key: Name]: Token | Group
}

The closest you can get to this today is:

type LowerAlpha = 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm' | 'n' | 'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
type UpperAlpha = Uppercase<LowerAlpha>
type Numeric = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
type AlphaNumeric = LowerAlpha | UpperAlpha | Numeric
type SafeSymbols = "-" | "_" | " "

type NameStart = AlphaNumeric | SafeSymbols
type Name = `${NameStart}${string}`

type Group = {
    $description?: string;
    $type?: string;
} & {
    [key in Name]: Group
}

let group: Group = {
    $description: "",
    $type: "",
    "my cool name": {}
}
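Note that the workaround above only constrains the first character: it rejects valid names that start with any other character (for example "émile") and does not check for the characters .{} at all. A runtime guard can cover the rule as stated in this issue. This is a sketch under that assumption only; `isValidTokenName` and `FORBIDDEN_CHARS` are hypothetical names, and the sketch does not decide whether an empty name is valid.

```typescript
// Characters that, per the rule quoted in this issue, may not appear anywhere in a name.
const FORBIDDEN_CHARS: readonly string[] = [".", "{", "}"];

// Hypothetical guard: true when `name` does not start with "$" and
// contains none of the forbidden characters.
function isValidTokenName(name: string): boolean {
  if (name.startsWith("$")) {
    return false;
  }
  return !FORBIDDEN_CHARS.some((ch) => name.includes(ch));
}
```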


RyanCavanaugh commented 2 years ago

If we presuppose not T as the type negation operator, then it's tempting to support this (initially) via

type Name = `${not "--"}${string}`;

which we can parse only in the context of a ${ ... } placeholder. Thoughts?

jamiebuilds commented 2 years ago

type Name = `${not "--"}${string}`;

@RyanCavanaugh That was my initial idea, but presumably the type not "--" also matches an empty string "".

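So given the string "--abc" and the template type `${not "--"}${string}` you could argue that ${not "--"} matches the empty string "" and ${string} matches the rest, "--abc", so the string would be accepted.

Negating an entire pattern is less ambiguous: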

string & not `--${string}`

RyanCavanaugh commented 2 years ago

I think the empty string matching argument is correct under one interpretation of how that should work. I guess the problem is that if not "x" exists, then ${not "x"} needs to mean something but I agree it's super unclear what given that "" is a valid ${string}.

Is it as simple as for ${not ("a" | "bb")}, then the entire containing template is a not-match if s.indexOf("a") === 0 || s.indexOf("bb") === 0 ? But then it's unclear how many characters to consume.
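The rule floated here can be written down as a runtime predicate. This is only a sketch of that one reading ("reject if the string starts with any excluded alternative"); `matchesNegatedPrefix` is a hypothetical name, and the sketch deliberately says nothing about how many characters the placeholder would consume, which is the open question.

```typescript
// Hypothetical runtime analogue of `${not ("a" | "bb")}${string}`:
// the whole string is a non-match if it starts with any excluded alternative.
function matchesNegatedPrefix(s: string, excluded: readonly string[]): boolean {
  return !excluded.some((prefix) => s.indexOf(prefix) === 0);
}

const excludedPrefixes = ["a", "bb"];
const rejected = matchesNegatedPrefix("bbc", excludedPrefixes); // false
const allowed = matchesNegatedPrefix("bab", excludedPrefixes);  // true
```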

I feel like we have to handle it at a placeholder level since the template strings can already be nested, so we'd have to have answers for what it means to write

type Part = not `${'a' | 'b'}`;
type Whole = `${Part}${Part}${Part}`;

jamiebuilds commented 2 years ago

Maybe someone else does, but I don’t have any particularly more complicated cases than the one I mentioned. So I would accept a solution that simply doesn’t allow you to interpolate negated strings into other template types.

type A = not "a"
type B = `${A}` // invalid

fatcerberus commented 2 years ago

w.r.t. the empty string issue, isn't it already the case that when there are consecutive placeholders, each one except for the last must consume at least one character (in order to enable recursive ${head}${tail} style template string types, IIRC)? So something like ${not "--"}${string} shouldn't be an issue.

RyanCavanaugh commented 2 years ago

Consecutive placeholders will consume one character at a time if present, but will still produce "":

type Three = `${string}${string}${string}`;
const a: Three = "mm";

is considered to be effectively "m", "m", ""

fatcerberus commented 2 years ago

Right, but the empty string can only be produced by the final one unless there aren't enough characters to begin with. That's why I said "except for the last" above. In other words, ${not "--"}${string} should never be able to produce, e.g., "", "--foo"
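The consumption behavior discussed in this exchange can be observed with `infer` in a conditional type. The sketch below assumes current compiler behavior for adjacent placeholders; `SplitThree` and `Equal` are hypothetical helper names. Each annotated constant compiles only if the two types are identical.

```typescript
// Splits a string literal type across three adjacent placeholders.
type SplitThree<S extends string> =
  S extends `${infer A}${infer B}${infer C}` ? [A, B, C] : never;

// Common type-equality helper (hypothetical name).
type Equal<X, Y> =
  (<T>() => T extends X ? 1 : 2) extends (<T>() => T extends Y ? 1 : 2)
    ? true
    : false;

// "mm": the first two placeholders take one character each, the last gets "".
const twoChars: Equal<SplitThree<"mm">, ["m", "m", ""]> = true;

// "abcd": the last placeholder takes everything that remains.
const fourChars: Equal<SplitThree<"abcd">, ["a", "b", "cd"]> = true;
```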

jamiebuilds commented 2 years ago

Thinking about this more, I wonder if a lot of the concerns about the original not proposal could be addressed by starting with a more limited feature only for primitive types and literals:

type NonEmptyString = string & not ""
type NonZeroNumber = number & not 0
type NonSymbolPropertyKey = PropertyKey & not symbol
interface Nums {
  label: string
  [index: string & not "label"]: number
}
interface DesignTokens {
  $type: string
  $description: string
  [index: string & not `$${string}`]: any
}
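For reference, here is a sketch of what is and is not expressible today for two of these examples. Removing a whole primitive from a union already works with `Exclude`. Excluding a single key from an index signature does not, so the usual fallback is to widen the index signature until it also admits the named property's type, which loses precision.

```typescript
// Expressible today: drop `symbol` from `string | number | symbol`.
type NonSymbolPropertyKey = Exclude<PropertyKey, symbol>;

// Fallback for `Nums`: the index signature must also accept `label`'s type,
// so every other key is typed as `string | number` rather than `number`.
interface Nums {
  label: string;
  [index: string]: string | number;
}

const key: NonSymbolPropertyKey = 42;
const nums: Nums = { label: "totals", a: 1, b: 2 };
```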

From the original pull request (#29317), from what I can gather, these were the primary concerns: (cc @DanielRosenwasser)

  1. Intuition: Some use-cases don't behave intuitively

// (this is an error because objects in TS are not 'exact')
let o: not { bar: string } = { foo: "" } // Error!

  2. Consistency: Can this feature compose with every other feature in TS and get expected results?

  3. Complexity: This is going to have to be considered in every future change to TypeScript, and could get in the way of more useful features.

  4. Use cases: A lot of the use-cases for this feature can already be implemented in other (less declarative) ways

Flexibility around primitive and literal types is the main use case that I'm concerned with, and I think I would actually prefer if TypeScript told me when I was trying to do silly stuff like not any because that sounds like a mistake to me.

matthew-dean commented 1 year ago

Why not just allow regex types instead of re-inventing the wheel of regex? There are plenty of use cases. For example, creating a type that only allows a valid CSS hex color as a value is extremely non-trivial.
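To illustrate the gap, the usual fallback today is a runtime regex check combined with a branded type, since the type system cannot express the pattern directly. This is a sketch only; `HexColor`, `HEX_COLOR`, and `isHexColor` are hypothetical names, and the pattern assumes the 3-, 4-, 6-, and 8-digit hex notations.

```typescript
// Hypothetical branded type: a string that has passed the runtime check below.
type HexColor = string & { readonly __brand: "HexColor" };

// "#" followed by exactly 3, 4, 6, or 8 hexadecimal digits.
const HEX_COLOR = /^#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$/;

// Type guard: narrows `value` to HexColor when the pattern matches.
function isHexColor(value: string): value is HexColor {
  return HEX_COLOR.test(value);
}
```

The brand only records that the check ran; nothing stops an unchecked cast, which is part of why a built-in regex type is attractive.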

verikono commented 1 year ago

@matthew-dean : Regex would be fricken sensational.

There is a library for this, but performance may be problematic: https://github.com/didavid61202/type-level-regexp