microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
99.87k stars 12.36k forks source link

String case change methods should return intrinsic string manipulation types #44268

Open thw0rted opened 3 years ago

thw0rted commented 3 years ago

lib Update Request

Configuration Check

My compilation target is ES2018 and my lib is ["ES2016", "dom"].

Missing / Incorrect Imprecise Definitions

At least toUpperCase and toLowerCase methods on String, maybe/probably also the Locale versions?

Sample Code

type LittleT = "t" | "tt";
type BigT = Uppercase<LittleT>;
declare const smol: LittleT;
const big: BigT = smol.toUpperCase(); // err, string is not assignable to "T" | "TT

Documentation Link

https://tc39.es/ecma262/#sec-string.prototype.tolowercase

Note

I'm not sure about the Locale versions of these methods because I don't know what algorithm the "intrinsic" Uppercase<T> / Lowercase<T> helper types are required to follow. (Do we need separate LocaleUppercase / LocaleLowercase helpers?)

Also worth mentioning: I think what I'm looking for is a return type of e.g. Uppercase<this>, which shouldn't have an impact on non-literal string types because Uppercase<string> is just string.

Nixinova commented 3 years ago

Related:

type Char = 'a' | 'A';

let char: Char = 'a';
char = char.toUpperCase();
//^ Error: Type 'string' is not assignable to type 'Char'.

String methods should be type <X extends string>(str: X): X not (str: string): string.

kohlmannj commented 3 years ago

I have a generated string union type definition like this:

export type Align =
  | 'LEFT'
  | 'CENTER'
  | 'RIGHT'
  | 'DEFAULT';

I also have an existing object literal export containing CSS-in-JS classNames that correspond to Lowercase<Align>:

export const textAlignClasses = {
  left: css``,
  center: css``,
  right: css``,
  /** Avoids TS7053 when using array notation */
  default: undefined,
};

Unfortunately, I cannot currently use .toLowerCase() on a string of type Align to safely key into textAlignClasses:

const textAlign: Align = 'DEFAULT';
const textAlignClass = textAlignClasses[textAlign.toLowerCase()];
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                       TS7053: Element implicitly has an 'any' type because expression of type 'string'
                       can't be used to index type '{ left: string; center: string; right: string; default: undefined; }'.
                       No index signature with a parameter of type 'string' was found on
                       type '{ left: string; center: string; right: string; default: undefined; }'.

I am current working around this limitation with a type assertion:

const textAlign: Align = 'DEFAULT';
const textAlignClass = textAlignClasses[textAlign.toLowerCase() as Lowercase<Align>];
Nixinova commented 2 years ago

String methods should be type <X extends string>(str: X): X not (str: string): string.

Hm, this doesn't seem possible without making the entire String interface generic...

thw0rted commented 2 years ago

@Nixinova it's only those 4 methods (upper/lower plus "locale" versions of same) that would need to change. None of them take a function argument, so the function can't (usefully) be generic.

My initial thought was, for non-primitive types, you can use this in a return value, but I'm actually not sure it's possible with a boxed type. In Playground, I tried interface String { toUpperCase(): Uppercase<this>; }, but this is not assignable to primitive, little-s string, which is the generic constraint of the Uppercase helper type.

I guess that means, the change I'm asking about might have to happen in the compiler rather than the type lib, which I recognize makes it less likely to happen.

DetachHead commented 2 years ago

i tried extending the String interface to add this but no luck since the base String interface doesn't use a generic

declare global {
    //TS2428: All declarations of 'String' must have identical type parameters.
    interface String<T extends string> {
        toLowerCase(): Lowercase<T>
        toUpperCase(): Uppercase<T>
    }
}

const foo = 'FOO'.toLowerCase() //TS2339: Property 'toLowerCase' does not exist on type '"FOO"'.
ajafff commented 2 years ago

Just use a generic this type on the method signatures:

interface String {
    toUpperCase<T extends string>(this: T): Uppercase<T>;
}
thw0rted commented 2 years ago

Thanks @ajafff , I thought I'd tried that earlier without success, but plugging my example into Playground works perfectly. Does this mean a PR would be as simple as changing the signature of those 4 methods?

bfaulk96 commented 2 years ago

Any updates on this? looks like someone started implementing but then quit in December

RyanCavanaugh commented 2 years ago

This issue is "awaiting more feedback". This means we'd like to hear from more people who would be helped by this feature, understand their use cases, and possibly contrast with other proposals that might solve the problem in a simpler way (or solve many other problems at once).

If this feature looks like a good fit for you, please use the :+1: reaction on the original post. Comments outlining different scenarios that would be benefited from the feature are also welcomed.

thw0rted commented 2 years ago

If it helps to have a more detailed use case, I'm trying to exchange data between two systems, and I have an interface that describes the shape of the JSON returned / expected by their REST APIs. One uses lowercase keys, the other uses UPPERCASE, and I want the type checker to be happy with a case-change method, without having to add type assertions. If I do barKey = fooKey.toUppercase() as keyof BarData, I run the risk of missing the error when the structure of one or the other changes in an incompatible way.

bfaulk96 commented 2 years ago

Thanks for the information, Ryan! My use case is very similar to the one provided by thw0rted

Oblosys commented 2 years ago

Maybe it doesn't really constitute feedback, but there was a StackOverflow question about this: https://stackoverflow.com/questions/71646698/modified-template-literal-types/

mycarrysun commented 2 years ago

This issue is "awaiting more feedback". This means we'd like to hear from more people who would be helped by this feature, understand their use cases, and possibly contrast with other proposals that might solve the problem in a simpler way (or solve many other problems at once).

If this feature looks like a good fit for you, please use the 👍 reaction on the original post. Comments outlining different scenarios that would be benefited from the feature are also welcomed.

Wanted to comment as you mentioned you wanted to hear if others find this useful. I was searching for a workaround for this exact scenario so yes it helps me a lot. We are eslint banning type assertions in our code base so would rather not have to do type assertions and augmenting built in types is not a great solution. Plus this is the expected behavior since this is how the function actually works.

tomerghelber-tm commented 1 year ago

I have a question that seems related, but if other methods like trim could keep the original string tye (Uppercase if was Uppercase, string if was string, and Lowercase if was Lowercase)? Same with substring, and split.

The tighter the types the better.

DetachHead commented 1 year ago

@tomerghelber-tm my ts-helpers package includes a trim function that keeps track of the value at compiletime, as well as many others:

const foo = trim('  foo  ') // type is "foo"

it would be cool if these were in the lib types, but i imagine performance is a concern with stuff like this

MauritsWilke commented 1 year ago

This issue still persists, is there anything planned? Also just curious, is there a reason this is hard to implement?

thw0rted commented 1 year ago

It would be neat if trim worked too, though I assume there are weird edge cases in the spec for what counts as "whitespace". Is there an intrinsic type that already handles this? (If not, maybe there should be?)

ethanresnick commented 1 year ago

I'd love this too. My use case is that a third party library returns values in lowercase, and I'm uppercasing them to match against pre-existing object keys. Currently, code looks like:

const result = getResult(); // returns a lowercase literal type string value
const lookupResult = LookupTable[result.toUpperCase() as Uppercase<typeof result>];

Just seems ugly

mycarrysun commented 1 year ago

I'd love this too. My use case is that a third party library returns values in lowercase, and I'm uppercasing them to match against pre-existing object keys. Currently, code looks like:

const result = getResult(); // returns a lowercase literal type string value
const lookupResult = LookupTable[result.toUpperCase() as Uppercase<typeof result>];

Just seems ugly

https://github.com/microsoft/TypeScript/issues/44268#issuecomment-943860077 this is a workaround that's better than manually casting the type

ethanresnick commented 1 year ago

Thanks for the tip. Still seems like this should be built in.

mycarrysun commented 1 year ago

Thanks for the tip. Still seems like this should be built in.

Agreed - that's what this issue is about is making it a built in type definition

ethanresnick commented 1 year ago

@RyanCavanaugh Since this issue is "awaiting more feedback" as opposed to "help wanted", I guess that means that the TS team thought that updating the typings might not be a good idea for some reason? Can you spell out what that reason is? I'm really struggling to see the downside of this change.

RyanCavanaugh commented 1 year ago

All features incur risk and maintenance cost, especially the ones you think won't

bfaulk96 commented 1 year ago

While I understand the desire for caution, that seems like a non-answer to @ethanresnick's question

RyanCavanaugh commented 1 year ago

All features start at minus 100; the default answer to any feature request is "no" in the absence of a strong case in favor of it. Looking into these features and exploring their potential downsides is itself a cost, as evidenced by the time I'm spending constructing this comment itself (both in terms of typing it and the research I did to create it). We can't ship every feature all at once without creating chaos, so we are really only able to spend finite engineer time and risk on things that provide really clear upside, which I don't think this proposal does.

The presumed declaration we'd add is

interface String {
    toUpperCase<T extends string>(this: T): Uppercase<T>;
    toLowerCase<T extends string>(this: T): Lowercase<T>;
}

Now, notably, you can already do this in userland today by writing this exact snippet anywhere in the global scope. So in that sense, the feature is there and available for anyone who wants to use it; no one is blocked.

Should this then be the default behavior by removing the other overloads? What would this break? Here's an example you could imagine (using 2 here so we can just try it without overload weirdness)

interface String {
    toUpperCase2<T extends string>(this: T): Uppercase<T>;
    toLowerCase2<T extends string>(this: T): Lowercase<T>;
}

const m = "hello";
const arr = [m.toLowerCase2()];
arr.push("world"); // <- now an error
ethanresnick commented 1 year ago

@RyanCavanaugh I agree that the value of the proposed change here is relatively small, in that this it's serving a fairly rare use case and, as you pointed out, 'fixing' this is possible in userland already. So I appreciate you taking the time to construct the example.

My read of that example, though, is that it might be highlighting a different TS issue. It shows that a type produced by Uppercase/Lowercase is treated differently than a literal string type for the same string; i.e., this example compiles just fine:

const m = "hello";
const arr = [m];  // m has a literal type, but arr is still inferred as string[]
arr.push("world");

I think a lot of people would find it quite weird/unintuitive that arr gets typed as string[] in the example above, but as "hello"[] when m gets passed through Lowercase first. My expectation certainly would've been that [m.toLowerCase2()] also results in arr having type string[]. So maybe the example is showing something undesirable about how Uppercase/Lowercase interact w/ contextual typing? I did try to do some research in the issues/PRs to see if I could find anything documented about the rationale for that difference (if it's intentional), but I didn't find anything.

If that different treatment for Uppercase/Lowercase is desirable, though, than I think I agree with you that changing the default definition for toLowerCase and toUpperCase might be more breaking than helpful.

RyanCavanaugh commented 1 year ago

I think the root cause is that Uppercase and Lowercase, when given a widening literal type, give back a non-widening literal type. They could likely be changed to preserve the widening status of their input type, though (predicting downstream consequences of doing that is beyond the reach of my mental model of the world). Maybe there's a constructable example that shows why doing that would be desirable, not sure. Our intent for Uppercase and Lowercase was really only to do type-system-land operations so their behavior interacting with literal values isn't well-trod ground.

ethanresnick commented 1 year ago

@RyanCavanaugh Makes sense. I’ll open an issue about having these intrinsic types preserve the widening status of their input, to at least start a discussion about that — though I’ll also have to try to think of an example first.

If the intrinsic types were changed like that, would you support changing the toLowerCase/toUpperCase definitions per this thread? Even if the signatures proposed here are only better in rare cases, it seems like those cases are common enough (given the thumbs up on this issue) that these revised signatures ought to be the default, if we can’t come up with another example of them causing any problems

wbt commented 1 year ago

Another example use case is in https://github.com/microsoft/TypeScript/issues/50548#issuecomment-1329586849 where even what is supposed to be a highly simplified example still requires a casting function:

//Bonus issue(#44268): the cast in the return statement of the next line should be automatic & unnecessary:
const toUpperCaseTyped = function<S extends string>(strIn: S) {return strIn.toUpperCase() as Uppercase<S>;};
tontonsb commented 1 year ago

I'll add some straightforward examples for thought. I came across some of these while I was trying to ensure that I've done the necessary .toUpperCase() somewhere along the way.

If we have this function

function greet(name: Uppercase<string>): void {
    console.log(name)
}

Here's how the actual behaviour compares to what, in my opinion, would be natural to expect:

Is an error Is fine
Should be an error greet('Joe')
3 as Uppercase<string>
greet('Joe' as Uppercase<string>)
(x: string) => x as Uppercase<string>
(x: Uppercase<string>) => x as Lowercase<string>
'JOE' as Uppercase<string> as Lowercase<string>
Should be fine greet('Joe'.toUpperCase()) greet('JOE')

The most counter-intuitive things are that 'Joe'.toUpperCase() (an actual uppercase string) is not accepted as Uppercase<string>, but 'Joe' as Uppercase<string> (not a real uppercase string) is. That's probably outside the scope of this issue, but conversion as Uppercase<string> should, at the very least, be incompatible with the other cases.

DetachHead commented 1 year ago

@tontonsb

(x: string) => x as Uppercase<string>

casting string to Uppercase<string> shouldn't be an error for the same reason casting string to "foo" isn't an error, because it's valid (though unsafe) to narrow types this way.

casting should only cause an error if the types don't overlap though (ie. casting string to number). so casting Lowercase<string> to Uppercase<string> should be an error imo. i'm surprised it isn't

wbt commented 1 year ago

@DetachHead Case transformations aren't very simple, but this issue seems like it should be quite readily solvable.

NBMSacha commented 1 year ago

Any news on this?

Gsiete commented 7 months ago

This issue is "awaiting more feedback". This means we'd like to hear from more people who would be helped by this feature, understand their use cases, and possibly contrast with other proposals that might solve the problem in a simpler way (or solve many other problems at once).

If this feature looks like a good fit for you, please use the 👍 reaction on the original post. Comments outlining different scenarios that would be benefited from the feature are also welcomed.

Hey, how many thumbs up would you need to consider solving this issue?

RyanCavanaugh commented 7 months ago

There is no specific number; all open feature suggestions are considered holistically. Given that you're already free to opt into this in your project today (see prior comments), it's not a particularly high-priority thing.