Enhancement: [no-unnecessary-type-parameters] Special case tuples and parameter arrays as a "single use" generic type

danvk commented 2 months ago

Before You File a Proposal Please Confirm You Have Done The Following...

[X] I have searched for related issues and found none that match my proposal.
[X] I have searched the current rule list and found no rules that match my proposal.
[X] I have read the FAQ and my problem is not listed.

My proposal is suitable for this project

[X] I believe my proposal would be useful to the broader TypeScript community (meaning it is not a niche proposal).

Link to the rule's documentation

https://typescript-eslint.io/rules/no-unnecessary-type-parameters/

Description

For purposes of counting whether a generic type parameter is used twice, T[] / Array<T> should only count as using it once. I'm not aware of any realistic scenario where this would lead to a false positive.

We decided to defer this to a follow-up in the initial implementation (see links below). So here's an official request for the follow-up.

Fail

declare function getLength<T>(xs: readonly T[]): number;

Pass

declare function getLength(xs: readonly unknown[]): number;

Additional Info

See https://twitter.com/jfet97/status/1810703565280759964 and https://github.com/typescript-eslint/typescript-eslint/pull/8173#issuecomment-1879695591 for context on why this doesn't work as expected today.

See https://github.com/typescript-eslint/typescript-eslint/pull/8173#discussion_r1461182890 for previous discussion of special-casing Array and friends.

See https://github.com/typescript-eslint/typescript-eslint/pull/8173#pullrequestreview-2006886241 for a contrived example of how this could yield a false positive for Map.

Josh-Cena commented 2 months ago

But what about something like function gimmeEmptyArray<T>(): T[] { return []; }?

danvk commented 2 months ago

But what about something like function gimmeEmptyArray<T>(): T[] { return []; }?

Why write gimmeEmptyArray<T>() rather than [] as T[], or just []?

kirkwaiblinger commented 2 months ago

To play devil's advocate,

[] as T[]

This introduces unnecessary unsoundness with as, theoretically permitting refactoring mistakes like const s = {} as string[]; to be introduced.

or just []?

const s = [];
s.push('foo'); // TS ERROR - Argument of type 'string' is not assignable to parameter of type 'never'.

Either of the following would work, though -

const s1: string[] = [];

// note that this is basically identical to gimmeEmptyArray<string>()
const s2 = new Array<string>();

In any case, I would say my interpretation of @Josh-Cena's comment is not "but how else would you initialize an empty array?", but instead "this is a simple, easy-to-follow example of a pattern that could lead to a false positive, and some instances of which might have real world applicability". So, the question stands, of whether it's better to err on the side of false positives or false negatives.

danvk commented 2 months ago

News to me that {} as string[] is allowed. Yikes!

I see two problems with this example, though:

const s = [];
s.push('foo'); // TS ERROR - Argument of type 'string' is not assignable to parameter of type 'never'.

First, if you plug this into the playground, you'll see that TS allows it! An empty array is one of a few special patterns that trigger Evolving any behavior. The type of s after the last line is string[].

Second, if you need the type annotation for some reason, is there an argument for const s = gimmeEmptyArray<string>(); rather than const s: string[] = [];?

The only argument for gimmeEmptyArray<string>() I can think of is that it can appear in an expression:

const arr = [].concat(['a', 'b', 'c']);
//             ~~~~~~ Error: No overload matches this call.

const arr2 = gimmeEmptyArray<string>().concat(['a', 'b', 'c']);  // ok

On the other hand, why write that instead of ['a', 'b', 'c'] or [...arr] if you have another array to copy?

Josh-Cena commented 2 months ago

@kirkwaiblinger That's about what I was saying. My argument for this does not just apply to string[]; I'm thinking the more generic case with factory functions of empty containers. I don't know exactly how the rule is supposed to behave, when the empty container type uses the type argument once or many times, but I think it should behave consistently and should not have special cases for Array.

The gimmEmptyArray itself isn't too useful since you also have the array constructor. But let's imagine that it's part of a bigger collection of utilities and is offered for uniformity, or it also supports other overloads.

On the other hand, why write that instead of ['a', 'b', 'c'] or [...arr] if you have another array to copy?

I can offer several use cases for concat. Not very related to what we are talking about here though:

It works when each argument is string | string[].
It allows you to contextually check the type for each argument: if you do const arr: string[] = [...multiple, ...array, ...stuff], you don't know which one is invalid if the assignment fails, but you do with Array<string>().concat(multiple, array, stuff).

kirkwaiblinger commented 2 months ago

Off topic, but

First, if you plug this into the playground, you'll see that TS allows it! An empty array is one of a few special patterns that trigger Evolving any behavior. The type of s after the last line is string[].

I thought so, too, until I plugged it into the playground. It appears to be tsconfig dependent, with strictNullChecks alone not triggering the evolving any for some reason.

danvk commented 2 months ago

I thought so, too, until I plugged it into the playground. It appears to be tsconfig dependent, with strictNullChecks alone not triggering the evolving any for some reason.

You also need to set noImplicitAny.

danvk commented 2 months ago

I'm thinking the more generic case with factory functions of empty containers. I don't know exactly how the rule is supposed to behave, when the empty container type uses the type argument once or many times

Factory functions are definitely an interesting case. Something like:

function dict<T>(): Map<string, T> { return new Map(); }

seems fine and does save you some typing. It would be helpful if there were some way to indicate that a type parameter must be specified explicitly and not inferred, which I think is what you want in the factory function case. (NoInfer can't do this.) I guess if you fail to specify the type parameter to gimmeEmptyArray, you'll get unknown[], which is hard enough to work with that you'll hopefully get an error down the road. So maybe the rule is that a type parameter that only appears as the argument to a "singular" generic type is OK, so long as it's in the return type?

I think it should behave consistently and should not have special cases for Array.

Arrays are special in JS / TS, though. They have special syntax to construct them, tsc special-cases them in several places (e.g. evolving array types), and I believe it's impossible to create an object literal that's assignable to an array in the way that you can for other objects (see my "contrived example" in the original issue for Map). So it wouldn't be that surprising if arrays were special for this rule, too.

There are clearly concerns about arrays, but are there any objections to treating tuple types as "singular?" The idea there is to disallow code like:

declare function takeOne<T>(arg: [T]): void;
declare function returnPair<T>(): [T, string];

The empty container issue isn't relevant here, since the type of an empty tuple is [], and that can't reference a type parameter. Maybe breaking out that part of #9536 would make the change less controversial?

jfet97 commented 2 months ago

Aren't factory functions of that type instances of "return-only generics"? Shouldn't we treat them cautiously anyway? I remember one golden rule about this.

Josh-Cena commented 2 months ago

In general, yes, because they are a form of assertion, but in this case it's sound and probably necessary.

jfet97 commented 2 months ago

So this would be a safe exception to another restrictive version of this rule as well, one that would completely disallow return-only generics.

Currently if you just wrap the returned type parameter in whatever generic type you want, the rule is fine with it. It could be the case that this should always be disallowed, and situations like these would become manual exceptions anyway.

kirkwaiblinger commented 2 months ago

For reference, this is probably the section of the handbook being referenced: https://www.typescriptlang.org/docs/handbook/2/functions.html#type-parameters-should-appear-twice

danvk commented 2 months ago

So this would be a safe exception to a more restrictive version of this rule, one that would disallow completely return-only generics.

Right. function makeEmptyMap<K, V>() is safe, but function parseYAML<T>(input: string): T is incredibly dangerous because it disguises a type assertion. So returning something "empty" seems like an exception to avoiding return-only generics, but good luck telling what counts as "empty" in an automated lint rule :/

Currently if you just wrap the returned type parameter in whatever generic type you want, the rule is fine with it. It could be the case that this should always be disallowed in the return position, and cases like these would become manual exceptions.

This is stepping way, way back, but are there any general guidelines about when to accept a false positive as a limitation of a rule, and when to take it as evidence that something should not be a rule? I feel like the existing lint rules are all over the place on this.

Josh-Cena commented 2 months ago

are there any general guidelines about when to accept a false positive as a limitation of a rule, and when to take it as evidence that something should not be a rule

I don't think there are fixed guidelines. It depends on empirical evidence of whether the false positives would account for a significant portion of code people write or if such cases are "fishy" in the first place. I would say it's better to be safer here and prefer false negatives to false positives, and gradually add more error conditions over time when we feel they are well-defined enough.

JoshuaKGoldberg commented 2 months ago

So returning something "empty" seems like an exception to avoiding return-only generics, but good luck telling what counts as "empty" in an automated lint rule :/

Yeah. For example, we can't even rely on functions being single-line:

That's a confusing, generally unintuitive delineation
Sometimes folks put logs or other cannot-easily-be-identified things in there

Roughly copying some apps that track resource allocations aggressively:

function createSet<T>() {
  myFancyLogger("Making a new set...");
  return new Set<T>();
}

function createSetLabeled<T>(label: string) {
  myFancyLogger("Making a new set: " + label + "...");
  return new Set<T>();
}

We also can't be sure that allowing just the built-ins such as Array and WeakMap would be sufficient. Folks have a habit of making their own data structures and wanting to treat them the same as built-ins. MyFancyLinkedList<T> is treated as a first-class equivalent to Array<T> or Set<T> in some projects.

All that being said, I would be in favor with us adding in an opt-in, nothing-by-default option to allowlist certain types. It could use the standard format like prefer-readonly-parameter-types > allow. Then users can mention built-ins like Map<K, V> and their own fancy classes like MyFancyHashTable<K, V>.

Adding in an allowlist would let us learn from public use how users use this, and give folks an escape hatch for the surprising limitations of the rule.

danvk commented 2 months ago

It does feel like we're heading towards adding an option, doesn't it? The allow syntax looks nicely general, and I think it would work well here with one possible extension (see below).

I want to reference one of my favorite blog posts here: Why not add an option for that? If there are enhancements we can make universally, without adding an option, it will make this rule easier to use and maintain.

There are two possible improvements here that no one has identified any downsides for:

Treating tuple types as singular.
Treating array types in a parameter slot (not return type) as singular.

If nobody can think of a downside to these, then they'd be nice, option-free additions this rule. The tuple change in particular seems uncontroversial.

Assuming we are going to add something like the allow option:

We'd have to bikeshed the name, but maybe something like singularTypes.
I'd like to be able to specify { singularTypes: { from: 'lib' } }, i.e. leave the name field unspecified:

https://github.com/typescript-eslint/typescript-eslint/blob/d9dba42e742eeaee2f6cb74afc31e945479749b7/packages/type-utils/src/TypeOrValueSpecifier.ts#L13-L16

The idea would be to mark every generic type in lib as "singular." I implemented this in a branch. (I didn't know about typeDeclaredInLib, otherwise I would have used it!)

The new errors in the test file were a mix of factory functions (makeSet<T>) and unsafe, return-only generics (fetchJson). The new errors it flagged in the project I tested it on were all unsafe, return-only generics. Here were a few type signatures:

loadJson<T>(path: string): Promise<T>
queryOne<T>(db: Database, sql: SQLStatement): Promise<T | null>
readJsonFromCloud<T>(bucket: Bucket, path: string): Promise<T | null>

There were no new false positives. This feels validating and makes me think that { singularTypes: { from: 'lib' } } could eventually be a good "strict" version of this rule.

JoshuaKGoldberg commented 2 months ago

singularTypes

Similar to https://github.com/typescript-eslint/typescript-eslint/pull/8765#discussion_r1623457327, it would be really nice if we can find a name that's:

One word, not two
A word already accepted as the common term for this

...but I can't think of anything better than singularTypes. 👍 if nobody else can.

I'd like to be able to specify { singularTypes: { from: 'lib' } }, i.e. leave the name field unspecified:

👎 from me on this for now. There are a bunch of lib types that we wouldn't want this for. This issue's original premise of Array<T> is one of them. Renaming https://github.com/typescript-eslint/typescript-eslint/issues/9529#issuecomment-2218752810:

function createStack<T>(): T[] {
  return [];
}

const stack = createStack<string>();

stack.push("abc");
stack.pop(); // $ExpectType string | undefined

Similar with: Map, Set, and others.

Treating tuple types as singular.

Tentative 👍. I can't think of a code snippet where specifically [T] would be used in a way that doesn't violate this rule or need an as. Can anybody else? If not, I think we can accept this as a special-case for the rule.

loadJson<T>(path: string): Promise<T> & co.

Similarly tentative 👍. Maybe there's something around Promise.reject or similar? But I can't think of any reasonable code snippets to execute that. Can anybody else?

Josh-Cena commented 2 months ago

Treating tuple types as singular.

Treating array types in a parameter slot (not return type) as singular.

These two do sound like the greatest common factor that we can agree on.

jfet97 commented 2 months ago

@Josh-Cena I aligned my PR with these common factors, and added a test.

Relevant commit

Documentation not yet aligned, I commited this just to get some feedback

danvk commented 2 months ago

I'd like to be able to specify { singularTypes: { from: 'lib' } }, i.e. leave the name field unspecified:

👎 from me on this for now. There are a bunch of lib types that we wouldn't want this for. This issue's original premise of Array<T> is one of them. Renaming #9529 (comment):

While returning an empty array with a single-use T[] is sound, let's also remember that if your function can return any value other than [], it's unsafe. So there are both true positives and false positives.

I'm curious which (if any) interfaces in lib are intended to be used as a contextual type for object literals, similar to the ItemProps<T> case that we found with the original PR (see https://github.com/typescript-eslint/typescript-eslint/pull/8173#issuecomment-1879695591). Does typescript-eslint have anything like TypeScript's top400 / top800 test suite? It would be really interesting to get some data on how many true/false positives there are with the super-aggressive lib version of this rule.

loadJson<T>(path: string): Promise<T> & co.

Similarly tentative 👍. Maybe there's something around Promise.reject or similar? But I can't think of any reasonable code snippets to execute that. Can anybody else?

I guess it depends how broad we want to be about our interpretation of "factory function." Since TypeScript doesn't track Promise error types, Promise.reject can just return Promise<never> rather than Promise<T>. But what about:

async function emptyPromise<T>(): Promise<T[]> { return []; }
async function nullPromise<T>(): Promise<T | null> { return null; }

That second example makes me wonder about this:

function getNull<T>(): T | null { return null; }

This looks fishy. Why not just have a null return type? I think the difference between this and T[] is that null is immutable whereas T[] is not. That makes me think that returning readonly T[] could also be invalid:

// invalid: may as well return readonly never[]
function factory<T>(): readonly T[] { return []; }

JoshuaKGoldberg commented 1 month ago

which (if any) interfaces in lib are intended to be used as a contextual type for object literals

Good question. I don't know that we have a formal list; the definition kind of wavers depending on the context.

Similar with built-in collection classes: but Array, Map, Set, and their Readonly* versions are the main ones that come to mind.

anything like TypeScript's top400 / top800 test suite

No but that would be great. We've wanted to have more comprehensive end-to-end testing of downstream & upstream libraries for a while (e.g. #1752)... but haven't had time to make much progress.

That makes me think that returning readonly T[] could also be invalid

👍 agreed.

JoshuaKGoldberg commented 1 month ago

Anyway, coming over from https://github.com/typescript-eslint/typescript-eslint/pull/9536#discussion_r1708442887, I just want to confirm: what do we have consensus on? My interpretation is two things:

https://github.com/typescript-eslint/typescript-eslint/issues/9529#issuecomment-2223224801, making truly safe things as safe:
- Treating tuple types as singular
- Treating array types in a parameter slot (not return type) as singular
Adding an opt-in option for things like arrays

...where those two list items would be separate PRs (so the second would likely spin out into its own issue).

Is that right?

typescript-eslint / typescript-eslint