facebook / flow

Adds static typing to JavaScript to improve developer productivity and code quality.
https://flow.org/
MIT License
22.08k stars 1.86k forks source link

The inferred types of array literals are unsound #7222

Open lexi-lambda opened 5 years ago

lexi-lambda commented 5 years ago

As seen here, this program involving a mutated array typechecks, which is unsound:

/* @flow */
const arr = ['1', 0];
arr.sort();
const x: string = arr[0]; // unsound!

As far as I can tell, Flow treats the inferred type of arr sort of like a tuple, so arr[0] has type string and arr[1] has type number. However, it also doesn’t treat it like a tuple, since it allows .sort() and similar mutating methods to be applied to the value.

This only gets worse when I try to deduce what kind of mysterious type Flow is inferring for arr. If I use flow type-at-pos on the binding site (or any of the use sites), it claims it has inferred the type Array<string | number>, but if I add that type as a type annotation, the program (correctly) fails to typecheck, as seen here:

/* @flow */
const arr: Array<string | number> = ['1', 0];
arr.sort();
const x: string = arr[0];
4: const x: string = arr[0];
                     ^ Cannot assign `arr[0]` to `x` because number [1] is incompatible with string [2].
References:
2: const arr: Array<string | number> = ['1', 0];
                             ^ [1]
4: const x: string = arr[0];
            ^ [2]

I thought that perhaps this was a regression of #925, which caused tuple types to print as array types. However, that can’t be true, either, since this program also (correctly) fails to typecheck, since .sort() is (rightfully) illegal on tuples, as seen here:

/* @flow */
const arr: [string, number] = ['1', 0];
arr.sort();
const x: string = arr[0];
3: arr.sort();
       ^ Cannot call `arr.sort` because property `sort` is missing in `$ReadOnlyArray` [1].
References:
2: const arr: [string, number] = ['1', 0];
              ^ [1]

Maybe the typechecker is tracking some refinements on individual array elements in a way I do not understand, which would not be reflected in the type reported by type-at-pos, but even if this is true, then it is still a bug, since .sort() should invalidate those refinements.

lexi-lambda commented 5 years ago

I went digging through the Flow source code for some answers, just out of curiosity, and I think I have a little more understanding about what’s going on. However, disclaimer: I know essentially nothing about how Flow actually works, and this is my first time looking at any of Flow’s code at all, so this could be totally off the mark.

Anyway, I found the place where the type of array literals appears to be inferred, which comes down to these lines here:

https://github.com/facebook/flow/blob/147eb3fe43b407fb70506b750d73734e49241b05/src/typing/flow_js.ml#L10631-L10632

Notably, literals are handled specially, since they get both an element type and a tuple type, whereas non-literal arrays only get an element type. In a sense, it seems like this is a little bit like an ad-hoc intersection, so that the type can be used both ways, absent a type annotation to disambiguate user intent. Of course, it isn’t actually an intersection type, and in particular, the way it’s handled is unsound!

(I imagine much of what I said above is obvious to some or all of the Flow maintainers, but it certainly wasn’t obvious to me! So consider the above a little bit of thinking out loud.)

I don’t know enough about how Flow actually works to have any idea what the right fix is, but it seems to me like the status quo is unacceptably unsound (much more so than the unsoundness that arises from out-of-bounds array accesses). As a random guess, it seems like perhaps the extra tuple information ought to be dropped if any non-read-only methods are invoked on the array, but I don’t know if that actually means anything in the context of how Flow handles type refinements.

As a final note, I found it extremely confusing that Flow infers a type that can’t be printed using type-as-pos because there is no surface syntax to actually write the type down! I guess maybe that’s just a fact of life for Flow, since the internal type system tracks a lot more information than is made visible in the source syntax, but I wish Flow at least had some way to communicate to me that the type it has printed is imprecise.

lexi-lambda commented 5 years ago

One more thing: it looks like this issue has been previously reported in #4702, but I think I’ll still leave this issue open for now, since I think it adds some extra context that the linked issue does not include (and includes the scary word “unsound”, so maybe it will capture just a little more attention 😜).

jbrown215 commented 5 years ago

Thank you for your analysis and debugging here. I'm not sure when we'll get around to fixing this, but the level of detail you went into here will make it much easier for whoever picks this up later.

eilvelia commented 5 years ago

Another example:

const a: [number, string] & Array<number | string> = [0, ''] // it' ok

function f(b: [number, string]) {
  b[0].toFixed(2)
}

function g(b: Array<number | string>) {
  b.pop()
  b.pop()
}

g(a)
f(a)

const a: [number, string] & Array<number | string> = [0, '']

is unsafe, only

const a: [number, string] & $ReadOnlyArray<number | string> = [0, '']

should be allowed.