Open trusktr opened 4 years ago
No problem. On the contrary, thank you for so many issues. Actually, I described it in detail in docs why do we need this behavior. Thank you for the issues ^_^.
Say, in my project which is soundly type checked by Hegel I'd create a user space type MyVector
and allow only the operations not leading to a sparse array. What is not currently possible (AFAIK) in the user space is expressing the connection between i
and the nums.length
. On the other hand, the simple use of the for of
loop might help avoiding the mess with i
🙃
const numbers: Array<number> = [1, 2, 3];
let intSequence = "";
numbers.length = 5;
for (let i = 0; i < numbers.length; i++) {
// TypeError: Cannot read property 'toFixed' of undefined
intSequence += numbers[i].toFixed();
}
This example actually has the wrong behavior.
Assignment to numbers.length
should change the type of numbers
to Array<number | undefined>
Changing the length
field is turning a normal array into a "holey array"
A for loop should "just work" TM without an if check in the body, if the array type is "refined" into the larger type of Array<T | undefined>
upon assignment of length
aka "continous array" => "Holey array" then you can still make for
loops work.
someArray.length = 0
You can make an exception that assigning the NumberLiteral 0
to .length
does not convert it into a "holey array" since it truncates to an empty array of type Array<number>
@Raynos you never know where in the code this assignment may happen, it may happen not only in current scope. That's why we need that check
This is correct. An array can be mutated in the body of a for loop which invalidates safe access.
I guess bounds checking is hard :( ( https://go101.org/article/bounds-check-elimination.html ).
There is no way to avoid this check, unless we go into the realm of dependent types
Would the following be ok ?
for (let i=0; i<nums.length; i++) {
const n = nums[i] // must be a number
// Arbitary statements that do not reference the `nums` identifier.
}
Aka a loop from 0 to len and the only reference to nums
in body of the loop is nums[i]
Any other method call on nums
would invalidate type nums[i]
is known to not be undefined
.
I guess that does require lookahead in the type checker and a two pass check over the body of the loop
It's a real shame javascript does not have for (const value, index of nums)
syntax -.-
Sometimes you need access to the index of the loop and doing for (const [index, value] of nums.entries())
allocates a temporary array and has destructuring overhead.
I guess the same behavior
The bounds check is one issue, but there is another deeper one: the “holey” arrays. For a JS developer a value of type ‘Array’ can always have holes in it. I always expect hidden ‘undefined’s to be there while not ever checking this case in the my everyday programming life 🤦♂️😅
Would it be useful to have two inbuilt types Array
and SparseArray
to represent this ?
Would it be useful to have two inbuilt types
Array
andSparseArray
to represent this ?
I also like the idea of having two distinct types here. Following the rule of the least surprise I'd say that the Array
type should stay sparse by default and a more refined NonSparseArray
(or Vector
) can be a nice addition to the standard type library.
Sounds good. I think we can implement a safe Vector
type which will give you the ability to use the Array
as a fixed-length collection (like $Immutable<Array<unknown>>
, but give you an option to set value in defined positions.
What do you think?
A fixed length array is rarely used in javascript
The amount of usage of const arr = new Array(MY_SIZE)
vs const arr = []
shows that 99% of the time people use []
and let the JS engine autogrow the array upon insertion.
No. I talk about the next thing.
const vector: Vector<number, 10> = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
vector.pop(); // Error
vector.length = 4 // Error
vector[0] // number
vector[0] = 4;
vector[11] // Error
vector[11] = 4; // Error
I guess fixed size arrays are nice for constants but that only solves the loop issue for fixed size arrays.
How do you for loop over a dynamic sized array ? aka HttpAgent#freeSockets
Also Vector
reminds me people of std::Vector
in C++ ( http://www.cplusplus.com/reference/vector/vector/ ). Might be confusing name for a fixed length array.
Vectors are sequence containers representing arrays that can change in size.
Probably not the best name. The only good example of a fixed size array in JS is ArrayBuffer
( https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer ).
Following the rule of the least surprise I'd say that the Array type should stay sparse by default
How many people actually use sparse arrays? I've never used one, and have never encountered it, and have never had a runtime error from it (maybe I've gotten lucky).
Personally I'd prefer to have non-sparse Array
and make array.length = anythingOtherThanZero
a type error. We don't have to hold JS user hands, they can learn the difference. After all, we're not changing JS behavior, only restricting how we use it; that's what Hegel's goal is so far.
Vector type which will give you the ability to use the Array as a fixed-length collection
I'm sure some people would want to use it, those into immutable libs for example. It just needs a different name than Vector
to not confuse with std::vector
which is explicitly meant to be a variable-length list.
It would certainly make static analysis of the loop easier in those cases.
I think most people would like a solution that works with variable-length arrays.
It is in how we modify the length that things can go wrong. If we use array.push(n)
where then n
is type checked, there's no problem if it increases the length... outside of a loop.
The problem is what about modification of the array length (f.e. splicing out items) inside a loop? Fixed-length arrays would make it easy to know this doesn't happen.
For variable-length arrays, maybe the type checker can detect whether or not length modification happens in the loop, then A) if confidently detected that not, require no refinement, otherwise B) require refinement?
Example:
const a = [1,2,3]
a.push(4) // ok
a.length = 50 // would be a type error
let n2: number = 0
for (const n of a) {
n2 = n // ok
console.log(n2)
}
for (const n of a) {
// reduces the confidence of a.length not being modified to less than 100%
someFunction(a)
n2 = n // type error, number | undefined is not assignable to number
console.log(n2)
}
for (const n of a) {
// explicitly taints confidence of a.length not being modified
a.splice(...)
n2 = n // type error, number | undefined is not assignable to number
console.log(n2)
}
const n is a const variable. It’s type is literally constant due to assignment being an error.
It’s only array index accessor that can have an out of bounds access returning undefined or return a hole
In for-of loops, const n
is recreated on each iteration, and it can contain a hole, therefore the type of const n
in for (const n of a)
is number | undefined
. So because n2
is type number
, we can not assign n
to it in the second and third for-of examples due to certainty of a not containing holes being lost (number | undefined
can not be assigned to number
in that case).
In the first for-of loop, if we cut out the other stuff,
const a = [1,2,3]
a.push(4) // ok
let n2: number = 0
for (const n of a) {
n2 = n // ok
console.log(n2)
}
we know with certainty that there are no holes in the array, so in that case const n
is number
.
const n
is the result of what is effectively the same thing as an index access. You can see this logs undefined
in your console:
let a = [1,,3,4]
a.length = 5
for (const n of a) console.log(n) // logs: 1, undefined, 3, 4, undefined
What I'm saying is, it is possible to statically analyze the following:
const a = [1,2,3]
a.push(4)
let n2: number = 0
for (const n of a) {
n2 = n // ok, no error
}
const a = [1,2,3]
a.length = 50
let n2: number = 0
for (const n of a) {
n2 = n // type error, number | undefined is not assignable to number
}
const a = [1,2,3]
a.length = 2
let n2: number = 0
for (const n of a) {
n2 = n // ok, no error
}
const a = getAnArrayFromSomewhere()
a.length = 2
let n2: number = 0
for (const n of a) {
n2 = n // type error, number | undefined is not assignable to number
}
const a = [1,2,3]
let n2: number = 0
for (const n of a) {
n2 = n // ok, no error
}
const a = [1,2,3]
let n2: number = 0
for (const n of a) {
someFunction(a)
n2 = n // type error, number | undefined is not assignable to number
}
In some cases, if we statically analyze someFunction
or getAnArrayFromSomewhere
, then we may know that the array must not have holes. For example:
function someFunction(a: Array<number>) {
a.push(42)
}
const a = [1,2,3]
let n2: number = 0
for (const n of a) {
someFunction(a)
n2 = n // ok, no type error
}
vs
function someFunction(a: Array<number>) {
a[Math.floor(Math.random() * 100)] = 42
}
const a = [1,2,3]
let n2: number = 0
for (const n of a) {
someFunction(a)
n2 = n // type error, number | undefined is not assignable to number
}
or
function getAnArrayFromSomewhere() { return [1,2,3] }
const a = getAnArrayFromSomewhere()
a.length = 2
let n2: number = 0
for (const n of a) {
n2 = n // ok, no type error
}
or
function getAnArrayFromSomewhere() { return [1,2,3] }
const a = getAnArrayFromSomewhere()
a.length = 50
let n2: number = 0
for (const n of a) {
n2 = n // type error, number | undefined is not assignable to number
}
We can see that in all those cases, we know with certainty whether an iteration will get holes or not.
I didn’t know a for of loop includes undefined for a holey array.
That means both for and for of loops both have x or undefined behavior.
Good catch
I had no idea forEach
skipped holes. :laughing: That's neat! After all these years, it never mattered (or I wasn't aware that it mattered).
Maybe Hegel could check loop headers, and if the iteration variables are written properly, then it can narrow the types of array access to non-
undefined
.For example, in the code
it will not go out of bounds, so there is no check needed like
n != undefined && useNum(n)
.Sorry for opening so many issues! There are lots of ideas. :)