editorial proposal: never change type of aliases

michaelficarra commented 3 years ago

I propose that wherever we currently re-assign an alias in algorithm steps where that assignment changes the type of value (type is a loose term here, but I think we have an intuitive understanding of which values are similar and which are different), we instead introduce a new alias. This is especially useful in scenarios where the alias is introduced in the parameter list of an AO and given a type in the summary. A reader who is looking at the declared type would probably expect the alias to have that type at any point in the algorithm.

ljharb commented 3 years ago

Can you share some examples where this doesn't already hold?

michaelficarra commented 3 years ago

Unfortunately it's going to be pretty difficult for me to find these occurrences. What prompted the issue was @bakkot and I being momentarily tripped up by one case when pairing yesterday. @bakkot Do you happen to remember that case? Maybe if we use ecmarkup to find assignments to aliases that were declared in the parameter list, we'd find at least the latter kind. No idea how we'd generally find type mismatches without significantly more advanced tooling.

Anyway, what I'm looking for here is agreement that these cases should be considered an editorial bug and should be fixed whenever we notice them.

devsnek commented 3 years ago

Hypothetically:

Let data be getData() // bytes
Set data to decode(data) // value

Does this fall into "similar" values?

ljharb commented 3 years ago

I feel like in a general sense I agree it might be better to use aliases in cases like these, but I'm not comfortable committing to do so in advance.

jmdyck commented 3 years ago

I propose that wherever we currently re-assign an alias in algorithm steps where that assignment changes the type of value (type is a loose term here, but I think we have an intuitive understanding of which values are similar and which are different), we instead introduce a new alias.

This is probably a good idea in general, but I think it might be difficult to pin down the specifics. Consider something like this:

1. Let _x_ be <something>.
1. If _x_ is not a Foo, then
    1. Set _x_ to <some Foo value>.
1. (use _x_).

Within the scope of the then, alias _x_ has type not-a-Foo, but then the Set changes it to have type Foo, so that would seem to qualify as a change of type. Now, you could introduce a new alias:

1. If _x_ is a Foo, then
    1. Let _foo_ be _x_.
1. Else,
    1. Let _foo_ be <some Foo value>.

or possibly:

1. Let _foo_ be _x_ if _x_ is a Foo, else <some Foo value>.

and that would be better in some respects, but it might be disruptive to apply everywhere, and I'm not sure it's what you had in mind.

E.g., instead of the question:

Is the type of _x_ after the Set different from its type before?

the question might be:

Is the type of _x_ after the Set different from the type it had initially?

This is especially useful in scenarios where the alias is introduced in the parameter list of an AO and given a type in the summary. A reader who is looking at the declared type would probably expect the alias to have that type at any point in the algorithm.

Yes, the question is easier for parameters, because those have (or will eventually have) a declared 'type' in the preamble and/or an assertion at the start of the algorithm, so you can ask if a Set changes an alias so that it no longer adheres to that 'type'/assertion.

(I'd go so far as to say you simply shouldn't Set parameters, but that too would probably be disruptive.)

Jack-Works commented 3 years ago

I also suggest introducing a new "const" spec variables to inform the reader that this variable never changes. This can reduce mental burden when reading a long algorithm, (for example, use a different color to mark "mutable" spec variables)

jmdyck commented 3 years ago

Can you share some examples where this doesn't already hold?

One example I just noticed is in PR #2007, in BigIntBitwiseOp, where the preamble 'declares' parameter _x_ to be a BigInt, but then step 2 of the algorithm says: 1. Set _x_ to the mathematical value of _x_. which changes the type of _x_ from BigInt to mathematical integer.

michaelficarra commented 3 years ago

Discussed in the editor call today:

We should never assign to parameters, so the declared type will never be violated.
We will audit for instances of type-changing assignments and try to convert to SSA
1. If we find that there is no need for type changes, we will introduce a lint rule
2. If we find that there are valid uses, we will just make as many of the changes as is reasonable

jmdyck commented 3 years ago

1. We should never assign to parameters

I count about 100 violations of this in the current spec.

About 38 are of the form: If _optionalParam_ is not present, set _optionalParam_ to <default value>.

And about 20 are of the form: Set _param_ to Coerce(_param_). where Coerce is

RequireObjectCoercible
ToBigInt
ToBoolean
ToIndex
ToNumber
ToString
ℝ

So you might want to decide how you want to handle those cases before going after the rest (which don't appear to form any broad categories).

waldemarhorwat commented 3 years ago

Here's one that tripped me up for a while:

StatementList : StatementList StatementListItem

1. Let sl be the result of evaluating StatementList.
2. ReturnIfAbrupt(sl).
3. Let s be the result of evaluating StatementListItem.
4. Return Completion(UpdateEmpty(s, sl)).

Here it looks like the [[Value]] slot of the completion record in step 4 is itself a completion record sl. It isn't because the type of sl is quietly changed in line 2 without any visible assignment to sl.

brabalan commented 3 years ago

In our ongoing formalization, we write such cases as

let sl =%returnIfAbrupt sl in
…

which is syntactic sugar for

returnIfAbrupt sl (fun sl => …)

The monadic binder returnIfAbrupt takes a completion record, it then extracts the [[Value]] slot if it is a normal completion to use it as argument of the continuation. It propagates the record otherwise, ignoring the continuation.

(Note that our completion records are polymorphic in their [[Value]] slot.)

tc39 / ecma262

editorial proposal: never change type of aliases #2178