Open rzvxa opened 2 months ago
Since you are heavily invested in the scopes right now, @Dunqing What do you think?
Funny, I was wondering how we handle arguments
just this morning. Sounds like we don't.
I can see 3 options:
arguments
binding to all function scopes eagerly.arguments
binding to function scopes lazily, only once a reference to it is found in function body (or params).arguments
so IdentifierReference
for it would have no SymbolId
.Personally, I would tend towards (2) - simple, and avoids perf hit of (1).
NB: arguments
is almost like an invisible function param, but there is one oddity in sloppy mode:
function f1(x) {
// SyntaxError: Identifier 'x' has already been declared
let x;
}
function f2(arguments) {
// SyntaxError: Identifier 'arguments' has already been declared
let arguments;
}
function f3() {
// No problem
let arguments;
}
Also, you can have 2 bindings for arguments
in a function:
function f(x, y = arguments) {
let arguments = 456;
console.log(y[0]); // Logs 123
console.log(arguments); // Logs 456
}
f(123);
So we'd need an extra scope for function body.
I think the option 3
isn't a really good approach here. I'm torn between 1
and 2
.
The latter seems like a good thing for performance but I'm afraid it would cause issues later on because javascript is whimsical at times:
function f() {
eval("console.log(arguments)");
}
f("a", "b", "c")
With the second approach, we still have the same issue of shadowing arguments and generating the wrong code!
He had to go and bring eval()
into it... 😄
But even then, I can't actually see how that can go wrong. eval()
will (I assume) prevent mangling names of all bindings which it can "see". And we can just ban creating new bindings called arguments
in the transformer (that's illegal in strict mode anyway).
He had to go and bring
eval()
into it... 😄
😆
But even then, I can't actually see how that can go wrong.
eval()
will (I assume) prevent mangling names of all bindings which it can "see". And we can just ban creating new bindings calledarguments
in the transformer (that's illegal in strict mode anyway).
You are right that we shouldn't ever mangle arguments
if it isn't user-defined. I'm just saying since arguments are a context-sensitive identifier and javascript allows for weird ways to access variables through non-conventional ways it might cause issues further down the road.
Maybe no sane person uses new Function
or eval
in the production code, But javascript frameworks might generate such code and expect us to bundle it correctly for them.
I might be overthinking this.
We should add an argument
symbol to the function scope. Both TypeScript
and typescript-eslint
do the same thing
What do you think of this example?
// Sloppy mode
function f(x, y = arguments) {
let arguments = 456;
console.log(y[0]); // Logs 123
console.log(arguments); // Logs 456
}
f(123);
This demonstrates there can be 2 distinct bindings for arguments
in a function. arguments
in y = arguments
references a different binding from let arguments = 456
.
I was reading the eslint-scope tests today when saw an oddity I couldn't figure.
It contains arguments
in the global scope. Your comment made me go and look it up, Turns out they always fake arguments
in all non-function scopes. Here's my educated guess about how it is done there:
close
on the referencer
.super.__close
at the end of global scope closing.arguments
would be registered as implicit/builtin/global/dangling identifier.With these steps, your code would work similarly to this:
let arguments = [123];
function f(x, y = arguments) {
let arguments = 456;
console.log(y[0]); // Logs 123
console.log(arguments); // Logs 456
}
f(123);
Treating arguments
as external to the function just seems weird. And it can't handle the case where there's another arguments
binding in scope above e.g.:
// Sloppy mode
let arguments;
function f(x, y = arguments) {
let arguments = 456;
}
Here we have 3 bindings for arguments
, and only 2 scopes to put them in. I know this is an edge case, and will be very rare in practice, but Oxc aims for absolute correctness.
Only solutions I can see are:
arguments
in function params.arguments
as a special case and don't create a binding for it.(1) is simpler and more conceptually correct, but imposes quite a large penalty. Every function generates 2 scopes, and functions are a common construct, so it will slow down symbol resolution significantly. (3) has been rejected. So I would lean towards (2).
I personally would prefer to find a way to avoid unnecessary scopes, Can't we declare arguments as a hoisted variable on the function scope? with that, we should be able to redeclare it in the function body.
But that's just my preference and I have nothing against your approach here, It just feels wrong to add scopes left and right as a means to simplify stuff. As you said correctness is more important. I'm not a huge fan of premature optimization.
I just have a quick question. We are wrapping the whole function in the scope right now. Is this going to change? Is the scope going to be entered after the parameters?
To clarify, what I'm saying about option (2) above is:
arguments
to that scope.arguments
is referenced in the function's params, create a further scope for function body. This will be very rare.i.e. Handle the weird uncommon edge case as a de-opt. No overhead from extra scope in common case, but maintain complete correctness.
To clarify, what I'm saying about option (2) above is:
* By default, only create 1 scope for the function (as now). * Add binding for `arguments` to that scope. * _Only if_ `arguments` is referenced in the function's params, create a further scope for function body. This will be very rare.
i.e. Handle the weird uncommon edge case as a de-opt. No overhead from extra scope in common case, but maintain complete correctness.
Actually, it's not as bad as I initially thought. I'm just afraid of something like this.
When you are accessing the Ancestors
of a scope you have to check if that scope is a legit or a fake one to prevent issues in places when we want to get the "actual" parent with something like scope.ancestors(xxx).nth(1)
or scope.get_parent_id(xxx)
.
It also means that there are now some really obscure edge cases with arguments
that won't get tested, It is not just about parent scope, With this approach, you'd have some identifiers that are pointing to the parent of a function's body but aren't the actual scope of the enclosing node. I feel it would cause a lot of confusion and heisenbugs for both us(especially contributions that don't get reviews by people who are aware of this) and our crate users.
What are your thoughts on this? Even if we do all of these checks doesn't it mean we won't get much performance out of this in the long run(as we start to use scope more and more)?
I've checked the spec and even there, They only create a new lexical environment if strict is false to prevent eval
from creating a new binding. Otherwise, they would always use the lexical environment of the function's callee. I know these aren't apple to apple comparisons as environments are a runtime concept and aren't 1:1 with scopes. But it might help.
But maybe I'm missing something here.
https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-functiondeclarationinstantiation https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-arguments-exotic-objects
I found out how TypeScript handles arguments
. TypeScript
adds an arguments
symbol but does not add it in scopes. https://github.com/microsoft/TypeScript/blob/d8086f14b6b97c0df34a0cc2f56d4b5926a0c299/src/compiler/utilities.ts#L11361-L11380
I found out how TypeScript handles
arguments
.TypeScript
adds anarguments
symbol but does not add it in scopes. https://github.com/microsoft/TypeScript/blob/d8086f14b6b97c0df34a0cc2f56d4b5926a0c299/src/compiler/utilities.ts#L11361-L11380
Nice find. I also see tsc defines
globals: SymbolTable;
argumentsSymbol: Symbol;
requireSymbol: Symbol;
TypeScript
adds anarguments
symbol but does not add it in scopes.
TS may do it, but it seems like a bad idea to me. Rather confusing, no?
For the record, I added this to the mangler
arguments
can be renamed if it's a local const
, let
, or function declaration declared in function scope.
// Sloppy mode
function outer() {
let arguments = 123;
console.log(arguments);
}
function outer2() {
function arguments() {}
console.log(arguments);
}
But not if it's a var
, or hoisted function declaration:
// Sloppy mode
function outer() {
console.log(arguments); // Logs [1, 2, 3]
var arguments = 456;
console.log(arguments); // Logs 456
}
outer(1, 2, 3);
function outer2() {
console.log(arguments); // Logs [1, 2, 3]
{
function arguments() {}
}
console.log(typeof arguments); // Logs 'function'
}
outer2(1, 2, 3);
Personally I'm in favour of adding scopes for function bodies anyway for other reasons (#5017), so that'd resolve some of the questions above.
Related to this: https://github.com/oxc-project/oxc/issues/3374#issuecomment-2223483123
We do not keep track the arguments object in functions, It causes false positives in linters as of now and it can impose more serious limitations or errors in other crates(transformers and minifier come to mind as we might shadow the function arguments by mistake).