Open pionxzh opened 1 year ago
I assume the issue here is related to ast-types's parsing/handling of Scope/etc?
For my own background knowledge/context, it seems that jscodeshift is basically a wrapper around recast, which relies on ast-types:
jscodeshift is a reference to the wrapper around recast and provides a jQuery-like API to navigate and transform the AST
The transform file can let jscodeshift know with which parser to parse the source files (and features like templates).
To do that, the transform module can export parser, which can either be one of the strings "babel", "babylon", "flow", "ts", or "tsx", or it can be a parser object that is compatible with recast and follows the estree spec.
Recast itself relies heavily on ast-types which defines methods to traverse the AST, access node fields and build new nodes. ast-types wraps every AST node into a path object. Paths contain meta-information and helper methods to process AST nodes.
I wonder if the root issue lies in recast, or ast-types? Potentially related:
I haven't looked closely at the code here and how it's used, but would something like eslint-scope help?
ESLint Scope is the ECMAScript scope analyzer used in ESLint. It is a fork of escope.
Escope (escope) is ECMAScript scope analyzer extracted from esmangle project.
Looking at the linked issue (https://github.com/facebook/jscodeshift/issues/263), it seems that a 'raw' block will lead to scope issues currently.
Example code that breaks:
const x = 42
{
  const x = 47
  console.log(x)
}
console.log(x)
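For reference, this is how the snippet behaves at runtime. The sketch below is the same code, except that the values are collected into an array (the name seen is my own addition) so the result is easy to inspect. The two declarations are separate bindings, so a correct scope analysis has to keep them apart.

```javascript
// Collect the values instead of logging them directly
const seen = [];

const x = 42;
{
  // Block-scoped: shadows the outer x only inside these braces
  const x = 47;
  seen.push(x);
}
// Back in the outer scope, x is still the first binding
seen.push(x);

console.log(seen); // [ 47, 42 ]
```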
Reviewing eslint-scope / escope, it sounds like they may also have similar bugs:
escope
eslint-scope
I was playing around with some PoC code with eslint-scope this morning, which you can see on the following REPL:
Running espree-eslint-scope_2.js with the breaking example code from https://github.com/facebook/jscodeshift/issues/263 (see above) gives me the following output:
$ node espree-eslint-scope_2.js
-= SCOPE INFO =-
Scope: type=global (block.type=Program) block.id?.name=undefined implicit=x,x,console,x,console,x
  Variable: x [
    { type: 'Variable', kind: 'const' },
    { type: 'Variable', kind: 'const' }
  ] References: 4
-= ANALYSIS RESULT =-
{
  "type": "global",
  "implicitGlobals": [
    "x",
    "x",
    "console",
    "x",
    "console",
    "x"
  ],
  "identifiers": [
    {
      "name": "x",
      "types": [
        {
          "type": "Variable",
          "kind": "const"
        },
        {
          "type": "Variable",
          "kind": "const"
        }
      ]
    }
  ],
  "childScopes": []
}
I think this may mean that eslint-scope may also be incorrectly handling the scope of these variables, as I would expect there to be both the global scope and the block scope.
It seems ChatGPT agrees with that:
I've opened a bug for this on eslint-scope, but unless that gets fixed, it doesn't seem like it will be a good solution for the needs here:
Edit: Seems I was wrong about that, see my comment below for updates:
These tools may also be useful for checking how escope handles things:
I just did another test in the REPL, this time just using @babel/parser / @babel/traverse directly (not wrapped with jscodeshift / recast, so not using ast-types):
This seems to have correctly output the 2 scopes as expected:
$ node babel-scopes.js
-= Scopes and Bindings =-
Scope (uid=0, path.type=Program)
  Bindings: [
    "x"
  ]
Scope (uid=1, path.type=BlockStatement)
  Bindings: [
    "x"
  ]
I think this may mean that eslint-scope may also be incorrectly handling the scope of these variables, as I would expect there to be both the global scope and the block scope.
I've opened a bug for this on eslint-scope, but unless that gets fixed, it doesn't seem like it will be a good solution for the needs here:
It seems I was just using eslint-scope wrong, and not passing the ecmaVersion: 6 option to it (Ref)
There are some useful docs links and discoveries in my comment on the above issue:
And I also updated my REPL to fix the code and make it work properly now:
With the very basic minimal example being:
const espree = require('espree');
const eslintScope = require('eslint-scope');

const exampleJSCode = `
const x = 42
{
  const x = 47
  console.log(x)
}
console.log(x)
`

const commonParserOptions = {
  ecmaVersion: 2020,
  sourceType: 'module', // script, module, commonjs
}

// Parse the JavaScript code into an AST with range information
const ast = espree.parse(exampleJSCode, {
  ...commonParserOptions,
  range: true // Include range information
});

// Analyze the scopes in the AST
// See the .analyze options in the source for more details
//   https://github.com/eslint/eslint-scope/blob/957748e7fb741dd23f521af0c124ce6da0848997/lib/index.js#L111-L131
// See the following for more details on the ScopeManager interface:
//   https://eslint.org/docs/latest/extend/scope-manager-interface
//   https://github.com/eslint/eslint-scope/blob/main/lib/scope-manager.js
const scopeManager = eslintScope.analyze(
  ast,
  {
    ...commonParserOptions,
    nodejsScope: false,
  }
);

console.log('ScopeManager.scopes=', scopeManager.scopes)
See the ScopeManager interface docs (Ref, potentially not up to date?) for more info on what objects/functions/etc eslint-scope provides.
Babel and eslint are handling it correctly; the issue is from ast-types. The reason I didn't use them is to keep only one AST tool in use. We can either fix ast-types ourselves and potentially PR the fix back upstream, or use Babel or other tools for scoping. The latter is doable, but it will be hard to maintain, and there will be some performance penalty. I will check some of the references you provided this weekend to see what we can do.
Babel and eslint are handling it correctly; the issue is from ast-types
@pionxzh Yup. I just wanted to test/ensure that the other suggestions I was making actually handled it properly.
The reason I didn't use them is to keep only one AST tool in use.
@pionxzh Yeah, that definitely makes sense. I wasn't suggesting them so much in a 'use multiple AST tools' way, more in an 'if we need to replace the current choice, what are some alternatives that handle it properly?' way.
We can either fix ast-types ourselves, and potentially PR back to the upstream.
@pionxzh nods Part of the reason I was looking at some of the alternatives is that the upstream jscodeshift issue you linked seems to suggest that recast and ast-types are pretty unmaintained; and even that issue hasn't had any meaningful updates in more than a year, which makes me think jscodeshift is also equally unmaintained:
The biggest issue is with recast. This library hasn't really had a lot of maintenance for the last couple of years, and there's something like 150+ issues and 40+ pull requests waiting to be merged. It seems like 80% of the issues that are logged against jscodeshift are actually recast issues. In order to fix the jscodeshift's outstanding issues, either recast itself needs to fix them or jscodeshift will need to adopt/create its own fork of recast to solve them.
What can be said about recast can probably also be said to a lesser degree about ast-types
The biggest challenge facing jscodeshift (other than the documentation that everyone keeps complaining about) is 50-80% of its logged bugs are actually bugs in the recast and ast-types libraries, both of which have a substantial number of bugs and outstanding pull requests, and both of which tend to be updated infrequently. This means there are whole classes of bugs that jscodeshift as a project can't do anything about. At some point, unless something changes with those projects, it's probably going to be necessary to fork both those libraries, put in PR's to upstream back to them, and hope that they'll someday be merged.
Since the last update on this issue was in August 2022, I just wanted to check in and see what the latest state of things is on this?
If jscodeshift is currently tightly tied to a 'rarely maintained' recast / ast-types; without much ability to fix bugs caused by those tools, should we also consider jscodeshift to be in a 'rarely maintained' state? And if so, are there any suggestions as to what the best 'modern alternative' would be?
So my thinking was more in the space of "if the current underlying tools are unmaintained and buggy, what are the 'best modern alternative' choices we can use?"; particularly ones that are quite heavily used by others, and so are likely to stay maintained/updated.
Using Babel or other tools for scoping is doable, but it will be hard to maintain, and there will be some performance penalty.
@pionxzh When you say hard to maintain/performance penalty, do you just mean if we were using multiple AST tools at the same time? Or do you mean by switching to one of those at all?
I will check some references you provided this weekend to see what we can do.
@pionxzh Sounds good :)
I mean it would be hard to maintain with multiple AST tools.
Yup, makes sense. I'd definitely also recommend sticking to 1 for that reason.
I'm curious, what benefits does jscodeshift (with the babylon / babel parser) / recast / ast-types give us compared to just using @babel/parser / @babel/traverse directly?
I asked ChatGPT, and it seemed to suggest that there is a 'higher level API' from jscodeshift; and that recast focuses on maintaining the original formatting/spacing of source code being modified (which, since I believe we pretty print with prettier anyway, is probably irrelevant to the needs of this project)
I'm curious whether that 'higher level API' from jscodeshift actually simplifies things much compared to using babel's libraries directly?
Edit: For my own reference, here are the code locations currently referencing jscodeshift, recast and/or ast-types:
@wakaru/unpack using jscodeshift with babylon parser
@wakaru/unminify using jscodeshift with babylon parser
I also note that there is a @wakaru/ast-utils package that could also potentially be used to centralise some 'leaky abstractions' from these libs if we were to refactor:
And it might also potentially make sense to refactor some of these utils into that ast-utils package as well maybe?
Edit 2: Raised a new issue for the 'refactor into @wakaru/ast-utils' potential mentioned above:
jscodeshift provides a query-like API that makes it easier to find and manipulate specific AST nodes than babel/traverse's traversal API. Adopting another tool (e.g. babel/traverse) basically means rewriting everything, which is something I do not want to spend time on. Currently the scoping issue only happens in some edge cases, which is still acceptable to me. That's why I prefer to "patch" it instead of using a totally different tool.
jscodeshift provides a query-like API that makes it easier to find and manipulate specific AST nodes than babel/traverse's traversal API.
nods that makes sense, and I suspected it was likely going to be something like that, but I wanted to confirm my assumptions (given that the current jscodeshift -> @babel/parser -> recast -> ast-types chain adds a lot of extra libraries/points of potential bugs/etc, particularly given jscodeshift / recast / ast-types are basically unmaintained currently)
I was basically thinking about what the 'main advantages' are that are being gained from each of the AST related libs; as that then makes it easier to think about whether there is a 'better' solution that still meets those needs/constraints.
I think jscodeshift's README provides a quite detailed description of each part of the components. 🤔
I think jscodeshift's README provides a quite detailed description of each part of the components. 🤔
@pionxzh Yeah, in terms of jscodeshift. But I wanted to know what your personal reasoning for wanting to stick with it specifically was, and whether you cared about some of the 'strengths' that came along with the underlying libs (such as recast's focus on not modifying unrelated whitespace/etc; which seems irrelevant for wakaru since it prettifies the end code anyway); which you answered for me above (simpler API + don't want to have to recode existing stuff for a new API)
The current identifier renaming is not 100% accurate. By inspecting the unpacking snapshot, you can tell that some variables were wrongly renamed to export or require during the unpacking process. This is mostly because ast-types is giving us wrong scope information, and it's no longer well-maintained. We need to either patch it or find an alternative.
The best solution would be to fix it upstream. Let's track the progress at https://github.com/facebook/jscodeshift/issues/500
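To make it concrete why renaming depends on correct scope information, here is a toy model of lexical scope lookup. It is purely illustrative: the Scope class and its methods are hypothetical and are not the ast-types (or any other library's) API. Each reference is attributed to the nearest enclosing scope that declares the name, so a rename of one binding only touches its own references.

```javascript
// Toy lexical scope model (hypothetical, for illustration only)
class Scope {
  constructor(type, parent = null) {
    this.type = type;
    this.parent = parent;
    this.bindings = new Map(); // name -> list of reference labels
  }

  declare(name) {
    this.bindings.set(name, []);
  }

  // Walk outwards until a scope that declares `name` is found
  lookup(name) {
    for (let scope = this; scope; scope = scope.parent) {
      if (scope.bindings.has(name)) return scope;
    }
    return null;
  }

  // Record a reference against the binding it actually resolves to
  reference(name, label) {
    const owner = this.lookup(name);
    if (owner) owner.bindings.get(name).push(label);
    return owner;
  }
}

// Model of the breaking example: two separate x bindings
const program = new Scope('Program');
program.declare('x');
const block = new Scope('BlockStatement', program);
block.declare('x');

block.reference('x', 'console.log(x) inside the block');
program.reference('x', 'console.log(x) after the block');

console.log(program.bindings.get('x')); // [ 'console.log(x) after the block' ]
console.log(block.bindings.get('x'));   // [ 'console.log(x) inside the block' ]
```

If the block did not get its own scope, both declarations and both references would collapse into a single binding, and renaming one x would incorrectly rename the other.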