tree-sitter / node-tree-sitter

Node.js bindings for tree-sitter
https://www.npmjs.com/package/tree-sitter
MIT License
649 stars 114 forks source link

"TypeError: illegal invocation" when imported in multiple tests #53

Closed rien closed 1 year ago

rien commented 4 years ago

I am having a strange bug when two (or more) test files indirectly use tree-sitter, the second import throws the following error:

yarn run v1.19.1
$ jest
 PASS  src/lib/__test__/one.test.ts
 FAIL  src/lib/__test__/two.test.ts
  ● Test suite failed to run

    TypeError: Illegal invocation

    > 1 | import { default as Parser } from "tree-sitter";
        | ^
      2 |
      3 | export class Tokenizer {
      4 |

      at Object.get [as rootNode] (node_modules/tree-sitter/index.js:20:35)
      at Object.<anonymous> (node_modules/tree-sitter/index.js:16:26)
      at Object.<anonymous> (src/lib/tokenizer.ts:1:1)

Test Suites: 1 failed, 1 passed, 2 total
Tests:       1 passed, 1 total
Snapshots:   0 total
Time:        1.059s
Ran all test suites.
error Command failed with exit code 1.

I have made a minimal reproducible example at https://github.com/rien/node-tree-sitter-bug-minimal-reproducible-example with more info.

Running each test individually (so tree-sitter is only imported once) does not exhibit this behaviour.

I am writing a CLI app using typescript and using jest as testing framework.

Any idea what could be causing this issue?

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the label bug to this issue, with a confidence of 0.91. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

maxbrunsfeld commented 4 years ago

Can you change the minimal example to use plain javascript instead of typescript? What version of Node.js are you using? If you're using >= 12, can you try with Node 10?

maxbrunsfeld commented 4 years ago

Also, thank you for the report and for creating a minimal reproduction case!

rien commented 4 years ago

This is the minimal reproduction with JavaScript: https://github.com/rien/node-tree-sitter-bug-minimal-reproducible-example/

I'm currently using Node 10.17.

rien commented 4 years ago

I can't seem to reproduce this outside Jest. I have configured ava with more than 100 tests re-importing and successfully using tree-sitter, and everything seems to work fine. It looks like Jest could be causing this bug ...

I have made an issue at Jest: https://github.com/facebook/jest/issues/9206, maybe they know where to look for a solution.

It seems I'll switch testing framework then.

maxbrunsfeld commented 4 years ago

Whoops, I mean to comment here, but commented on the jest issue. I think this is a bug in Jest, or the way that you're using Jest. Each source file should be evaluated once, but in your case, Tree-sitter's entry-point file index.js seems to be loaded multiple times.

rien commented 4 years ago

Thank you for looking into this. I'm not familiar with how native libraries should work. If they're not meant to be loaded more than once, then there is definitely something going wrong on Jest's side.

Hocdoc commented 4 years ago

I am struggling with exactly the same issue and hope this will be fixed.

cellog commented 3 years ago

The reason this happens is becuase Jest runs a pool of child processes, and resets the state of each process for a new test. If 2 tests require tree-sitter, then it will fail with the above error. This is a bug in tree-sitter, not jest, as it should not be persisting state between runs.

cellog commented 3 years ago

The solution is to implement Language as a node addon. https://github.com/nodejs/node-addon-api/blob/main/doc/addon.md

Note the description:

Creating add-ons that work correctly when loaded multiple times from the same source package into multiple Node.js threads and/or multiple times into the same Node.js thread requires that all global data they hold be associated with the environment in which they run. It is not safe to store global data in static variables because doing so does not take into account the fact that an add-on may be loaded into multiple threads nor that an add-on may be loaded multiple times into a single thread.

The Napi::Addon class can be used to define an entire add-on. Instances of Napi::Addon subclasses become instances of the add-on, stored safely by Node.js on its various threads and into its various contexts. Thus, any data stored in the instance variables of a Napi::Addon subclass instance are stored safely by Node.js. Functions exposed to JavaScript using Napi::Addon::InstanceMethod and/or Napi::Addon::DefineAddon are instance methods of the Napi::Addon subclass and thus have access to data stored inside the instance.

It explicitly states that it is expected behavior to load the same script multiple times, as Jest is doing here. Thus #52 is definitely a step in the direction to fix this and probably all the other instabilities (crashing on running certain queries in Docker)

cellog commented 3 years ago

more context - I ported everything to N-API and it did not fix this issue. The reason is that the issue is actually in index.js - not the extension. When I modified

/*
 * Tree
 */

const {rootNode, edit} = Tree.prototype;

Object.defineProperty(Tree.prototype, 'rootNode', {
  get() {
    return unmarshalNode(rootNode.call(this), this);
  }
});

to be

/*
 * Tree
 */

const {rootNode, edit} = Tree.prototype;

Object.defineProperty(Tree.prototype, 'rootNode', {
  get() {
    if (!this.input) {
      console.log(this);
    }
    return unmarshalNode(rootNode.call(this), this);
  }
});

I got back Attempted to log "{ walk: [Function (anonymous)] }" from Jest. In the other test, I added console.log(this) and got back a full object.

basically, we can't do this pattern of extending a native module in node in a multi-threaded env.

When I eliminated the prototype overwrite, and instead used this function:

function getRootNode(tree) {
  return unmarshalNode(rootNode.call(tree), tree);
}
const node = jsParser.parse(`
const Parser = require(".");
const Javascript = require("tree-sitter-javascript");
const jsParser = new Parser();
`);
getRootNode(node).toString();

This worked great. To fix will require a breaking change to the API.

cellog commented 3 years ago

I have a fix, PR incoming

FishOrBear commented 2 years ago

provide a simple idea

if (globalThis.__module__exports__) {
    module.exports = globalThis.__module__exports__
}
else {
   //....
       globalThis.__module__exports__ = module.exports;
}
readFile("node_modules\\tree-sitter\\index.js", "utf-8", (err, str) =>
{
    if (err)
    {
        console.error("hack tree-sitter error!", err);
        return;
    }

    if (str.includes("__tree_sitter__module__export__")) return;

    str = `if (globalThis.__tree_sitter__module__export__)
    module.exports = globalThis.__tree_sitter__module__export__
else {
    ${str}
    globalThis.__tree_sitter__module__export__ = module.exports;
}`;
});

readFile("node_modules\\tree-sitter-cpp\\bindings\\node\\index.js", "utf-8", (err, str) =>
{
    if (err)
    {
        console.error("hack tree-sitter-cpp error!", err);
        return;
    }

    if (str.includes("__tree_sitter_cpp__module__export__")) return;

    str = `if (globalThis.__tree_sitter_cpp__module__export__)
    module.exports = globalThis.__tree_sitter_cpp__module__export__
else {
    ${str}
    globalThis.__tree_sitter_cpp__module__export__ = module.exports;
}`;

    writeFile("node_modules\\tree-sitter-cpp\\bindings\\node\\index.js", str, err => { });
});
jdugan1024 commented 1 year ago

Thanks for tree-sitter, it's been a joy to work with thus far.

I ran into this today. Any chance https://github.com/tree-sitter/node-tree-sitter/pull/75 could get merged, please?