nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.55k stars 29.58k forks source link

Worker eval as ES Module #30682

Closed targos closed 1 month ago

targos commented 4 years ago

Unless I missed it, there doesn't seem to be a way of creating a Worker from an ESM source string (with eval: true).

Example that should be possible somehow:

import { Worker } from 'worker_threads';

new Worker(
  `
  import { join } from 'path';
  console.log(join('hello', 'world'));
`,
  { eval: true },
);

Related: https://github.com/nodejs/node/issues/21502

targos commented 4 years ago

Adding execArgv: ['--input-type=module'] to the options doesn't help (and wouldn't be the best UX).

lin7sh commented 4 years ago

Can I use Es6 module in Worker by using new Worker("./foo.js", { type: "module" }) Chrome 80 will enable this feature by default https://bugs.chromium.org/p/chromium/issues/detail?id=680046

addaleax commented 4 years ago

@nodejs/modules-active-members

targos commented 4 years ago

@mko-io if your file would be interpreted as a module when run with node foo.js (in a "type": "module" package, or if the extension is .mjs), yes.

jkrems commented 4 years ago

I hope we'll support type: module for workers. I assume in that case we could treat it as an implicit "also treat the source text as .mjs" (as opposed to supporting different kinds of modules in eval: true):

import { Worker } from 'worker_threads';

new Worker(
  `
  import { join } from 'path';
  console.log(join('hello', 'world'));
`,
  { eval: true, type: 'module' },
);

(Not working code, just as a possible example of what it would look like in node.)

addaleax commented 4 years ago

If we do add a type: 'module' option to the Worker constructor, should that also apply to non-eval code, i.e. be treated like a package.json type: option?

jkrems commented 4 years ago

I assume it should only import instead of require (in terms of loading semantics). Generally speaking we've been hesitant to allow the importer to change the static semantics of the imported resource. import-semantics may also imply that the "filename" is a URL (e.g. support for data: URLs).

Since we don't support bare specifiers for workers and we don't support conditional mappings of effectively "absolute URLs", I assume that type: module is mostly opting into an async worker bootstrap but not much else outside of eval.

guybedford commented 4 years ago

Do workers already support the package.json "type": "module" ok? If so then perhaps we don't need type: 'module' as an option at all actually. The major argument for that would be web compatibility but we already don't have web compatibility for this API as a goal right?

Perhaps eval can be supported via either eval: 'module' or using a data URI with a MIME type corresponding to a module like we already support in the module loader as well?

devsnek commented 4 years ago

I think we should try to match the web worker API, which uses type: classic and type: module. lacking URL.createObjectURL, combining type:module with eval:true makes the most sense to me. The collision with the "type" name in package.json is unfortunate but not a huge deal imo.

guybedford commented 4 years ago

The question is - why though?

devsnek commented 4 years ago

why work towards compat with other runtimes? it's annoying and confusing when there's a difference. we also do try to work in compat where possible, like supporting on{event} properties.

guybedford commented 4 years ago

As mentioned Worker is already not web compatible though as it uses paths not URLs and also has API differences. So since a compatibility layer is needed regardless any attempt towards web compat is just "going through the motions", while also invalidating the work we've done that already ensures module types are well-defined.

GeoffreyBooth commented 4 years ago

I guess one question is, could Worker someday be Web compatible? Like we may be lacking URL.createObjectURL right now, but it seems like something we could conceivably add someday, isn’t it? Since we have data URIs already. Likewise we’ve been batting around the idea of import from http URLs, which isn’t possible now but my understanding is that that was mostly punted for difficulty (working out the mechanics of what allowing that would mean) rather than some philosophical opposition.

Anyway my point is, if theoretically Worker might be web-compatible someday once we work out all the other incompatible pieces around it, then we might as well stay compatible whenever possible now, to avoid breaking changes later when the only parts that are incompatible are the APIs we deliberately designed differently. And I agree with @devsnek, it’s a better UX to have the same (or as similar as possible) API for Worker in both environments, if nothing else to help users remember it.

devsnek commented 4 years ago

Also this compat isn't theoretical, i've been able to run wasm threading demos in node just by creating globals of MessagePort and Worker from worker_threads.

guybedford commented 4 years ago

Here is my best shot at a single worker.mjs file that works in both browsers and Node.js:

const isNode = typeof process === 'object' && process.versions && process.versions.node;

// Give me Top-Level Await!
Promise.resolve()
.then(async () => {
  const workerThreads = isNode && await import('worker_threads');
  const fileURLToPath = isNode && (await import('url')).fileURLToPath;

  const isMainThread = isNode ? workerThreads.isMainThread : typeof WorkerGlobalScope === 'undefined' || self instanceof WorkerGlobalScope === false;

  if (isMainThread) {
    const { Worker } = isNode ? workerThreads : window;

    const worker = new Worker(isNode ? fileURLToPath(import.meta.url) : import.meta.url, { type: 'module' });

    worker.postMessage('hello');

    if (isNode)
      worker.on('message', onMessage);
    else
      worker.addEventListener('message', onMessage);

    function onMessage (e) {
      console.log(isNode ? e : e.data);
    }
  }

  // Worker
  else {
    if (isNode)
      workerThreads.parentPort.on('message', onMessage);
    else
      addEventListener('message', onMessage);

    function onMessage(e) {
      (isNode ? workerThreads.parentPort : self).postMessage((isNode ? e : e.data) + ' world');
    }
  }
});

The story works, but it took me at least half an hour to get all the details right. All in all it needs 9 branches to handle platform checks. Supporting "type": "module" was not even an afterthought in that process.

This is what I mean when I say that arguing for "type": "module" for workers as if it somehow is a compatibility issue at this point, is to ignore those 9 other conditional branches which truly do affect this workflow, while that one does not.

addaleax commented 4 years ago

@guybedford I wouldn’t call it a compatibility issue so much as a familiarity issue – the Worker API was written to match Node’s needs and differences from the Web platform, but it was also intended to be familiar to those who use Workers on the Web.

guybedford commented 4 years ago

@addaleax I am only arguing against the points made by @devsnek and @GeoffreyBooth that we have to use new Worker(..., { "type": "module" }) here in the name of browser compatibility, by clearly trying to show the API differences. I am fully behind the approach of worker_threads in that the browserify implementation of worker_threads would effectively be acting as an adapter for the browser though, exactly based on web workers API being an overall guide.

The reason I'm making this argument is because Node.js modules have a way to know the module type of every single module, which we've carefully worked out. We specially don't allow inline type information for the loader, so I don't think we should make workers an exception here. If we were to support "type": "module" for worker instantiation this introduces a new out-of-band mechanism that further complicates module interpretation, whereas the current techniques of module detection currently work perfectly fine for worker threads.

This is why my suggestion is to rely on the content-type for data URIs like we do in the module loader, or to have something like "eval": "module" specifically not to create the assumption for users that "type": "module" for workers is supported.

devsnek commented 4 years ago

using onmessage and a separate file for the worker (which is the common case on the web) you end up code that i would feel comfortable describing as portable. obviously one can write code that is not compatible (using isMainThread, addEventListener, etc) but it isn't required.

aduh95 commented 4 years ago

Related https://github.com/nodejs/node/pull/34584

dmail commented 3 years ago

I ran into this issue today and there is a "workaround" to be able to execute ESM in worker as explained in https://github.com/nodejs/node/issues/36985#issuecomment-762499232

import { Worker } from "node:worker_threads"

const worker = new Worker(new URL('data:text/javascript,import fs from "fs"'));

However the code inside the data url throws when it contains relative urls.

import { Worker } from 'node:worker_threads'

const worker = new Worker(new URL('data:text/javascript,import "./file.mjs"'));

Ideally I could do something as below:

new Worker(`
  import value from "./file.js";
  console.log(value);
`,
  { eval: true, type: 'module', url: import.meta.url },
);

I understand it's not possible today but is there something I can do with the API offered by Node.js?

It does not seems doable with vm.Module nor Loaders

aduh95 commented 3 years ago

I don't know if that fits your use-case, but you should be able to do the following:


const toDataUrl = js => new URL(`data:text/javascript,${encodeURIComponent(js)}`);
const toAbsoluteUrl = relativeUrl => JSON.stringify(new URL(relativeUrl, import.meta.url));

new Worker(toDataUrl(`
  import value from ${toAbsoluteUrl('./file.js')};
  console.log(value);
`)
);
dmail commented 3 years ago

Converting to absolute urls would work indeed! In my use case the string passed to the worker is dynamic. But I can use babel to transform and resolve all relative import specifiers found in the code 🤔 .

Good enough for me.

Edit: Actually no 😬 . There is several other cases where node throw ERR_INVALID_URL errors

new Worker(toDataUrl(`
  import value from "leftpad";
  console.log(value);
`)
);

☝️ In the code above "leftpad" would be a package declared in my package.json. But as code is executed in the context of a data url, Node.js won't find my package.json.

dmail commented 3 years ago

I guess I could resolve all specifiers with import.meta.resolve but it's experimental so I would be forced to put --experimental-import-meta-resolve in all my node commands using this technic 😬

ljharb commented 3 years ago

It also doesn’t work properly, even with the flag.

foosmithco commented 2 years ago

If ESM is unavoidable, it may be sensible to write the eval'd code to file first before calling the worker.

// pseudo-code
import {writeFile} from 'fs/promises';

await writeFile(
  './module-eval.js', 
 `import something from 'somewhere';
console.log(something.doSomething());
`,
  'utf8'
);

const worker = new Worker('./module-eval.js', {type: 'module'});

Note: This assumes top-level await support.

hax commented 1 year ago

We are now at 2023, and still no {type:"module"} support, and it waste three engineers one hour to struggle and finally I find this issue. 😒

aduh95 commented 1 year ago

And of course those three engineers are going to invest their time to contribute a fix, or they are simply going to waste everyone’s time complaining it’s 2023?

isaacs commented 1 year ago

A lot reasons to complain about it being 2023, to be fair.

RedYetiDev commented 1 month ago
new (require('worker_threads').Worker)(
  `
  import { join } from 'path';
  console.log(join('hello', 'world'));
`,
  { eval: true, type: 'module' },
);
import { Worker } from 'worker_threads';

new Worker(
  `
  import { join } from 'path';
  console.log(join('hello', 'world'));
`,
  { eval: true, type: 'module' },
);

AFAICT both these work, so I've closed this issue. LMK if I'm missing something.

targos commented 1 month ago

I'll have a look. Please don't close my issues unless there's an obvious reason (like a PR that fixes it)

targos commented 1 month ago

It also works without type: 'module'. This is unexpected.

aduh95 commented 1 month ago

It also works without type: 'module'. This is unexpected.

Not really, type: 'module' is completely ignored – it's not an option the constructor supports. The snippet works unless you run it with the --no-experimental-detect-module CLI flag.

targos commented 1 month ago

I know it's ignored. What I find unexpected is that module detection happens here (in the eval path)

aduh95 commented 1 month ago

I don't think it's unexpected (at least I don't find it surprinsing), string inputs being affected was mentioned in the original PR that added the flag (https://github.com/nodejs/node/pull/50096#issue-1932342110). However, what I find surprising is that the snippet is also broken if you pass --input-type=module in the CLI flags:

$ node <<'EOF'                    
import { Worker } from 'worker_threads';

new Worker(
  `
  import { join } from 'path';
  console.log(join('hello', 'world'));
`,
  { eval: true },
);
EOF
hello/world
$ node --input-type=module <<'EOF'                               
import { Worker } from 'worker_threads';

new Worker(
  `
  import { join } from 'path';
  console.log(join('hello', 'world'));
`,
  { eval: true },
);
EOF

node:internal/event_target:1090
  process.nextTick(() => { throw err; });
                           ^
[worker eval]:2
  import { join } from 'path';
  ^^^^^^

SyntaxError: Cannot use import statement outside a module
    at makeContextifyScript (node:internal/vm:185:14)
    at node:internal/process/execution:107:22
    at [worker eval]-wrapper:6:24
    at runScript (node:internal/process/execution:101:62)
    at evalScript (node:internal/process/execution:136:3)
    at MessagePort.<anonymous> (node:internal/main/worker_thread:167:9)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:816:20)
    at MessagePort.<anonymous> (node:internal/per_context/messageport:23:28)

Node.js v23.0.0-pre