Open lucacasonato opened 3 years ago
I have discussed with @bnoordhuis, @piscisaureus, and @bartlomieju and we did not find any reasonable solution. V8 does not provide a way to mark code as "internal" for the purposes of automatic microtask checkpoints, and manual microtask checkpoints are not possible because the naive approach does not work as described above, and a more complex approach where we keep track of if there are user code frames on the stack is going to be too slow or too difficult to implement.
The naive approach to fixing this is to just force V8 to run a microtask checkpoint whenever user code that was called by "internal JS" returns execution back to that internal JS. For example, we could microtask checkpoint after every event handler insider a dispatchEvent is called.
That's not correct. Nowhere does the spec say to "run microtasks" as part of dispatchEvent()
.
Instead, the FileReader spec says to "queue a task to fire a progress event named loadstart
". The important bit here is "queue a task": this task is put on a queue, and an event loop continually runs through all tasks. After executing each task, the event loop must perform a microtask checkpoint, in other words: run all microtasks.
It's not so much about distinguishing between "internal JS" and "user-land JS". User-land code can call into internal code, which calls into more user-land code, etc. This has nothing to do with microtasks.
It's really about making sure that each task runs on its own, and that you always run microtasks after each task. It shouldn't matter how many other tasks are still queued after running the "dispatch a loadstart event" task.
So how do we solve this? Instead of naively performing a microtask checkpoint every time some user JS code returns to the internal JS code, we instead perform that checkpoint only if there is no user js code on the stack. If the internal JS code is sandwiched between user JS code, no microtask checkpoint would be performed. The difficulty is figuring out if we are sandwiched or not in a performant way.
While that might work, it's really complicated and it's not even aligned with the specification. The spec doesn't even mention "the JS stack", it just says: "perform a microtask checkpoint after running every queued task". So ideally, Deno needs to support the concepts of a "task", a "task queue" and an "event loop".
In pseudo-code, this would look something like this:
let taskQueue = [];
let microtaskQueue = [];
let runningMicrotasks = false;
function queueTask(callback) {
taskQueue.push(callback);
}
function queueMicrotask(callback) {
microtaskQueue.push(callback);
}
function runEventLoop() {
while (taskQueue.length > 0) {
const task = taskQueue.shift();
task();
runMicrotasks();
}
}
function runMicrotasks() {
if (runningMicrotasks) return;
runningMicrotasks = true;
while (microtaskQueue.length > 0) {
const microtask = microtaskQueue.shift();
microtask();
}
runningMicrotasks = false;
}
(In practice, tasks would be handled by run_event_loop
in Rust and microtasks would be handled by scope.perform_microtask_checkpoint
. No need to re-invent the wheel. 😛)
The spec doesn't even mention "the JS stack", it just says: "perform a microtask checkpoint after running every queued task".
Not in the event loop processing section, but the relevant algorithm is "clean up after running script". In the FileReader
case, the load
and loadend
events are fired from the same event loop task (10.5 in "read operation"), so the only reason microtasks are being run between those events in browsers is because the JS stack has emptied. I don't understand the case with streams as much, but since the streams spec doesn't queue a task in the event loop at any point, using only microtasks, it seems like it's much the same.
That said, a lot of Deno's internal code use promises and async
/await
, with the result that a lot of what the specs say should be tasks are instead microtasks. I put together a prototype of a proper implementation of the event loop based on macrotasks, and that did fix the ordering of event and microtasks in the loadstart
and progress
events – because those are fired on two different tasks in the spec. But it didn't do anything for the load
and loadend
events, or for streams.
It doesn't look like "clean up after running script" applies here, since we're not evaluating new JavaScript code here. Instead, "clean up after running a callback" applies, but that doesn't perform a microtask checkpoint.
But you're right, it's odd that FileReader
fires two events in a single task. I assume that would mean that microtasks will only run after the loadend
event, but that's not what browsers do. Quite confusing... 😅
Indeed, the main problem appears to be Deno using microtasks in places where it should be using regular tasks. I hope some progress can be made on this front. 🙏
It doesn't look like "clean up after running script" applies here, since we're not evaluating new JavaScript code here. Instead, "clean up after running a callback" applies, but that doesn't perform a microtask checkpoint.
But you're right, it's odd that
FileReader
fires two events in a single task. I assume that would mean that microtasks will only run after theloadend
event, but that's not what browsers do. Quite confusing... sweat_smileIndeed, the main problem appears to be Deno using microtasks in places where it should be using regular tasks. I hope some progress can be made on this front. pray
Firing an event ultimately ("inner invoke", step 10) calls Web IDL's "call a user object operation", which prepares and then cleans up both script and callback. I don't really understand why this has to happen, but it's not a browser bug.
Ah, I see. Thanks for clarifying. 🙂
So according to Web IDL: if the JavaScript execution stack is empty after running a user's callback, then it should immediately run microtasks. That also explains why adding two event listeners for the same event type runs microtasks after every event listener, even though they're all running within the same "dispatch an event" task.
That does make things more difficult for Deno, since it's using JavaScript for its internal logic too. I suppose Deno could implement prepare to run script and clean up after running script by incrementing and decrementing a global counter, to keep track of the number of "user code stack frames" we're currently in. Of course, that means updating all of the places where Deno calls into user code... which I assume is a ton of work. 😬 But it's better than stack inspection, right? 😅
In browsers, a microtask checkpoint is performed whenever the JS stack empties. Take the following example:
In all current browsers,
FileReader
is implemented as a native object. That means when you callfilereader.readAsText
, the following happens:filereader.readAsText
, JS calls into native C++ code and queues some tasks (things that should happen on future spins of the event loop) to emit theloadstart
,progress
,load
, andloadend
events in that order. The C++ code returns back into JS.queueMicrotask
call pushes one microtask onto the microtask queue.after start microtask
is printed to the console. The microtask queue is now empty.loadstart event
is logged, and a microtask is queued.loadstart microtask
is printed to the console. The microtask queue is now empty.This is how the browser console looks after executing this:
Now lets see what happens in Deno. We currently also follow the rule that a microtask checkpoint is performed whenever the JS stack empties. In Deno
FileReader
is implemented in JS code:filereader.readAsText
, JS calls into JS code and queues some tasks emit theloadstart
,progress
,load
, andloadend
events in that order. The JS code returns back to the calling JS code.queueMicrotask
call pushes one microtask onto the microtask queue.after start microtask
is printed to the console. The microtask queue is now empty.loadstart event
is logged, and a microtask is queued. At this point there is still something on the JS stack, so a microtask checkpoint is not performed.loadstart microtask
,progress microtask
,load microtask
, andloadend microtask
are printed to the console. The microtask queue is now empty.The console looks like this:
The problem is that for purposes of V8s automatic microtask checkpointing, which happens when the JS queue empties, there is no distinction between JS code that is part of Deno's internal implementation, and user code. For microtask checkpointing to work correctly, this needs to be fixed.
The naive approach to fixing this is to just force V8 to run a microtask checkpoint whenever user code that was called by "internal JS" returns execution back to that internal JS. For example, we could microtask checkpoint after every event handler insider a dispatchEvent is called.
This has a problem though, that I can demonstrate with the below example:
With our naive approach, we would dispatch event foo, call the event handler for foo, which will print "foo event", and then perform a microtask checkpoint which causes "foo microtask" to be printed. The same would happen for the "bar" event:
If you look at what this does in the browser however, you will see the following printed:
Here you can see that our naive approach broke down. When a user calls
dispatchEvent
there is still some JS on the stack. The internal JS code directly calls the event handler. The internal code is sandwiched between two pieces of user code. In this case we do not want to perform a microtask checkpoint when the user JS code returns to the internal JS code, because there is still something on the JS stack (thedispatchEvent
call). Because of this the microtask checkpoint is deferred until after both event handlers have been called, and there is no more JS to be executed.So how do we solve this? Instead of naively performing a microtask checkpoint every time some user JS code returns to the internal JS code, we instead perform that checkpoint only if there is no user js code on the stack. If the internal JS code is sandwiched between user JS code, no microtask checkpoint would be performed. The difficulty is figuring out if we are sandwiched or not in a performant way.