Improve main cpu loop performance

gfeun commented 4 years ago

Working branch for #18

Benchmark on Demo

baseline: setTimeout(resolve, 0))

Function	Performance change
window.requestAnimationFrame(resolve)	+ ~10% performance on chrome
setZeroTimeout(resolve)	+ ~20% performance on chrome

Perf on Firefox is unchanged, which is quite disappointing

Profiling

I tried to do some profiling to spot differences between Firefox and Chrome

My main discovery is that on Chrome, "Micro tasks" each take about ~20ms to run. On Firefox "Dom Events" (which i think refers to the same thing as Chrome Micro tasks" take about ~100ms so about 5 times more.

This may be a coincidence but i have ~150% simulation time on chrome and ~30% on Firefox, 5 time less.

I have an intuition that the infinite for loop may be causing this. So I'll have a try at refactoring the async execute function to run without using await, but by enqueuing execution task at the end of the function. Like this:

execute() {
  tick();
  // ...
  setZeroTimeout(execute, 0); 
}

I'll see how this goes !

urish commented 4 years ago

Excellent! Thanks for documenting your progress, it is really insightful :)

gfeun commented 4 years ago

My intuition was partly correct, by breaking of the for loop and rescheduling execution of the execute function I successfully improved performance on Firefox by 50% ! (from ~30% to ~80% on my machine)

This seems to indicate that Firefox may add delay when "awaiting". Probably not resuming function execution immediately when the promise resolves. This will mostly be unseen in other web app but here we await a lot so milliseconds adds up.

It doesn't change anything on Chrome however, performance is the same with await or this new method. This is fascinating to me how browser implementations can influence performance even on standard JS features.

It would be good if you could test the branch too, so that I'm sure the performance increase is real.

When checked, if it's ok for you, I can replace the "execute" function body with the "executeNoAsync" one and improve the "ugly" setZeroTimeout implementation

urish commented 4 years ago

My intuition was partly correct, by breaking of the for loop and rescheduling execution of the execute function I successfully improved performance on Firefox by 50% ! (from ~30% to ~80% on my machine)

Wow, that's great!

I will check the branch on my end and report shortly

gfeun commented 4 years ago

The branch is live here if you want to check: http://info2.hackervaillant.eu/

I'm now at 105% on Firefox (Maybe had some background load for the earlier results)

urish commented 4 years ago

So here are my results:

Firefox - around 115% (up from 48%) Chrome - around 170% (up from 140%)

👍 👍👍

gfeun commented 4 years ago

I cleaned up a bit.

I'm not sure the setTimeoutOptimized + handleMessage implementation belongs in the execute.ts file but don't know where else to put it.

I also have this code at the start of the execute function to initialize nextTick on first run

if (this.nextTick === 0) {
  this.nextTick = this.cpu.cycles + this.workUnitCycles;
}

This is the near equivalent of the previous let nextTick = this.cpu.cycles + workUnitCycles;

I wonder if this could be moved in the Runner constructor because this.cpu.cycles = 0 when execution starts. Except if the CPU resumes execution after a suspension, but it doesn't seem that execution suspension (pause) is supported right ?

urish commented 4 years ago

I'm not sure the setTimeoutOptimized + handleMessage implementation belongs in the execute.ts file but don't know where else to put it.

Perhaps we can add a utility call, something like MicrotaskScheduler or so, which will take care of scheduling the next task. Then we can also make the stop() method clean this task scheduler, so the execute method no longer needs to check if it has to stop.

This sort of abstraction will also allow us to adapt to the new main thread scheduling API when it will be ready. We can eventually support different implementations, depending on the run time (different browsers, node, etc.) and the available APIs

I wonder if this could be moved in the Runner constructor because this.cpu.cycles = 0 when execution starts. Except if the CPU resumes execution after a suspension, but it doesn't seem that execution suspension (pause) is supported right ?

Right, not supported at the moment

gfeun commented 4 years ago

I got rid of this.nextTick since we iterate for a fixed amount each time, no need to keep track of that anymore. I'll implement MicrotaskScheduler tomorrow :+1:

urish commented 4 years ago

Lovely, thank you!

One note - the existing code loops for 500,000 clock cycles. The new code that uses i for the loop executes for 500,000 instructions, which can take anything between 500,000 and 2.5m clock cycles. If we want to include #20 , looping for a given number of clock cycles will probably make our life much easier...

gfeun commented 4 years ago

Complete oversight from me over this CPU cycle != instruction. I fixed the loop. It now runs for a fixed amount of clock cycles again.

There is still a possible drift of some (1 to 3 ?) cycles per execution when the last executed instruction on the batch takes several CPU cycles but i guess comparing to 500000 it is not important. Just to keep in mind.

I also implemented the MicroTaskScheduler. I had some problems with this handling in the event handler function. this initially not refering to the class instance as i wanted.

So i found this pattern but i don't know if it is good practice, let me know.

handleMessage = (event: MessageEvent) => {
  // this correctly refers to the class instance
}

Since i'm completely new to Typescript don't hesitate to point bad practices or simply bad code, i'll be happy to correct :slightly_smiling_face:

gfeun commented 4 years ago

I think i implemented all your feedback.

However I'm stuck for the unit test. I tried to take inspiration from your other tests.

What blocks me is that, internally the scheduler uses window.postMessage api which is asynchronous.

So i don't know how to wait for the event listener to be executed.

There may be a solution around mocking window.postMessage but i'm not sure.

import { MicroTaskScheduler, IMicroTaskCallback } from './task-scheduler';

describe('task-scheduler', () => {
  it('should execute task', () => {
    const taskScheduler = new MicroTaskScheduler();
    const task = jest.fn();
    taskScheduler.start();
    taskScheduler.postTask(task);
    // How can i wait until the internal scheduler method handleMessage is called ?
    expect(task).toHaveBeenCalled();
  });
});

urish commented 4 years ago

So one approach would be to mock window.addEvenetListener and window.postMessage, and then fake the call to the event listener (this way you can also asset that parameters the are passed to postMessage, with expect).

Another way would just be to wait using setTimeout(), since we know postMessage is faster. i.e.: await new Promise(resolve => setTimeout(resolve)), just like we originally did in Runner.execute().

gfeun commented 4 years ago

Yeah I came upon this await new Promise(resolve => setTimeout(resolve))

However when i run the test i got this error: TypeError: demo/src/task-scheduler.spec.ts: Emit skipped

Maybe it has to do with some configuration in tsconfig ?

I put the test in the demo src folder under task-scheduler.spec.ts

gfeun commented 4 years ago

I have removed "noEmitOnError": true in tsconfig.json and now have other errors

  ● Test suite failed to run

    TypeScript diagnostics (customize using `[jest-config].globals.ts-jest.diagnostics` option):
    demo/src/task-scheduler.ts:14:7 - error TS2304: Cannot find name 'window'.

    14       window.addEventListener('message', this.handleMessage, true);
             ~~~~~~
    demo/src/task-scheduler.ts:20:5 - error TS2304: Cannot find name 'window'.

    20     window.removeEventListener('message', this.handleMessage, true);
           ~~~~~~
    demo/src/task-scheduler.ts:26:7 - error TS2304: Cannot find name 'window'.

    26       window.postMessage(this.messageName, '*');
             ~~~~~~
    demo/src/task-scheduler.ts:30:27 - error TS2304: Cannot find name 'MessageEvent'.

    30   handleMessage = (event: MessageEvent) => {
                                 ~~~~~~~~~~~~
    demo/src/task-scheduler.ts:31:26 - error TS2304: Cannot find name 'window'.

    31     if (event.source === window && event.data === this.messageName) {

So from what i understand, current tests run in a nodejs environment. Here since i use browser features, i have to find how to run test in a "browser" context. It seems to be done using "jsdom". I'm looking at how i could use that.

gfeun commented 4 years ago

That's what's called a rabbit hole ...

I fixed the previous error messages by adding "dom" lib to the tsconfig.json ("lib": ["es2015", "dom"]) I added a header to specify that this test has to run with with jsdom:

/**
 * @jest-environment jsdom
 */

My test is:

/**
 * @jest-environment jsdom
 */

import { MicroTaskScheduler } from './task-scheduler';

test('should execute task', async () => {
  const taskScheduler = new MicroTaskScheduler();
  const postTaskSpy = jest.spyOn(taskScheduler, 'postTask');
  const handleMessageSpy = jest.spyOn(taskScheduler, 'handleMessage');
  const fn = jest.fn();
  taskScheduler.start();
  taskScheduler.postTask(fn);
  await new Promise((resolve) => setTimeout(resolve, 200));

  expect(postTaskSpy).toHaveBeenCalled();
  expect(handleMessageSpy).toHaveBeenCalled();
  expect(fn).toHaveBeenCalled();
});

So this didn't work because jsdom implementation of "window.postMessage" doesn't fill the event.source (https://github.com/jsdom/jsdom/blob/020539ed3f46720fe526ecf55a3a2d2a889c94b4/lib/jsdom/living/post-message.js#L31)

And in the handleMessage I was checking if the event came from "window" and then if it had the correct message name.

After removing the if (event.source === "window"), the test passes !

On to adding other tests now ...

urish commented 4 years ago

So this didn't work because jsdom implementation of "window.postMessage" doesn't fill the event.source (https://github.com/jsdom/jsdom/blob/020539ed3f46720fe526ecf55a3a2d2a889c94b4/lib/jsdom/living/post-message.js#L31)

yeah, that's why I was thinking that it might make more sense to mock postMessage / addEventListener. But it seems like you eventually go it to work with jsdom, so bravo!

I fixed the previous error messages by adding "dom" lib to the tsconfig.json ("lib": ["es2015", "dom"]) Which tsconfig have you changed? The one that affects the tests in tsconfig.spec.json. We don't want the main tsconfig to assume DOM, to make sure we don't use any DOM constructs in the library itself (so that it can be consumed by both browsers and Node.js, and perhaps also compiled to Web Assembly using Assembly Script down the road).

I think my preferred alternative would be to leave the DOM library out of tsconfig altogether, and to specify in the relevant test file:

/// <reference lib="dom" />

gfeun commented 4 years ago

Ok I understand for the dom lib inclusion. I added the reference line and it works.

However there is still the initial error problem. I had to remove the "noEmitOnError": true from the root tsconfig.json so that i don't have the Emit skipped error happening in the CI at the moment.

I didn't push the modification because it touches the config so if you have an idea on how to properly deal with this... All this configurations are a bit overwhelming :slightly_smiling_face:

urish commented 4 years ago

Alright Glenn, I think I figured out the correct configuration. Can you please rebase on top of my latest commit from master?

gfeun commented 4 years ago

Rebased

The tests are green with the new conf !

urish commented 4 years ago

Hooray! Would you like to squash into one commit before I merge?

You can follow the convention of the other commits with the message: "perf(demo): improve main cpu loop performance"

gfeun commented 4 years ago

Done :slightly_smiling_face:

urish commented 4 years ago

So I guess we can now close #18 as well?

gfeun commented 4 years ago

Yes !

wokwi / avr8js

Improve main cpu loop performance #19

Benchmark on Demo

Profiling