facebook / Rapid

The OpenStreetMap editor driven by open data, AI, and supercharged features
https://rapideditor.org
ISC License
524 stars 92 forks source link

Web workers #297

Open bhousel opened 3 years ago

bhousel commented 3 years ago

We can move a bunch of our low priority but main-thread-blocking tasks onto web workers for better performance. I'm planning some work on this for the next version.

bhousel commented 3 years ago

Dropping useful links here

Docs:

webworker perf:

mapbox-gl related worker benchmarks:

serialization benchmarks:

I'm really enjoying working with threads.js and have made some small progress on moving some background GeoJSON processing to a worker:

It's looking like avro / avsc is the frontrunner for if we want to pack and larger messages into a Transferable. Update: this turned out not to be true, not worth a penalty of a few 100 milliseconds just to pack/unpack smaller messages.

bhousel commented 3 years ago

I did some testing with moving some pieces of the coreLocations functionality into a worker. Here's what I learned.

Overview

coreLocations maintains an internal index of all the boundaries/geofences used by iD. This code is used by presets, community index, background imagery, to know where in the world things are valid.

iD uses this code for a lot of things, for example knowing which countries use a VAT ID field, where Aldi Nord vs Aldi Süd operate, and mapping out the Burger King zone of exclusion. These geofences are defined by locationSets and they need to be pieced together soon after iD starts up. Currently, the coreLocations module will merge and resolve these locationSets in the background during idle times. This works ok - but iD stutters a bit for the first few seconds when it starts up.

This work is 1. time consuming and 2. already asynchronous, so that makes it well suited to move into a web worker.

Challenges

Here are some things that make web workers hard to work with.

  1. Workers can't do everything main thread can do. Notably, they can't touch the DOM.
  2. A worker has its own isolated environment - worker threads can't (yet) share memory with the main thread.
    • I initially had hoped to run validators in a worker thread, but without access to context, history, graph, etc, we really can't do much with actual OSM data without big architectural changes.
  3. Workers are message based. That means calling a worker is necessarily asynchronous. Much of the code in iD is written in a synchronous way, and going from sync to async is really complicated. (see What Color Is Your Function?)
  4. More on the messages. Sending data to and from workers is time consuming. There are 2 ways to do it:
    • postMessage and structured cloning - The browser handles sending the message between main and worker threads. This is not free and it does cost some milliseconds of cpu time (which afaict is spread between both main and worker threads)
    • transferrable objects - you can pack a message into an ArrayBuffer and transfer ownership instantly to the other thread (the memory is not accessable from the sending thread anymore). This is also not free, as you need to spend time packing/unpacking the messages.
  5. Not really a problem with the web workers themselves but - I initially expected to add worker functions in the same files as the main functions and export them separately and have esbuild make separate iD.js and worker.js bundles. It turns out esbuild is not great at tree shaking (dead code elimination), so the worker.js file ended up with all the main thread's imported code. It's just easier to put the worker stuff in a separate file for now, otherwise the worker bundle ends up with all of d3, id-sdk and whatever else.

The process

Here's what I did:

  1. Try to move all of coreLocations into a worker. Wrote a thin proxy that passed messages back and forth with postMessage. This seemed to work ok except..
  2. Now calling locationManager.locationsAt() was no longer synchronous. This is the code that determines which locations a preset or field is available. This means preset matching is no longer synchronous and all kinds of stuff throughout the code just won't work, so...
  3. Can we keep locationsAt() synchronous? Only by having a copy of the location index on the main thread...
  4. So if the location index needs to live on on the main thread, all the worker can do is resolve the locationSets into GeoJSON and pass them back to the main thread to get indexed. It's still a win, but...
  5. These GeoJSONs are big so now I have to figure out how to efficiently pack and unpack them as transferrable objects... Or do the work in chunks..

Some benchmarks

These are interesting because iD first merges the core presets (only 9 locationSets), then fetches the NSI presets (15839 locationSets), then does an unnecessary merge (0 locationSets). So we can see how much different sizes of messages cost.

Using postMessage

Screen Shot 2021-10-01 at 5 38 07 PM

Using Transferable Object (and very simple TextEncode/TextDecode pack/unpack)

Screen Shot 2021-10-01 at 5 46 11 PM

TODO: going to try the avro library and see if I can get those main thread unpack times down.. Update:

Using Transferable Object (and Avro encoder/decoder)

Screen Shot 2021-10-05 at 1 20 13 AM ) It looks like the pack/unpack times went up a bit, (though the sizes of the buffers transferred is a lot lower), so Avro looks to not be as much of a win as I had hoped it would be.

Bonkles commented 3 years ago

https://i.gifer.com/UJr.gif