iTwin / imodel-transformer

API for exporting an iModel's parts and also importing them into another iModel
MIT License

Don't wait for targetDb writes to perform sourceDb reads #9

Open MichaelBelousov opened 1 year ago

MichaelBelousov commented 1 year ago

Currently, because IModelDb.elements.getElement, IModelDb.elements.insertElement, and other core methods are synchronous, the transformer spends a lot of time waiting on operations that could run in parallel (reading from one SQLite database and writing to another).

Removing the transformer's reliance on having both the target and source loaded simultaneously may be difficult, but it would allow us to provide an IModelImporter that abstracts over a child_process.fork. The forked process could write to the target asynchronously while the source is read and more transformed elements are queued for writing.

This could yield large performance gains.

MichaelBelousov commented 1 year ago

I have started some experiments on this and am working on pushing up a branch.

MichaelBelousov commented 1 year ago

The experiment is blocked on this: https://github.com/orgs/nodejs/discussions/47395. There doesn't seem to be a proper way to wait for backpressure to release when using Node.js's forked child processes.

Other than that, on the smaller iModels where it worked, I saw a ~20% speedup.

MichaelBelousov commented 1 year ago

I stopped using Node's built-in fork IPC messaging API and started manually setting up the IPC, using the v8 serialization API to send messages containing array buffers between the processes. I see about a 10-15% speedup on a 90MB iModel in initial tests.
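
For readers unfamiliar with this approach, here is a minimal sketch of how v8-serialized messages could be framed over a raw byte channel. It is not the code from the experiment: the 4-byte length prefix, the names encodeFrame and FrameDecoder, and the message shapes in the usage note are all assumptions made for illustration. Only serialize and deserialize from node:v8 are real APIs.

```typescript
import { serialize, deserialize } from "node:v8";

/**
 * Encode one message as a 4-byte big-endian length prefix followed by the
 * v8-serialized payload. The framing scheme is an assumption of this sketch.
 */
export function encodeFrame(message: unknown): Buffer {
  const payload = serialize(message);
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

/**
 * Reassembles frames from a byte stream whose chunk boundaries do not line up
 * with message boundaries (as is normal for pipes and sockets).
 */
export class FrameDecoder {
  private pending: Buffer = Buffer.alloc(0);

  /** Feed one chunk; returns every message that is now complete. */
  public push(chunk: Buffer): unknown[] {
    // Simple but not optimal: re-concatenating on every chunk copies bytes.
    this.pending = Buffer.concat([this.pending, chunk]);
    const messages: unknown[] = [];
    while (this.pending.length >= 4) {
      const size = this.pending.readUInt32BE(0);
      if (this.pending.length < 4 + size) break; // wait for the rest of the frame
      messages.push(deserialize(this.pending.subarray(4, 4 + size)));
      this.pending = this.pending.subarray(4 + size);
    }
    return messages;
  }
}
```

In this sketch the reading process would write encodeFrame({ op: "insertElement", ... }) to the channel, and the writing process would call decoder.push(chunk) on each data event and apply the returned messages in order. Unlike JSON, the v8 format round-trips typed arrays such as Uint8Array directly, which matters for binary geometry payloads.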

However, it gets much worse on big models (tested on 2GB), probably due to memory exhaustion and poor backpressure handling. The transformer needs a refactor so that it can wait on certain operations that are now implicitly async. As is, garbage collection time grows sharply while it accumulates a large number of buffered write calls.
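
To illustrate the kind of waiting the refactor would need, here is a minimal sketch of backpressure handling on a plain Node.js Writable stream. It assumes the channel to the writer process is exposed as a stream (for example a pipe or socket), which is an assumption of this example and not something the transformer does today. The name writeAll is hypothetical.

```typescript
import { once } from "node:events";
import { Writable } from "node:stream";

/**
 * Write every chunk to `target`, pausing whenever the stream reports that its
 * internal buffer is full. Returns how many times the producer had to pause.
 */
export async function writeAll(
  target: Writable,
  chunks: Iterable<Buffer>,
): Promise<number> {
  let pauses = 0;
  for (const chunk of chunks) {
    // write() returns false once buffered bytes reach the highWaterMark.
    if (!target.write(chunk)) {
      pauses++;
      // Stop producing until the consumer has caught up.
      await once(target, "drain");
    }
  }
  return pauses;
}
```

Because the producer stops at each await, the number of queued writes stays bounded by the stream's highWaterMark instead of growing with the size of the iModel. The cost is that the calling code must itself be async, which is the refactor described above.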

MichaelBelousov commented 9 months ago

Another (harder) option is to do everything at a lower level, in SQL, and attach the source database to the target.