Multiprocessing runs 100X+ slower than single simulation with certain simulations

SeanMcOwen commented 2 weeks ago

This notebook shows a current simulation model that runs very slowly when moving to multiprocessing.

The results are:

No deepcopy and 1 Monte Carlo run: 0.12 seconds
No deepcopy and 5 Monte Carlo runs: 60.22 seconds / 5 = 12.04 seconds
Deepcopy and 1 Monte Carlo run: 2.79 seconds
Deepcopy and 5 Monte Carlo runs: 59 seconds / 5 = 11.90 seconds

As a table (seconds per run):

	Single Proc	Multi Proc
No Deepcopy	0.12	12.04
Deepcopy	2.79	11.90

So on single-process runs, turning off deepcopy gives a large speedup (2.79 s down to 0.12 s), but under multiprocessing each run is massively slower no matter what. Since deepcopy makes almost no difference there (12.04 s vs. 11.90 s), it looks like it doesn't get triggered with multiprocessing, BUT the runs are still much slower because of something else.
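To make the size of the gap concrete, here is a small sketch that computes the per-run slowdown from the four timings reported above. The dictionary layout and the `slowdown` helper are just illustrative names, not part of cadCAD.

```python
# Per-run timings in seconds, copied from the results reported in this issue.
timings = {
    ("no_deepcopy", "single"): 0.12,
    ("no_deepcopy", "multi"): 12.04,
    ("deepcopy", "single"): 2.79,
    ("deepcopy", "multi"): 11.90,
}


def slowdown(mode):
    """Ratio of multi-proc to single-proc time per run for one deepcopy setting."""
    return timings[(mode, "multi")] / timings[(mode, "single")]


slowdowns = {mode: slowdown(mode) for mode in ("no_deepcopy", "deepcopy")}
for mode, factor in slowdowns.items():
    print(f"{mode}: multi-proc is {factor:.1f}x slower per run")
```

With deepcopy off, the multi-proc run is roughly 100x slower per run, which is where the number in the issue title comes from. With deepcopy on it is roughly 4x slower.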

Further context from @danlessa is:

"Regarding multiprocessing, those two threads have some context
2023-12 on [client project]: https://blockscienceteam.slack.com/archives/C05LRRUMGQM/p1703034551508919?thread_ts=1703019991.737059&cid=C05LRRUMGQM
2020-12, on using multiprocessing alternatives: https://blockscienceteam.slack.com/archives/CCYHUBHJ7/p1609220349006200"

The important information from the Slack thread is: "as for the single thread result: this is related to how multi-processing in Python works. Processes cannot share memory directly, and they rely on IPC, which involves serialization of data. This is an expensive operation when dealing with objects generally. cadCAD uses pathos for parallelizing runs, which in turn depends on dill as a serializer. dill is particularly slow when compared to pickle as a serializer, however it can handle pretty much any kind of object, while pickle cannot. This is a problem without an easy and universal way out. Most performance improvements requires constraining use cases in some direction. If you're looking up for 10-100x speed-ups, then investing in an non-deepcopy compatible solution can definitely pay out, as it opens you the possibility of doing some clever hacks (like history erasure, which facilitates the serialization a lot)"
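A rough sketch of why serialization cost grows with the amount of state shipped between processes, and why "history erasure" helps. This is not cadCAD code: `make_history` and `serialize_cost` are hypothetical helpers, and the standard-library `pickle` stands in for `dill` so the example has no dependencies. Absolute timings will differ under `dill`, but the payload scales the same way.

```python
import pickle
import time


def make_history(n_steps, n_vars=50):
    """Build a hypothetical simulation history: one dict of state variables per timestep."""
    return [{f"var_{i}": float(i * t) for i in range(n_vars)} for t in range(n_steps)]


def serialize_cost(obj):
    """Return (payload size in bytes, seconds spent serializing) for one object."""
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return len(payload), time.perf_counter() - start


history = make_history(2_000)

# Shipping the full history vs. only the most recent state ("history erasure").
full_size, full_time = serialize_cost(history)
last_size, last_time = serialize_cost(history[-1:])

print(f"full history: {full_size} bytes in {full_time:.4f} s")
print(f"last state:   {last_size} bytes in {last_time:.4f} s")

# pickle refuses objects such as lambdas, which is the kind of limitation
# that a more general serializer like dill is used to get around.
try:
    pickle.dumps(lambda state: state)
    lambda_ok = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_ok = False

print(f"pickle can serialize a lambda: {lambda_ok}")
```

The payload for the full history is orders of magnitude larger than the payload for the last state alone, so anything that shrinks what crosses the process boundary should cut the IPC overhead accordingly.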

linear[bot] commented 2 weeks ago

CORE-126 Multiprocessing runs 100X+ slower than single simulation with certain simulations

SeanMcOwen commented 1 week ago

cadCAD-org / cadCAD

Multiprocessing runs 100X+ slower than single simulation with certain simulations #365