JTerwel / late-time_lc_binner


Workers copying memory vs sharing it #14

Open JTerwel opened 2 years ago

JTerwel commented 2 years ago

As the multiprocessing is implemented now, all data is copied each time a new worker starts. Much of this data is fixed and has the same value every time. If I can figure out how to share memory between the parent and the workers (preferably read-only), these copies become unnecessary, which might speed things up a little (assuming some of the avoided copies are of large objects).
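One possible way to avoid resending the fixed data with every task is a pool initializer, which hands the data to each worker once at start-up. This is only a minimal sketch, not what the binner currently does: `_FIXED`, `_init_worker`, `_process`, `run`, the `zeropoint` key and the object names are all hypothetical placeholders.

```python
import multiprocessing as mp

# Hypothetical stand-in for the fixed data; filled in once per worker.
_FIXED = None


def _init_worker(fixed):
    """Runs once in each worker process and stores the fixed data in a global."""
    global _FIXED
    _FIXED = fixed


def _process(name):
    """Hypothetical task: only the small per-task argument is sent each time."""
    return name, _FIXED["zeropoint"] + len(name)


def run(names, fixed):
    """Map _process over names, sending `fixed` once per worker, not per task."""
    with mp.Pool(processes=2, initializer=_init_worker, initargs=(fixed,)) as pool:
        return pool.map(_process, names)


if __name__ == "__main__":
    print(run(["ZTF18aaa", "ZTF19bbbb"], {"zeropoint": 30.0}))
```

Each worker still holds its own copy of the data, so this reduces how often it is transferred rather than truly sharing it. For large arrays, `multiprocessing.shared_memory` would be the next thing to look at.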

Also, workers live as long as the pool unless a shorter lifetime is specified when creating the pool. If they are given a shorter lifetime (e.g. 5 tasks), they are destroyed and replaced once that limit is reached. This ensures all memory is freed correctly and might speed things up a little as well. (Good to play around with at some point to test things.)
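In `multiprocessing.Pool` the worker lifetime is set with the `maxtasksperchild` argument. A small sketch to experiment with it, assuming a single-worker pool so the replacement is easy to see (`_worker_pid` and `pids_used` are hypothetical helper names):

```python
import multiprocessing as mp
import os


def _worker_pid(_):
    """Return the PID of the worker process that handled this task."""
    return os.getpid()


def pids_used(n_tasks, maxtasksperchild=None):
    """Run n_tasks trivial tasks and return the set of worker PIDs that ran them."""
    with mp.Pool(processes=1, maxtasksperchild=maxtasksperchild) as pool:
        # chunksize=1 so every item counts as one task toward the limit.
        return set(pool.map(_worker_pid, range(n_tasks), chunksize=1))


if __name__ == "__main__":
    print("no limit:", len(pids_used(10)), "worker(s)")
    print("limit of 5:", len(pids_used(10, maxtasksperchild=5)), "worker(s)")
```

Without a limit the one worker handles all tasks. With a limit of 5 it should be replaced partway through, so more than one PID shows up. Whether this actually helps memory use or speed here would still need to be measured.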

JTerwel commented 2 years ago

If I understand Python correctly, the first part is wrong. We have a list of lists of object names, and each time one of these lists gets copied to a worker. Such a list contains just object names, which are basically references to the objects themselves (which are stored elsewhere in memory). So as long as nothing inside these objects is changed, there is no harm. I think there should be a safer option available where these objects cannot be changed, but for this program it does not matter.
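A quick way to check the reference behaviour is to compare object identity after copying a list. The sketch below uses a hypothetical `LightCurve` class as a stand-in for the real objects. It also includes a pickle round trip for comparison, since arguments passed through `Pool.map` are pickled on their way to a worker, which produces independent objects rather than references.

```python
import copy
import pickle


class LightCurve:
    """Hypothetical stand-in for the objects held in the lists."""

    def __init__(self, name, flux):
        self.name = name
        self.flux = flux


def shallow_copy_shares_objects(objs):
    """True if copying the list yields a new list that holds the same objects."""
    copied = copy.copy(objs)
    return copied is not objs and all(a is b for a, b in zip(copied, objs))


def pickled_copy_is_independent(obj):
    """True if a pickle round trip yields a separate object with equal content."""
    clone = pickle.loads(pickle.dumps(obj))
    return clone is not obj and clone.flux == obj.flux


if __name__ == "__main__":
    curves = [LightCurve("a", [1.0, 2.0]), LightCurve("b", [3.0])]
    print(shallow_copy_shares_objects(curves))
    print(pickled_copy_is_independent(curves[0]))
```

So within one process a copied list is cheap, while anything sent to a worker as a task argument is copied in full. It is probably worth checking which of the two applies to the lists in question.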

The second part still stands though, so this issue won't be closed for now.