Serialization fixes

thalassemia commented 2 years ago

BSON has a hidden maximum serialized size of 2 GB (refer to this). In this PR, we switch to the orjson package, which has no size limit and is generally faster than BSON anyway. Another benefit of orjson is that it natively supports NumPy arrays and types.
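As a rough illustration of where a 2 GB ceiling can come from: the BSON specification stores each document's total length in a signed 32-bit integer at the start of the document. The sketch below assumes that prefix is the source of the limit described above. The helper names are hypothetical and are not part of any BSON library.

```python
import struct

# A signed 32-bit integer tops out at 2**31 - 1 bytes, just under 2 GiB.
INT32_MAX = 2**31 - 1


def pack_length_prefix(size_in_bytes: int) -> bytes:
    """Pack a length as a little-endian signed int32 (hypothetical helper)."""
    return struct.pack("<i", size_in_bytes)


def fits_in_length_prefix(size_in_bytes: int) -> bool:
    """Return True if the length can be represented by the int32 prefix."""
    if size_in_bytes < 0:
        return False
    try:
        pack_length_prefix(size_in_bytes)
    except struct.error:
        # struct refuses values outside the signed 32-bit range.
        return False
    return True
```

Any payload larger than `INT32_MAX` bytes cannot be described by such a prefix, no matter how much memory is available.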

An important caveat is np.str_. orjson raises an error if dictionary keys are of type np.str_ or if the data to serialize contains arrays with np.str_ values. The latter case is handled by a fallback NumPy array serializer, but users must manually ensure that all dictionary keys are Python strings and not NumPy strings.
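One way users could normalize keys before serializing is sketched below. It relies on the fact that np.str_ is a subclass of Python's str, so the check works without importing NumPy. The function name `to_python_str_keys` is hypothetical and is not part of this PR.

```python
from typing import Any


def to_python_str_keys(data: Any) -> Any:
    """Recursively rebuild dicts so every str-like key is a plain Python str."""
    if isinstance(data, dict):
        converted = {}
        for key, value in data.items():
            # A str subclass such as np.str_ passes isinstance(key, str)
            # but is not exactly str; str(key) yields a plain Python str.
            if isinstance(key, str) and type(key) is not str:
                key = str(key)
            converted[key] = to_python_str_keys(value)
        return converted
    if isinstance(data, list):
        return [to_python_str_keys(item) for item in data]
    if isinstance(data, tuple):
        return tuple(to_python_str_keys(item) for item in data)
    # Leaves (numbers, arrays, plain strings, ...) are returned unchanged.
    return data
```

This only touches keys. Arrays holding np.str_ values are left alone, since the PR description says those are covered by the fallback serializer.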

Additionally, this PR makes some tweaks to multiprocessing:

- ParallelProcesses no longer keep a reference to the original Process instance after initializing a separate OS process. This reduces RAM usage.
- When parallelization is enabled, process schemas may be lost in transit. If this happens, the process schema is now retrieved anew when rebuilding the topology view.

By creating this pull request, I agree to the Contributor License Agreement, which is available in CLA.md at the top level of this repository.

thalassemia commented 2 years ago

Previously, pickling and unpickling of processes could get quite convoluted. During pickling, we saved the parameters instance variable (self.parameters). To unpickle, we called the __init__ method with this saved parameters dictionary. This required users to be very careful about the values stored in self.parameters, could be slow (depending on the size of the process and the complexity of its __init__), and could cause unexpected results (like the bug noted above, where process schemas are sometimes lost).
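The difference between the two approaches can be sketched with toy classes. Both classes below are hypothetical stand-ins and are not vivarium-core's actual Process implementation. The first rebuilds itself from its parameters on unpickling, as described above. The second relies on pickle's default behavior, which saves the whole instance dictionary.

```python
import pickle


class RebuiltFromParameters:
    """Toy process that is unpickled by re-running __init__ on its parameters."""

    def __init__(self, parameters=None):
        self.parameters = parameters or {}
        self.schema = None  # assumed to be filled in after construction

    def __reduce__(self):
        # Only the parameters are pickled; everything else is recreated
        # by calling the class (and therefore __init__) again.
        return (self.__class__, (self.parameters,))


class DefaultPickling:
    """Same toy process, but using pickle's default handling of instance state."""

    def __init__(self, parameters=None):
        self.parameters = parameters or {}
        self.schema = None


def round_trip(obj):
    """Serialize and deserialize an object with pickle."""
    return pickle.loads(pickle.dumps(obj))
```

If `schema` is set after construction, a round trip of `RebuiltFromParameters` resets it to `None`, because `__init__` runs again with only the parameters. A round trip of `DefaultPickling` keeps it. In this toy setting, that mirrors how state outside the parameters can get lost in transit.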

In 6c9a660, I reverted our custom process serialization code, which gave a substantial performance improvement and fixed a memory leak. I still have not found the exact source of the leak in the custom serialization code, but I can confirm that removing that code fixes the issue entirely.

thalassemia commented 2 years ago

I've addressed the review comments and added a more informative error message to help users diagnose serialization issues.

vivarium-collective / vivarium-core

Serialization fixes #215