tunib-ai / parallelformers

Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
https://tunib-ai.github.io/parallelformers
Apache License 2.0

Speed up results serialization #46

Open mkardas opened 1 year ago

mkardas commented 1 year ago

Describe a requested feature

I was running some performance tests and noticed that checking whether an object is picklable: https://github.com/tunib-ai/parallelformers/blob/ccaea515ee2e4d7540f2a275f6cdb0c33a7780f0/parallelformers/parallel/process.py#L209 takes a lot of time when the output is big (e.g., when a model returns a large logits tensor), because the whole object is serialized into memory and then deserialized. I wonder in which cases check_pickable actually helps, since dataclasses and ModelOutput should be just as picklable as their dictionary representations.
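To make the cost concrete, here is a minimal sketch of a round-trip picklability check of the kind described above. `is_picklable_roundtrip` and `big_output` are hypothetical names used only for illustration (this is not the parallelformers code), and a plain list of floats stands in for a large logits tensor.

```python
import pickle
import time


def is_picklable_roundtrip(obj):
    """Hypothetical stand-in for the check: serialize, then deserialize."""
    try:
        # Both steps touch the entire object, so the cost grows with its size.
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        return False


# Stand-in for a large model output: one million floats in a plain list.
big_output = {"logits": [0.0] * 1_000_000}

start = time.perf_counter()
ok = is_picklable_roundtrip(big_output)
elapsed = time.perf_counter() - start
print(f"picklable={ok}, check took {elapsed:.3f}s")
```

The time spent here is pure overhead whenever the object turns out to be picklable, since it still has to be serialized again to be sent to the other process.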

If the check is still needed, I guess the code could still be sped up by modifying an object only on pickle failure. That would require some workarounds (perhaps overriding https://github.com/python/cpython/blob/9dc787ea96916552695e79397588fdfa68f22024/Lib/multiprocessing/queues.py#L275), so I want to make sure the check is still necessary before giving it a shot. Another option is to always check for https://github.com/tunib-ai/parallelformers/blob/ccaea515ee2e4d7540f2a275f6cdb0c33a7780f0/parallelformers/parallel/process.py#L236-L239 and modify the object even if it's picklable, but that would remove custom fields added outside the definition of the given class.
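A rough sketch of the "modify only on pickle failure" idea, written against plain `pickle` rather than the multiprocessing queue. `dumps_with_fallback` and `drop_callables` are hypothetical helpers, and `SimpleNamespace` stands in for a model output object. Hooking this into the queue's own serialization is the part that would need the workaround mentioned above and is not shown.

```python
import pickle
from types import SimpleNamespace


def dumps_with_fallback(obj, to_plain):
    """Serialize once; convert the object only if that attempt fails."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Slow path: pay for the conversion only when pickling actually failed.
        return pickle.dumps(to_plain(obj))


def drop_callables(ns):
    """Hypothetical conversion: keep only the non-callable attributes, as a dict."""
    return {k: v for k, v in vars(ns).items() if not callable(v)}


good = SimpleNamespace(logits=[0.1, 0.9])
bad = SimpleNamespace(logits=[0.1, 0.9], hook=lambda x: x)  # lambdas do not pickle

restored_good = pickle.loads(dumps_with_fallback(good, drop_callables))
restored_bad = pickle.loads(dumps_with_fallback(bad, drop_callables))
```

With this shape, a picklable object is serialized exactly once and keeps its type and any extra fields, while an unpicklable one falls back to the converted representation.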