TsingZ0 / PFLlib

37 traditional FL (tFL) or personalized FL (pFL) algorithms, 3 scenarios, and 20 datasets.
GNU General Public License v2.0
1.44k stars 298 forks

Cannot serialize a bytes object larger than 4 GiB #110

Closed ajulyav closed 1 year ago

ajulyav commented 1 year ago

Hello,

I believe that when creating large dataset splits, the user will always hit this error in the "savez_compressed" function. It could be rewritten as:

import pickle

for idx, train_dict in enumerate(train_data):
    # Write each client's split with pickle protocol 4,
    # which supports objects larger than 4 GiB
    with open(train_path + str(idx) + '.npz', 'wb') as f:
        pickle.dump(train_dict, f, protocol=4)

However, this will also require rewriting the functions that open the .npz files during training.

So my suggestion is to adapt the code to handle large datasets.
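For completeness, a matching loader might look like the sketch below. Note this is an assumption about how the rewrite could work, not the repo's actual API: `read_client_data` is a hypothetical name, and since the file written above is a plain pickle file (despite keeping the .npz extension), it must be opened with pickle.load rather than np.load.

```python
import os
import pickle
import tempfile

def read_client_data(train_path, idx):
    # Hypothetical loader matching the pickle-based writer above.
    # The file keeps the .npz extension used elsewhere, but it is a
    # plain pickle file, so it is read back with pickle.load.
    with open(train_path + str(idx) + '.npz', 'rb') as f:
        return pickle.load(f)

# Round-trip demo: write one client's split with protocol 4, read it back.
train_path = os.path.join(tempfile.mkdtemp(), 'train')
train_dict = {'x': list(range(5)), 'y': [0, 1, 0, 1, 0]}
with open(train_path + '0.npz', 'wb') as f:
    pickle.dump(train_dict, f, protocol=4)  # protocol >= 4 lifts the 4 GiB limit
print(read_client_data(train_path, 0) == train_dict)  # → True
```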

Thank you!

yu-4041 commented 1 month ago

How did you solve this problem?