wenliangdai / Multimodal-End2end-Sparse

The code repository for NAACL 2021 paper "Multimodal End-to-End Sparse Model for Emotion Recognition".

OverflowError: cannot serialize a bytes object larger than 4 GiB #14

Closed HoaiDuyLe closed 2 years ago

HoaiDuyLe commented 2 years ago

Hi, when I tried to train the baselines (LF_RNN and LF_Transformer) on the MOSEI dataset, I got the error "OverflowError: cannot serialize a bytes object larger than 4 GiB". Did you face this issue? Any suggestions for solving it? Thanks in advance.

SamuelCahyawijaya commented 2 years ago

Hi @HoaiDuyLe, I suspect you are using a single pickle file to save all your data. If so, you can try using pickle with protocol=4:

with open(file_path, 'wb') as f:  # pickle needs binary mode ('wb'), not text mode ('w')
    pickle.dump(d, f, protocol=4)

For your reference, here is the explanation of protocol version 4 from https://docs.python.org/3/library/pickle.html: Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. It is the default protocol starting with Python 3.8.
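As a minimal sketch of the suggestion above, the snippet below writes and reads a pickle file with protocol 4 and checks which protocol a pickle was written with. The helper names `save_pickle` and `load_pickle`, the file name `data.pkl`, and the small `data` dictionary are hypothetical stand-ins and not part of this repository; the real MOSEI data would be far larger.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path, protocol=4):
    """Write obj to path in binary mode with the given pickle protocol.

    Hypothetical helper; protocol 4 (Python 3.4+) supports objects
    larger than 4 GiB, which protocol 3 and earlier do not.
    """
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=protocol)


def load_pickle(path):
    """Read a pickled object back; the protocol is detected automatically."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Small stand-in for the real dataset (assumption, for illustration only).
data = {"features": [[0.1, 0.2], [0.3, 0.4]], "labels": [0, 1]}

path = os.path.join(tempfile.mkdtemp(), "data.pkl")
save_pickle(data, path)
restored = load_pickle(path)

# Pickles written with protocol 2 or later start with the PROTO opcode
# (0x80) followed by one byte holding the protocol number.
header = pickle.dumps(data, protocol=4)[:2]
print("round trip ok:", restored == data)
print("protocol byte:", header[1])
print("default protocol on this interpreter:", pickle.DEFAULT_PROTOCOL)
```

Printing `pickle.DEFAULT_PROTOCOL` is a quick way to see whether the running interpreter already defaults to protocol 4 or later, or whether `protocol=4` has to be passed explicitly.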

Also, please make sure that you are not using a FAT32 file system, which only supports files of up to 4 GB each. Hope it helps!
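To illustrate the file-size point, here is a small sketch that checks whether an existing file is larger than a FAT32 volume could hold, for example before copying a dataset to such a drive. The constant and function names are hypothetical; the limit used is the FAT32 maximum file size of 4 GiB minus one byte.

```python
import os
import tempfile

# FAT32 stores file sizes in 32 bits, so the largest file is 2**32 - 1 bytes.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def exceeds_fat32_limit(path):
    """Return True if the file at path is too large for a FAT32 volume."""
    return os.path.getsize(path) > FAT32_MAX_FILE_BYTES


# Tiny temporary file used only to demonstrate the call.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example")
    small_file = tmp.name

too_big = exceeds_fat32_limit(small_file)
print("exceeds FAT32 limit:", too_big)
os.remove(small_file)
```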

HoaiDuyLe commented 2 years ago

Thank you for your reply. However, I downloaded and used the pre-processed dataset from your hyperlink; I didn't process the dataset myself. By the way, I am using Windows 10, so the file system is NTFS by default.