Error while trying to embed data

Hi, I'm running SeqVec-master/seqvec_embedder.py on some protein data of mine. My database is split into chunks, and while most chunks were embedded successfully, some jobs failed with the following error:

Traceback (most recent call last):
  File "/home/seqvec/SeqVec//lib/SeqVec-master/seqvec_embedder.py", line 258, in <module>
    main()
  File "/home/seqvec/SeqVec//lib/SeqVec-master/seqvec_embedder.py", line 254, in main
    cpu_flag, max_chars, per_prot, verbose )
  File "/home/seqvec/SeqVec//lib/SeqVec-master/seqvec_embedder.py", line 168, in get_embeddings
    np.savez( emb_path, *emb_dict)
  File "<__array_function__ internals>", line 6, in savez
  File "/home/seqvec/SeqVec/env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 616, in savez
    _savez(file, args, kwds, False)
  File "/home/seqvec/SeqVec/env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 720, in _savez
    with zipf.open(fname, 'w', force_zip64=True) as fid:
  File "/home/software/anaconda3/lib/python3.6/zipfile.py", line 1355, in open
    return self._open_to_write(zinfo, force_zip64=force_zip64)
  File "/home/software/anaconda3/lib/python3.6/zipfile.py", line 1468, in _open_to_write
    self.fp.write(zinfo.FileHeader(zip64))
  File "/home/software/anaconda3/lib/python3.6/zipfile.py", line 427, in FileHeader
    len(filename), len(extra))
struct.error: ushort format requires 0 <= number <= (0x7fff * 2 + 1)

I don't think it's a memory issue, since I tried splitting those chunks into smaller ones and got the same error. Do you have any idea what is causing the error and how to solve it? I couldn't find any helpful solutions online.
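One thing that may be worth checking, going only by the last frame of the traceback: the failure happens while `len(filename)` is packed into a 16-bit field of the ZIP file header, and the ZIP format cannot store a member name longer than 65535 bytes. Since smaller chunks fail the same way, the problem could be tied to individual entries rather than to chunk size. The sketch below is a hypothetical diagnostic, not part of SeqVec: `find_overlong_keys` and `ZIP_MAX_NAME_BYTES` are names made up for illustration, and it assumes the dictionary keys end up as archive member names with `.npy` appended (which is what `np.savez` does for keyword arguments).

```python
# ZIP headers store the member-name length in an unsigned 16-bit field.
ZIP_MAX_NAME_BYTES = 0xFFFF


def find_overlong_keys(emb_dict, suffix=".npy"):
    """Return (key prefix, byte length) for keys that exceed the ZIP limit.

    Assumption: each key is used as a member name with `suffix` appended.
    """
    overlong = []
    for key in emb_dict:
        # Length of the name as it would be written into the archive.
        name_bytes = len((str(key) + suffix).encode("utf-8"))
        if name_bytes > ZIP_MAX_NAME_BYTES:
            # Keep only a short prefix so the report stays readable.
            overlong.append((str(key)[:60], name_bytes))
    return overlong


if __name__ == "__main__":
    # Hypothetical data: one ordinary identifier and one far too long.
    demo = {"sp|P12345|EXAMPLE": None, "X" * 70000: None}
    for prefix, size in find_overlong_keys(demo):
        print(f"{prefix}... -> {size} bytes")
```

If this reports any keys for a failing chunk, the corresponding input entries (for example, a header line that accidentally contains sequence data) would be the first place to look. If it reports nothing, this guess is wrong and the cause lies elsewhere.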

Thanks! Itai Roth

Rostlab / SeqVec, issue #21