Closed: rothita closed this issue 3 years ago
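Found out what the problem was: very long headers. I solved it by adding a condition in seqvec_embedder.py (line 125), inside the loop `for batch_idx, (sample_id, seq) in enumerate(batch):` that runs over each sequence in the batch: `if len(sample_id) >= (0x7fff * 2 + 1): sample_id = sample_id[0:(0x7fff)]`. In other words, any header of 65535 characters or more is truncated to its first 32767 characters.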
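For anyone who wants to try the same workaround outside the script, here is a minimal standalone sketch of the truncation. The helper name `truncate_sample_id`, the two constants, and the toy `batch` are made up for illustration and are not part of seqvec_embedder.py; only the thresholds are taken from the fix above.

```python
# Thresholds copied from the fix above; the names are hypothetical.
TRIGGER_LEN = 0x7fff * 2 + 1   # 65535: headers at least this long get truncated
MAX_ID_LEN = 0x7fff            # 32767: length kept after truncation


def truncate_sample_id(sample_id: str) -> str:
    """Return the header unchanged unless it reaches TRIGGER_LEN characters."""
    if len(sample_id) >= TRIGGER_LEN:
        return sample_id[:MAX_ID_LEN]
    return sample_id


# Toy batch of (sample_id, seq) pairs, assumed to mirror the loop in the fix.
batch = [("short_header", "MKV"), ("x" * 70000, "MKV")]
cleaned = [(truncate_sample_id(sample_id), seq) for sample_id, seq in batch]

for sample_id, seq in cleaned:
    print(len(sample_id), seq)
```

Note that, exactly as in the original condition, headers between 32768 and 65534 characters are left untouched. Truncated headers may also stop being unique, so it may be worth checking for collisions afterwards.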
Hi, I'm running SeqVec-master/seqvec_embedder.py on some protein data that I have. My DB is split into chunks, and while most chunks were embedded successfully, some jobs failed with the following error:
I don't think it's a memory issue, since I tried splitting those chunks into smaller ones and got the same error. Do you have any idea what is causing the error and how to solve it? I didn't manage to find any helpful solutions online.
Thanks! Itai Roth