ievapudz closed 9 months ago
@xing-he529, in order to understand why this error occurred, I would need:
After testing, I found that the key factor is protein length. When I tested with a long protein (10,881 residues), I got this error:
[xinghe@T64021 TemStaPro]$ ./temstapro -f ./input/test.fa -d ./ProtTrans/ -e tests/outputs/ --mean-output ./input/test.fa.tsv
2023-05-31 00:08:30.364604: beginning to load the model
2023-05-31 00:08:43.281625: finished loading the model
2023-05-31 00:08:43.281776: beginning to generate embeddings
./temstapro: runtime error generating embedding for Phes|1197G000056 (L=10881). Try lowering batch size. If single sequence processing does not work, you need more vRAM to process your protein.
Portion 1.
0/1: sequences with generated mean embeddings
0/1: sequences with generated per-residue embeddings
0:00:00.935153: time to generate embeddings
./temstapro: no embeddings were generated.
May I ask you to help me fix this error? By the way, the vRAM in my system is 16M.
Such unusually long proteins require more RAM to process. Our guidelines state that in most cases the program can be run with 16 GB of RAM (as I understand, that is exactly the amount you have; correct me if I misunderstood what is meant by "16M").
If possible, I would suggest running the program on a machine with more RAM for a protein of this length.
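Since the failure above is tied to sequence length, one possible workaround is to separate unusually long proteins from the rest of the input before running the program, so that the shorter sequences can still be processed on the current machine. The following is only a minimal sketch in plain Python, not part of TemStaPro: the helper names (`read_fasta`, `split_by_length`, `write_fasta`) and the `MAX_LEN` threshold are hypothetical, and the length that actually fits depends on the memory available.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)

# Hypothetical cutoff chosen for illustration only; it is NOT a documented
# TemStaPro limit. Tune it to the memory of your machine.
MAX_LEN = 5000


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records: Iterable[Record],
                    max_len: int) -> Tuple[List[Record], List[Record]]:
    """Partition records into (within limit, over limit) by sequence length."""
    short, too_long = [], []
    for header, seq in records:
        (short if len(seq) <= max_len else too_long).append((header, seq))
    return short, too_long


def write_fasta(records: Iterable[Record], path: str) -> None:
    """Write records to a FASTA file, one sequence per line."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n{seq}\n")


# Small in-memory demonstration with made-up sequences.
demo_lines = [">a", "MKV", "LLT", ">b", "M" * 12]
demo_short, demo_long = split_by_length(read_fasta(demo_lines), max_len=10)
```

For a real input, something like `with open("./input/test.fa") as fh: short, too_long = split_by_length(read_fasta(fh), MAX_LEN)` followed by two `write_fasta` calls would give one file to run locally and one to move to a machine with more memory.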
@xing-he529 I have moved the comment to another issue with a more relevant title.
Hello,
The program runs successfully on my test dataset (~300 sequences), but I got an error when I used a larger dataset (~20,000 sequences). How can I solve this problem?
The error I got:
By the way, it's a nice program! Congratulations!
Thanks, X
Originally posted by @xing-he529 in https://github.com/ievapudz/TemStaPro/issues/1#issuecomment-1545258920