Closed bogedy closed 9 months ago
Hi, thanks a lot for your interest in the INSTRUCTOR model!
Since the instance has only 24GB of GPU memory, it is likely being overloaded. To save memory, you can reduce the batch size or use a shorter sequence length with the following:
model.max_seq_length = 256
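A minimal sketch of how these two settings might be applied together, assuming the `InstructorEmbedding` package; the instruction string, the `texts` list, and the batch size of 8 are placeholders, not values from this thread:

```python
def chunked(items, size):
    """Yield successive fixed-size slices of a list, so only one
    small batch of inputs is on the GPU at a time."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

if __name__ == "__main__":
    # Heavy import and model download kept under the main guard.
    from InstructorEmbedding import INSTRUCTOR

    model = INSTRUCTOR("hkunlp/instructor-xl")
    model.max_seq_length = 256  # cap tokens per input to bound activation memory

    texts = ["example sentence"]  # placeholder corpus
    pairs = [["Represent the sentence:", t] for t in texts]
    for batch in chunked(pairs, 8):  # small batches bound peak VRAM use
        embeddings = model.encode(batch, batch_size=8)
```

Shorter sequences shrink the attention activations, which for the XL model can dominate memory well beyond the ~5GB of weights.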
Feel free to add any further questions or comments!
I find this odd though: the model's `pytorch_model.bin` is only ~5GB, and my instance crashes even with `BATCH_SIZE=1` and `model.max_seq_length = 64`. Shouldn't I have plenty of memory?
And isn't it curious that it can take my whole instance offline? I'm used to CUDA out-of-memory errors, but not this.
It's strange enough that it's probably my own issue and has nothing to do with this repo, but I thought I should check.
Hi, has the problem been solved? Can you generate the embeddings without batching?
Feel free to re-open this issue if you have any further questions or comments!
This only happens with the XL model; large and smaller seem to work fine.
Here's how I import it and verify that it's working:
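(The original snippet is not preserved in this thread; below is a hedged reconstruction of a typical load-and-verify step with the `InstructorEmbedding` package. The model name and instruction text are assumptions, not the issue author's exact code.)

```python
def to_pairs(instruction, texts):
    """INSTRUCTOR expects each input as an [instruction, text] pair."""
    return [[instruction, t] for t in texts]

if __name__ == "__main__":
    # Heavy import and ~5GB download kept under the main guard.
    from InstructorEmbedding import INSTRUCTOR  # pip install InstructorEmbedding

    model = INSTRUCTOR("hkunlp/instructor-xl")
    emb = model.encode(to_pairs("Represent the sentence:", ["hello world"]))
    print(emb.shape)  # a (1, hidden_dim) array confirms the model responds
```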
Then I batch process using Hugging Face datasets:
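(That snippet is also missing from the thread; the following is one plausible shape of the batching step with `datasets.map(batched=True)`. The column name `"text"`, the instruction, and the batch size are assumptions.)

```python
def make_map_fn(model, instruction):
    """Build a datasets.map function that embeds a batch of rows.

    `model` is anything with an encode(pairs, batch_size=...) method,
    e.g. an INSTRUCTOR instance.
    """
    def embed_batch(batch):
        pairs = [[instruction, t] for t in batch["text"]]
        return {"embedding": model.encode(pairs, batch_size=4)}
    return embed_batch

if __name__ == "__main__":
    # Heavy imports and model download kept under the main guard.
    from datasets import Dataset
    from InstructorEmbedding import INSTRUCTOR

    model = INSTRUCTOR("hkunlp/instructor-xl")
    ds = Dataset.from_dict({"text": ["first sentence", "second sentence"]})
    ds = ds.map(make_map_fn(model, "Represent the sentence:"),
                batched=True, batch_size=4)
```

Keeping `batch_size` small in both `map` and `encode` bounds how many sequences are on the GPU at once.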
This will either crash my ipykernel or, worse, take my entire EC2 instance offline. It seems like this shouldn't be happening: the model needs about 5GB of VRAM, and my g5.xlarge instance has 24GB.
Am I doing the batching correctly? This is the only way I could make it work and make sense of it.
Thanks!