UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0
14.85k stars 2.44k forks

Does encoding multiple sentences with model.encode have a limit on the number of sentences accepted? #1896

Open ankitas3 opened 1 year ago

ankitas3 commented 1 year ago

The SentenceTransformers documentation says that multiple sentences are also accepted when encoding. Quoting: "Even though we talk about sentence embeddings, you can use it also for shorter phrases as well as for longer texts with multiple sentences." Is there any limit on the length of the array of sentences that is accepted here?

nreimers commented 1 year ago

Depending on the machine, at some point you will run out of memory, as each vector needs ~768*4 bytes of memory (about 3 KB for a 768-dimensional float32 embedding).
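To make the arithmetic concrete, here is a rough sketch of how the memory for the output embeddings grows with the number of sentences. It assumes 768-dimensional float32 vectors, as in the estimate above; the function name `embedding_memory_bytes` is made up for illustration, and the figure covers only the stored embeddings, not the model weights or intermediate activations.

```python
def embedding_memory_bytes(num_sentences: int, dim: int = 768, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold the output embeddings alone.

    Assumes one dense vector per sentence with `dim` values of
    `bytes_per_value` bytes each (4 bytes for float32).
    """
    return num_sentences * dim * bytes_per_value


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 10_000_000):
        gib = embedding_memory_bytes(n) / 1024**3
        print(f"{n:>12,} sentences -> {gib:.2f} GiB")
```

Comparing the result against the free RAM on the machine gives a first idea of how many sentences fit in one call.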

ankitas3 commented 1 year ago

So there's no limit as such, but it has to be handled depending on the available memory. Thanks @nreimers
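One possible way to handle this is to encode the list in fixed-size chunks and process or store each chunk's embeddings before moving on, so that only one chunk of vectors is in memory at a time. The sketch below is an assumption about how this could be done, not something from the library: `encode_in_chunks` and `chunk_size` are hypothetical names, and the encoder is passed in as a plain callable so the example runs without a model.

```python
from typing import Callable, Iterator, List, Sequence


def encode_in_chunks(
    sentences: Sequence[str],
    encode_fn: Callable[[List[str]], Sequence],
    chunk_size: int = 10_000,  # hypothetical default; tune to available memory
) -> Iterator[Sequence]:
    """Yield the embeddings for one chunk of sentences at a time."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(sentences), chunk_size):
        chunk = list(sentences[start:start + chunk_size])
        yield encode_fn(chunk)


if __name__ == "__main__":
    # Stand-in encoder so the sketch runs without downloading a model.
    # With sentence-transformers installed, model.encode could be passed instead:
    #   from sentence_transformers import SentenceTransformer
    #   model = SentenceTransformer("all-mpnet-base-v2")  # example model name
    #   for embeddings in encode_in_chunks(corpus, model.encode): ...
    def fake_encode(chunk: List[str]) -> List[List[float]]:
        return [[float(len(s))] for s in chunk]

    corpus = [f"sentence number {i}" for i in range(25)]
    for i, embeddings in enumerate(encode_in_chunks(corpus, fake_encode, chunk_size=10)):
        print(f"chunk {i}: {len(embeddings)} embeddings")
```

Because the function is a generator, each chunk's embeddings can be written to disk or an index and then discarded before the next chunk is encoded.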

ankitas3 commented 1 year ago

@nreimers Is there an optimal array length, perhaps based on the system's CPU/GPU, that I could use as a rough guide?