MartinoMensio / spacy-universal-sentence-encoder

Google USE (Universal Sentence Encoder) for spaCy
MIT License

Avoiding OOM errors #24

Closed jimkang closed 1 year ago

jimkang commented 1 year ago

Hi, I installed this via pip to try it out, and I'm getting OOM errors like this:

2022-09-08 16:35:25.958426: W tensorflow/core/framework/op_kernel.cc:1780] OP_REQUIRES failed at matmul_op_impl.h:728 : RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[1,8,70151,70151] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu

Is there a way I can tell spacy to tell tensorflow that it's OK to use more memory?

MartinoMensio commented 1 year ago

Hi @jimkang, when does this error happen? Are you using the vectors downstream in a pipeline that also uses TensorFlow?

I think you can have different types of issues:

- Memory genuinely runs out because the input is far too long. The failing tensor has shape [1,8,70151,70151], which looks like 8-head self-attention over a single input of ~70,000 tokens; in float32 that one tensor alone needs about 8 × 70151² × 4 bytes ≈ 157 GB, so no allocator setting will make it fit.
- TensorFlow is misconfigured for your device, e.g. the GPU allocator pre-allocates or caps memory.

On a reasonably-sized sentence the model should not run out of memory.
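If it turns out to be a configuration issue and you are on a GPU, something along these lines (an untested sketch for TensorFlow 2.x, not specific to this package) lets TensorFlow grow GPU memory on demand instead of pre-allocating a fixed block:

```python
import tensorflow as tf

# Sketch, TensorFlow 2.x: allocate GPU memory incrementally as needed
# instead of reserving it all up front. Must run before any model loads.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```

Note that your traceback says device:CPU:0, so this would not apply in your case: for CPU allocations TensorFlow simply asks the operating system, and the limit is the machine's RAM.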

Best, Martino

jimkang commented 1 year ago

Thanks, Martino! It turned out that memory actually did run out.

After taking another look, I realized that the pipe expects a single sentence rather than an entire document. (Also, strangely, when I call add_pipe I get 'universal_sentence_encoder' already exists in pipeline, so I used replace_pipe to set enable_cache, as sketched below.)
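Something like the following, assuming spaCy 3.x (a sketch of that call; enable_cache is this package's config key):

```python
# Sketch, spaCy 3.x: the factory is already registered, so replace the
# existing component instead of calling add_pipe a second time.
nlp.replace_pipe(
    'universal_sentence_encoder',
    'universal_sentence_encoder',
    config={'enable_cache': False},  # False trades speed for lower memory
)
```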

Thanks for confirming that it shouldn't normally run out of memory on a reasonably-sized sentence!
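In case it helps anyone else hitting this, the fix looks roughly like the following (a sketch, assuming the standalone en_use_md model; big_document.txt is just a placeholder):

```python
import spacy
import spacy_universal_sentence_encoder

# Split the document into sentences with a plain spaCy pipeline first,
# so the encoder never sees one enormous input.
splitter = spacy.blank('en')
splitter.add_pipe('sentencizer')

nlp = spacy_universal_sentence_encoder.load_model('en_use_md')

with open('big_document.txt') as f:  # placeholder path
    text = f.read()

sentences = [sent.text for sent in splitter(text).sents]
# One 512-dimensional USE vector per sentence.
vectors = [doc.vector for doc in nlp.pipe(sentences)]
```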