Closed Aravinviju closed 5 years ago
I have faced similar issues. The only thing you can do is create tensors that fit in your memory. This error can be solved by reducing the text vector dimension, but that will lower your model's accuracy. You can instead try reducing the batch size, which will not affect the model much and will not throw this error.
For more help, please share your code/model architecture so that one can understand how much memory you are actually allocating to tensors and where the memory usage can be reduced.
It is not actually a model built by me; it is a pre-existing text embedding model from Google called the Transformer architecture (for more info: https://www.learnopencv.com/universal-sentence-encoder/?ck_subscriber_id=272164240). Basically, I use this embedding technique to get the vectors and then use them for clustering. Anyway, I'll try changing the batch size and get back.
Thanks Arav
@Aravinviju I looked at the model. Basically, it is a model that can generate word embeddings given a text as input. I once needed to use word and character embeddings as part of my model too, but I came to know that the task would require a very high-spec PC, which wasn't possible for me.
I also wanted to develop such a model myself in order to learn how it works, but that wasn't possible. So, I used pretrained word and character embeddings and then passed them to a bidirectional LSTM so that the model learns contextual information based on the problem set.
Gensim is the most popular library for this purpose, though I used pymagnitude in my model. Both are very easy to use and can help you out.
BTW, reducing the batch size will definitely work, but the right value still depends on your specs. Use a binary-search-like method to find the largest batch size that will run on your device.
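The binary-search idea above can be sketched as follows. `fits` is a hypothetical predicate (not part of any library) that returns `True` when a trial batch size runs without OOM; in practice you would implement it by wrapping one embedding run in a try/except around the OOM error:

```python
def largest_working_batch_size(fits, lo=1, hi=4096):
    """Binary search for the largest batch size b in [lo, hi] with fits(b) True.

    Assumes fits() is monotone: if a batch size fits in memory,
    every smaller batch size does too. Returns None if nothing fits.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(mid):
            best = mid      # mid works; try something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid OOMs; try something smaller
    return best

# Stand-in predicate for illustration: pretend anything up to 700 fits.
print(largest_working_batch_size(lambda b: b <= 700))  # → 700
```

Each probe costs one trial embedding run, so the search needs only about a dozen runs to cover the whole range.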
Thank you. Keep me updated and I will be here to help.
@ParikhKadam
Thanks a lot for your reply!
Yes indeed, but this particular embedding is better than the basic ones. In the initial phase I tried pre-trained embeddings, both Gensim (for machine learning) and a word embedding model from Google (https://www.cs.york.ac.uk/nlp/extvec/) (for a CNN classification model). The current use case I'm working on requires understanding the meaning of irregular text, for which I needed sentence and paragraph embeddings. This is exactly what Google's DAN and Transformer architectures provide (same link as before), which makes the model understand context better and gives better results too.
On the point you suggested, reducing the batch size: this isn't exactly model training I'm doing, so I wasn't sure how to set a batch size for the embedding. But I still split my dataset, sent the data for embedding in batches, collected the embedding results as lists, and finally joined them into one large file.
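The split-embed-join approach described above can be sketched as a small helper. `embed_fn` is a stand-in for whatever produces embeddings for a list of strings (e.g. one `session.run()` call per batch); the stand-in below just maps each text to its length so the sketch is self-contained:

```python
def embed_in_batches(texts, embed_fn, batch_size=256):
    """Embed texts in fixed-size batches and concatenate the results.

    Only one batch is in flight at a time, so peak memory is bounded
    by the batch size rather than by the whole dataset.
    """
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        results.extend(embed_fn(batch))  # one embedding call per batch
    return results

# Stand-in embedder for illustration: map each text to its length.
fake_embed = lambda batch: [len(t) for t in batch]
print(embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2))  # [1, 2, 3]
```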
In terms of the device, as I stated in the details of the question, it's a GCP instance with a Tesla K80 GPU, for which even 1 GB of text data should be easy enough to process. I was only processing 100 MB of data at a time, but since it converts the text into 512-dimensional vectors, the effective batch size grows correspondingly, and the GPU couldn't handle it.
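For a rough sense of scale: the OOM message in the log below reports an allocation of a single float32 tensor of shape [4096000, 128] inside the Transformer's attention layer. A back-of-envelope calculation shows that this one intermediate tensor alone needs about 2 GiB, which explains why even "small" input can exhaust the K80:

```python
# Memory footprint of the tensor named in the OOM message:
# shape [4096000, 128], dtype float32 (4 bytes per element).
rows, cols, bytes_per_float = 4_096_000, 128, 4
size_gb = rows * cols * bytes_per_float / 1024**3
print(f"{size_gb:.2f} GiB")  # → 1.95 GiB
```

Attention's intermediate tensors scale with sequence length and batch size, so halving the batch roughly halves this allocation.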
Thanks for your help @ParikhKadam. I'll get back if I need anything else! For now I'll close this issue!
Cheers Arav
@Aravinviju Welcome.. Happy to help.
@ParikhKadam, thank you for resolving this query. This issue is closed.
I am getting a ResourceExhaustedError (OOM) when doing an embedding using Google's Transformer architecture, which embeds text into 512-dimensional vectors.
The data I'm trying to embed has 5000 records, which adds up to 40 MB of data. GPU used: Tesla K80 in a GCP instance. CPUs: 4 (15 GB RAM). TensorFlow: tensorflow-gpu (3.0.1)
HERE IS THE CODE SNIPPET:
```python
with tf.Session() as session:
    session.run([tf.global_variables_initializer(), tf.tables_initializer()])
    message_embeddings = session.run(embed(test_cleansed_data))
```
Here is the log:
```
ResourceExhaustedError                    Traceback (most recent call last)
~/anaconda3/envs/cluster_gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1326     try:
-> 1327       return fn(*args)
   1328     except errors.OpError as e:

~/anaconda3/envs/cluster_gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1311       return self._call_tf_sessionrun(
-> 1312           options, feed_dict, fetch_list, target_list, run_metadata)
   1313

~/anaconda3/envs/cluster_gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1419         self._session, options, feed_dict, fetch_list, target_list,
-> 1420         status, run_metadata)
   1421

~/anaconda3/envs/cluster_gpu/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    515             compat.as_text(c_api.TF_Message(self.status.status)),
--> 516             c_api.TF_GetCode(self.status.status))
    517   # Delete the underlying status object from memory otherwise it stays alive

ResourceExhaustedError: OOM when allocating tensor with shape[4096000,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[Node: module_apply_default_5/Encoder_en/Transformer/TransformerEncodeFast/encoder/layer_0/self_attention/multihead_attention/dot_product_attention/Softmax = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
```
During handling of the above exception, another exception occurred:
ResourceExhaustedError Traceback (most recent call last)
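Following the hint in the log, a minimal sketch of enabling per-tensor allocation reporting with the TF 1.x `RunOptions` API (assuming the same `embed` module and `test_cleansed_data` list as in the snippet above; shown as a diagnostic config fragment, not a fix for the OOM itself):

```python
import tensorflow as tf

# TF 1.x: ask the runtime to list allocated tensors when an OOM occurs.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

with tf.Session() as session:
    session.run([tf.global_variables_initializer(), tf.tables_initializer()])
    # Passing options= makes the OOM traceback include allocation info,
    # which shows exactly which tensors were consuming GPU memory.
    message_embeddings = session.run(embed(test_cleansed_data),
                                     options=run_options)
```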