Accenture / AmpliGraph

Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org
Apache License 2.0

Memory allocation/deallocation #149

Closed c0ntradicti0n closed 4 years ago

c0ntradicti0n commented 5 years ago

Description

While training a model, it allocates and deallocates large amounts of memory. Couldn't it reuse this memory and therefore be faster?

(Screenshot Selection_053: process memory usage oscillating during training.)

Steps to Reproduce

Train a larger model with ComplEx.
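A minimal reproduction sketch, assuming the AmpliGraph 1.x API (`ComplEx` from `ampligraph.latent_features`, `load_fb15k_237` from `ampligraph.datasets`); the hyperparameter values below are illustrative, not taken from the report:

```python
from ampligraph.datasets import load_fb15k_237
from ampligraph.latent_features import ComplEx

# Load a reasonably large triple set (any big knowledge graph would do).
X = load_fb15k_237()

model = ComplEx(batches_count=100,   # each batch should match one "cycle" in the memory plot
                epochs=200,
                k=200,               # embedding size; larger k means larger gradient buffers
                eta=10,
                verbose=True)

model.fit(X['train'])                # watch the process memory while this runs
```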

sumitpai commented 4 years ago

Which version of TensorFlow are you using, and is it the CPU or GPU build?

c0ntradicti0n commented 4 years ago

I checked: it's version 1.14, and I'm using the CPU only.
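For anyone checking the same thing, a quick way to confirm the TensorFlow version and whether a GPU is visible (TF 1.x API):

```python
import tensorflow as tf

print(tf.__version__)              # e.g. 1.14.0
print(tf.test.is_gpu_available())  # False when running on CPU only
```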

sumitpai commented 4 years ago

I think you see this behaviour because TensorFlow needs a large amount of memory to store the gradients during backprop, and it releases that memory after the weight updates. Each cycle in the plot most likely corresponds to training one batch of data.

We will run some tests to see whether this is caused by an issue in the data-loading pipeline or is general TensorFlow behaviour on CPU during backprop. I like the suggestion of reusing the memory to speed up the process. We will consider it for future releases and see if we can optimize training on CPU.
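As a rough way to confirm that the saw-tooth pattern lines up with batches, one could sample the process's resident memory in a background thread while `fit()` runs and compare the number of peaks against `batches_count`. This is only a sketch using `psutil` (not part of AmpliGraph); `log_rss`, `samples`, and `stop_event` are hypothetical names, and `model`/`X` refer to the reproduction sketch above:

```python
import threading
import time

import psutil  # third-party package, used here only for memory sampling

def log_rss(samples, stop_event, interval=0.5):
    """Append this process's resident memory (in MB) to `samples` until stopped."""
    proc = psutil.Process()
    while not stop_event.is_set():
        samples.append(proc.memory_info().rss / 1e6)
        time.sleep(interval)

samples, stop_event = [], threading.Event()
monitor = threading.Thread(target=log_rss, args=(samples, stop_event), daemon=True)
monitor.start()

model.fit(X['train'])   # the training call from the reproduction sketch above

stop_event.set()
monitor.join()
print("peak RSS: {:.0f} MB over {} samples".format(max(samples), len(samples)))
```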