autonomio / talos

Hyperparameter Experiments with TensorFlow and Keras
https://autonom.io
MIT License

Memory leak problem (tensorflow backend) #342

Closed Adnane017 closed 5 years ago

Adnane017 commented 5 years ago

Hi,

This is related to issues #206 and #238. I am running a grid search with the Scan() function to find the best parameters for both feedforward and LSTM architectures. For both, Scan() stops writing results to the CSV file after 15-20 iterations. This happens with both the CPU and GPU versions of TensorFlow. The recommendation in #206 was to add 'del keras_model' and 'gc.collect()'. My question is: where do these lines go? Do they go at the start of the model-building function that Scan() calls? If so, how do I get hold of the last Keras model that was built so that I can delete it?

To give you some context, I am using a dataset with about 500k rows. The input dimension for the FNN is 10^4 (I retain the 10^4 most frequent words), and the output has dimension 7. For the LSTM, the input has dimension 20*10^4 (I use the first 20 words of each sentence and no embedding). I also use generators so that only one batch of data is loaded at a time.

If you need more information, please let me know. Thanks!

mikkokotila commented 5 years ago

If you are on a version before v0.6, use Scan( ... clear_tf_session=True ... ); if you are on v0.6 or later, use Scan( ... clear_session=True ... ). That will handle it.
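
For anyone whose scripts run on more than one talos version, here is a minimal sketch of picking the right keyword from a version string. The helper name scan_session_kwargs is hypothetical and not part of talos. The two keyword names and the v0.6 cutoff come from the comment above. The sketch assumes the version string starts with "major.minor".

```python
import re


def scan_session_kwargs(talos_version: str) -> dict:
    """Return the session-clearing keyword argument for Scan().

    Hypothetical helper (not part of talos). The keyword names and the
    v0.6 cutoff follow the maintainer's comment in this thread.
    """
    # Accept forms such as "0.5.0", "v0.6" or "v.0.6".
    match = re.match(r"v?\.?(\d+)\.(\d+)", talos_version.strip())
    if match is None:
        raise ValueError(f"Unrecognised version string: {talos_version!r}")

    major, minor = int(match.group(1)), int(match.group(2))

    # Before v0.6 the flag is clear_tf_session; from v0.6 on, clear_session.
    if (major, minor) < (0, 6):
        return {"clear_tf_session": True}
    return {"clear_session": True}
```

The returned dict can be unpacked into the call next to your other arguments, for example Scan( ... **scan_session_kwargs("0.6.3") ... ).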

I'm closing this. Feel free to open a new ticket if anything else comes up.