tensorflow / privacy

Library for training machine learning models with privacy for training data
Apache License 2.0

Slowdown with TF-privacy LSTM on TensorFlow 2.4+ #141

Open zredlined opened 3 years ago

zredlined commented 3 years ago

Hey TF Privacy team, we noticed a significant slowdown when training LSTM and GRU models with TensorFlow Privacy on TF 2.4. It appears to happen only when using the TF Privacy optimizers. Here is an example Gist where, depending on the installed version of TF, training can go from 15 sec/epoch to more than 2 min/epoch with the latest TF release candidate (tensorflow==2.4.0rc1).

Doing some testing, it looks like the slowdown was introduced in between these two tf-nightly builds.

Environment: GCP, running on Tesla V100, 16GB RAM, Ubuntu, 8 vCPU

Recreate the issue with this Gist https://gist.github.com/zredlined/72305ab04670197869e470b232d22ed4
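For anyone who wants to measure the slowdown independently of the Gist, here is a minimal timing sketch. It is not taken from the Gist: `time_epochs` and `slowdown_factor` are hypothetical helper names, and `run_epoch` stands for whatever callable trains a single epoch in your setup (for a Keras model, something like `lambda: model.fit(x, y, epochs=1, verbose=0)`). The idea is to run it once per installed TF version and compare the medians.

```python
import time
from statistics import median
from typing import Callable, List


def time_epochs(run_epoch: Callable[[], None], epochs: int = 3) -> List[float]:
    """Return the wall-clock duration in seconds of each call to run_epoch."""
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        run_epoch()  # hypothetical: trains exactly one epoch
        durations.append(time.perf_counter() - start)
    return durations


def slowdown_factor(baseline_secs: List[float], candidate_secs: List[float]) -> float:
    """Ratio of median epoch times; a value above 1.0 means the candidate is slower."""
    return median(candidate_secs) / median(baseline_secs)


if __name__ == "__main__":
    # Placeholder workload so the sketch runs without TensorFlow installed.
    durations = time_epochs(lambda: sum(range(10_000)), epochs=3)
    print(f"median epoch time: {median(durations):.6f} s")
```

Using the median rather than the mean keeps a slow first epoch (graph tracing, warm-up) from dominating the comparison. With the figures reported above, 15 sec/epoch versus 2 min/epoch corresponds to a factor of 8.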

I think this TensorFlow commit is the culprit: changing _use_new_code() back to return True restores the original speed. https://github.com/tensorflow/tensorflow/commit/73b709743a2eba2c912351e8d3334ef25e174c4b

def _use_new_code():
  # Set to False by the commit above; returning True restores the faster codepath.
  return False

The only reference I can find is in the issue above, for what looks like an internal Google issue. Any help would be hugely appreciated; on most datasets we have tested, slowdowns are 10-20x. Thanks!
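As a stopgap that avoids editing the installed TensorFlow sources, the same local change can be sketched as a temporary runtime override. This is only an illustration: `override_attr` is a hypothetical helper, and `fake_tf_module` is a stand-in object, because this thread does not say which TensorFlow module defines `_use_new_code`. Patching a private function is unsupported and may break between releases.

```python
import contextlib
import types


@contextlib.contextmanager
def override_attr(module, name, replacement):
    """Temporarily replace module.<name>, restoring the original on exit."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        yield
    finally:
        # Restore even if the body raises, so the patch never leaks.
        setattr(module, name, original)


# Stand-in for the real TensorFlow module (assumption: its location is not
# given in this thread, so substitute the actual module object yourself).
fake_tf_module = types.SimpleNamespace(_use_new_code=lambda: False)

with override_attr(fake_tf_module, "_use_new_code", lambda: True):
    inside = fake_tf_module._use_new_code()   # patched value

after = fake_tf_module._use_new_code()        # original value again
print(inside, after)
```

An override like this only takes effect if callers look the function up on the module at call time, and only for code that runs inside the `with` block, so model building and training would both need to happen there.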

zredlined commented 3 years ago

Quick update. We have initial feedback from the TensorFlow team in the issue above that the new codepath was disabled due to an internal issue at Google.

This issue should affect any code running an LSTM with TF-privacy on TF 2.4+. Are there any options with TF-privacy to get past this slowdown, or to re-enable the codepath optionally?

aterzis-google commented 3 years ago

@zredlined can you add me to the thread with the TF team or point me to the GitHub issue?

zredlined commented 3 years ago

@aterzis-google, any feedback or thoughts on our request in https://github.com/tensorflow/tensorflow/issues/44917 to expose _use_new_code() as a user-selectable parameter? This issue should affect anyone using an RNN/LSTM/GRU with TensorFlow Privacy. Thanks!

aterzis-google commented 3 years ago

Seems there's agreement in https://github.com/tensorflow/tensorflow/issues/44917 to make it a user-selectable parameter.
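To make the request concrete, here is one way a hard-coded switch like this could become user-selectable. It is a sketch under stated assumptions, not the mechanism TensorFlow adopted: the function name `use_new_code`, the `override` argument, and the `EXAMPLE_USE_NEW_CODE` environment variable are all hypothetical. The default stays False, matching the current behavior described in this thread.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable name, used only for this illustration.
_ENV_VAR = "EXAMPLE_USE_NEW_CODE"


def use_new_code(override: Optional[bool] = None,
                 environ: Mapping[str, str] = os.environ) -> bool:
    """Decide whether to take the new codepath.

    Precedence: explicit argument, then environment variable, then the
    default of False.
    """
    if override is not None:
        return bool(override)
    raw = environ.get(_ENV_VAR, "")
    return raw.strip().lower() in {"1", "true", "yes"}


if __name__ == "__main__":
    print(use_new_code(environ={}))                # default: disabled
    print(use_new_code(environ={_ENV_VAR: "1"}))   # opted in via environment
    print(use_new_code(override=False, environ={_ENV_VAR: "1"}))  # argument wins
```

Keeping the default off preserves whatever the internal issue required, while users who hit the slowdown can opt in per process or per call.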