zredlined opened this issue 4 years ago
Quick update: we have initial feedback from the TensorFlow team in the issue above that the new codepath was disabled due to an internal issue at Google.
This should affect any code running an LSTM with TF Privacy on TF 2.4+. Are there any options with TF Privacy to get past this slowdown, or to re-enable the codepath optionally?
@zredlined can you add me to the thread with the TF team or point me to the GitHub issue?
@aterzis-google - any feedback or thoughts on our request in https://github.com/tensorflow/tensorflow/issues/44917 to add `_use_new_code()` as a user-selectable parameter? This issue should affect anyone using an RNN/LSTM/GRU with TensorFlow Privacy. Thanks!
Seems there's agreement in https://github.com/tensorflow/tensorflow/issues/44917 to make it a user-selectable parameter.
Hey TF Privacy team, we noticed a pretty significant slowdown working with TensorFlow Privacy LSTM and GRU models on TF 2.4. It appears to happen only when using the TF Privacy optimizers. Here is an example Gist where, depending on the version of TF installed, training goes from 15 sec/epoch to 2+ min/epoch with the latest TF release candidate (tensorflow==2.4.0rc1).
From some testing, it looks like the slowdown was introduced between these two tf-nightly builds.
Environment: GCP, Tesla V100, 16 GB RAM, Ubuntu, 8 vCPUs
Recreate the issue with this Gist https://gist.github.com/zredlined/72305ab04670197869e470b232d22ed4
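For context, here is a minimal sketch of the kind of setup the Gist exercises. This is an illustrative reconstruction, not the Gist itself; the dataset, layer sizes, and DP hyperparameters below are placeholders:

```python
import numpy as np
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import DPKerasSGDOptimizer

# Toy sequence-classification data; all shapes and hyperparameters are placeholders.
batch_size = 32
x = np.random.randint(0, 1000, size=(1024, 20))
y = np.random.randint(0, 2, size=(1024,))

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(1000, 64),
    tf.keras.layers.LSTM(64),  # the layer hit by the disabled codepath
    tf.keras.layers.Dense(2),
])

# The DP optimizer; num_microbatches must evenly divide the batch size.
optimizer = DPKerasSGDOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    num_microbatches=batch_size,
    learning_rate=0.15,
)

# TF Privacy needs a per-example (unreduced) loss so it can clip per-example gradients.
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

model.compile(optimizer=optimizer, loss=loss)
model.fit(x, y, batch_size=batch_size, epochs=1)
```

Swapping `DPKerasSGDOptimizer` for a stock `tf.keras.optimizers.SGD` in the same script is what isolates the slowdown to the TF Privacy optimizer.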
I think this TensorFlow commit is the culprit: https://github.com/tensorflow/tensorflow/commit/73b709743a2eba2c912351e8d3334ef25e174c4b. Changing `_use_new_code()` back to return True speeds the code back up. The only reference I can find is in the issue above, to what looks like an internal Google issue. Any help would be hugely appreciated; on most datasets we have tested, the slowdowns are 10-20x. Thanks!
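For anyone who wants to experiment in the meantime, one possible stopgap is to flip that gate back at runtime by monkey-patching the private function. This is an untested sketch against TF 2.4's internal Keras layout (the module path and function name are taken from the commit above); it is unsupported and may break between releases:

```python
import tensorflow as tf  # ensure TF is loaded first

# Unsupported workaround: re-enable the new RNN codepath that the commit above
# disabled, by overriding the private _use_new_code() gate. The module path
# assumes TF 2.4's internal Keras layout. Apply before building/training the model.
from tensorflow.python.keras.layers import recurrent_v2

recurrent_v2._use_new_code = lambda: True
```

If epoch times drop back to the pre-2.4 numbers with this patch applied, that would further confirm `_use_new_code()` as the gate behind the regression.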