microsoft / tensorflow-directml-plugin

DirectML PluggableDevice plugin for TensorFlow 2
Apache License 2.0

Optimize output allocation for inputs that can be forwarded #262

Closed PatriceVignola closed 2 years ago

PatriceVignola commented 2 years ago

When in-place execution is possible (mostly for element-wise operators), forward one of the inputs to the output if all of the following conditions hold:

  1. The input isn't used by another operator that hasn't been executed yet.
  2. The input has the same data type, shape and size as the output.
  3. For composite operators, the input isn't used more than once. This matters because we write to the shared input/output buffer in a non-deterministic order, so an element can't be relied upon to still hold its original value after its first use.
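
The three conditions above can be sketched as a small selection routine. This is an illustrative Python model only, not the plugin's actual code: `TensorInfo`, its fields, and `find_forwardable_input` are hypothetical names, and it assumes the runtime can report how many other operators still need an input and how many times the current operator reads it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical tensor description; not a type from the plugin."""
    dtype: str
    shape: Tuple[int, ...]
    size_in_bytes: int
    # Assumed bookkeeping: other operators that have not executed yet
    # and still need to read this tensor.
    pending_consumers: int = 0
    # Assumed bookkeeping: how many times the current operator reads it.
    reads_by_this_op: int = 1


def find_forwardable_input(
    inputs: Sequence[TensorInfo],
    output: TensorInfo,
    is_composite: bool,
) -> Optional[int]:
    """Return the index of the first input whose buffer can be reused
    as the output, or None if a fresh output buffer is needed."""
    for index, candidate in enumerate(inputs):
        # Condition 1: no unexecuted operator still depends on this input.
        if candidate.pending_consumers > 0:
            continue

        # Condition 2: data type, shape and size all match the output.
        if (candidate.dtype, candidate.shape, candidate.size_in_bytes) != (
            output.dtype,
            output.shape,
            output.size_in_bytes,
        ):
            continue

        # Condition 3: a composite operator must read the input only once,
        # because writes to the shared buffer land in no guaranteed order.
        if is_composite and candidate.reads_by_this_op > 1:
            continue

        return index

    return None
```

Under these assumptions, a broadcast input such as shape `(1, 3)` feeding a `(2, 3)` output is rejected by the second check, and the caller falls back to a normal output allocation whenever the function returns `None`.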