google-ai-edge / LiteRT

LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-device AI, now with an expanded vision.
https://ai.google.dev/edge/litert
Apache License 2.0

[RNN] GRU conversion/performance issues on CPU on Windows machines #132

Open pkgoogle opened 21 hours ago

pkgoogle commented 21 hours ago

Original issue: https://github.com/tensorflow/tensorflow/issues/57977. Opened on behalf of @DLumi.

1. System information

2. Code

Please note that the issue is only noticeable on Windows machines (tested on three different PCs). In Colab and on a Linux machine I saw little to no decline in performance. Reproduction notebook: https://colab.research.google.com/drive/1d6E3VjbN57ojDd1X0sfG2KA3x7wTMf5N?usp=sharing
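The actual reproduction is in the linked Colab; as a rough sketch of the kind of conversion being described (model shape, layer sizes, and converter settings here are assumptions, not the reporter's exact code):

```python
import tensorflow as tf

# Hypothetical minimal GRU model; the real model is in the linked Colab.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8)),  # (timesteps, features) -- assumed
    tf.keras.layers.GRU(16),
    tf.keras.layers.Dense(1),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)

# With only the default op set (TFLITE_BUILTINS) the conversion is
# reported to fail. Allowing select TF ops lets conversion succeed,
# at the cost of the CPU slowdown described below.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
tflite_model = converter.convert()
print(f"converted model size: {len(tflite_model)} bytes")
```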

3. Failure after conversion

The model fails to convert with the default operation set. Conversion succeeds with the extended operation set; however, inference on CPU then runs roughly 3x slower.
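The slowdown above can be measured with a simple wall-clock loop over the TFLite interpreter. This is a self-contained sketch, not the reporter's benchmark; the model and iteration count are assumptions:

```python
import time

import numpy as np
import tensorflow as tf

# Hypothetical minimal GRU model, converted with select TF ops enabled
# (the configuration reported to convert successfully but run slowly).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8)),
    tf.keras.layers.GRU(16),
    tf.keras.layers.Dense(1),
])
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
tflite_model = converter.convert()

# Time CPU inference with the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
x = np.random.rand(*inp["shape"]).astype(np.float32)

n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
elapsed = time.perf_counter() - start
avg_ms = elapsed / n_runs * 1e3
print(f"avg CPU latency: {avg_ms:.2f} ms")
```

Running the same loop on Windows and on Linux with identical TF versions is what surfaces the roughly 3x gap described in this issue.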

4. (optional) RNN conversion support

If converting TF RNN to TFLite fused RNN ops, please prefix [RNN] in the title.

5. (optional) Any other info / logs

The conversion error traceback can be seen in the Colab notebook above. The issue is also present in TF 2.9.1, and it occurs on both Intel and AMD CPUs.

gaikwadrahul8 commented 7 hours ago

This issue, originally reported by @DLumi, has been moved to this dedicated LiteRT repository to improve issue tracking and prioritization. To ensure continuity, we have created this new issue on your behalf.

We appreciate your understanding and look forward to your continued involvement.