Adjust strategy in TensorFlow code to allow for more than 2 GPUs

Switch Strategy - to cross_device_ops - working for more than 2 GPUs

On 4 L4s or 3 RTX-4500/4500/4000

https://github.com/tensorflow/tensorflow/issues/41724#issuecomment-665996179

strategy = tf.distribute.MirroredStrategy(cross_device_ops=tf.distribute.ReductionToOneDevice())
parallel_model.fit(x_train, y_train, epochs=25, batch_size=2048)

|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L4                      Off | 00000000:00:03.0 Off |                    0 |
| N/A   80C    P0              62W /  72W |  21002MiB / 23034MiB |     58%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA L4                      Off | 00000000:00:04.0 Off |                    0 |
| N/A   78C    P0              67W /  72W |  20994MiB / 23034MiB |     46%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA L4                      Off | 00000000:00:05.0 Off |                    0 |
| N/A   76C    P0              67W /  72W |  20998MiB / 23034MiB |     55%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA L4                      Off | 00000000:00:06.0 Off |                    0 |
| N/A   75C    P0              51W /  72W |  21002MiB / 23034MiB |     55%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     40306      C   python                                    20990MiB |
|    1   N/A  N/A     40306      C   python                                    20982MiB |
|    2   N/A  N/A     40306      C   python                                    20986MiB |
|    3   N/A  N/A     40306      C   python                                    20990MiB |
+---------------------------------------------------------------------------------------+

Epoch 24/25
25/25 [==============================] - 3s 105ms/step - loss: 0.2089 - accuracy: 0.9445
Epoch 25/25
25/25 [==============================] - 3s 105ms/step - loss: 0.1559 - accuracy: 0.9592

ObrienlabsDev / machine-learning

Adjust strategy in TensorFlow code to allow for more than 2 GPUs #4

Switch Strategy - to cross_device_ops - working for more than 2 GPUs