Different behaviors on mac M1

oeddyo commented 8 months ago

Hi there,

Not sure if I got something wrong, but it seems the ReLU activation is somehow not being applied on Mac M1.

Here's exactly the same code, run on Google Colab vs. Mac M1:

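import numpy as np  # needed for the meshgrid below
import tensorflow as tf
from tensorflow import keras  # assumed: the original snippet printed `keras` without importing it
from tensorflow.keras.layers import Dense, Input, ReLU
from sklearn.datasets import make_moons
import matplotlib.pyplot as plt
from tensorflow.keras.optimizers import RMSprop

print("TensorFlow version:", tf.__version__)
print("Keras version:", keras)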

# Prepare the data
X, y = make_moons(n_samples=1000, noise=0.1)

batch_size = 128

# Define the model using Functional API
inputs = Input(shape=(2,))
x = Dense(200)(inputs)
x = ReLU()(x)  # Using ReLU as a layer
x = Dense(200)(x)
x = ReLU()(x)  # Using ReLU as a layer
outputs = Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Compile the model
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=50, batch_size=batch_size)

print(model.predict(X)[:10])

# Create a meshgrid for visualization
x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100), np.linspace(y_min, y_max, 100))

# Predict with the model
mesh = np.c_[xx.ravel(), yy.ravel()]
mesh_predictions = model.predict(mesh).round()
mesh_predictions = mesh_predictions.reshape(xx.shape)

# Plot the decision boundary
plt.contourf(xx, yy, mesh_predictions, alpha=0.3, cmap='jet')
plt.scatter(X[:, 0], X[:, 1], c=y, s=20, cmap='jet')
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("Decision Boundary")
plt.show()

On GCS: [decision-boundary plot; image not preserved]

On mac: [decision-boundary plot; image not preserved]

Both print:

TensorFlow version: 2.15.0
Keras version: <module 'keras.api._v2.keras' from '/Users/eddiexie/miniconda3/envs/machine-learning/lib/python3.10/site-packages/keras/api/_v2/keras/__init__.py'>

dugujiujian1999 commented 8 months ago

Check whether you're training with MPS enabled. If so, consider disabling it and rerunning in CPU mode; this will help determine whether MPS is contributing to the problem. Also list the devices TensorFlow sees:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
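
One way to do the CPU-only run is to hide the GPU from TensorFlow before anything else touches it. This is a minimal sketch, assuming a TensorFlow 2.x install; `tf.config.set_visible_devices` and `tf.config.list_physical_devices` are public TensorFlow APIs, and nothing here is specific to tensorflow-metal:

```python
import tensorflow as tf

# Hide every GPU (the Metal device included) from TensorFlow.
# This has to run before any op or model uses the GPU, so put it
# at the very top of the script, right after the import.
tf.config.set_visible_devices([], "GPU")

# Public-API counterpart of device_lib.list_local_devices().
print("Physical devices:", tf.config.list_physical_devices())
print("Visible GPUs:", tf.config.get_visible_devices("GPU"))
```

If the decision boundary becomes non-linear with the GPU hidden, that would point at the GPU path rather than at the model code.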

I suspect the issue might be related to the optimizer, so consider switching to SGD or another alternative to see if that resolves the problem. Also try the code below and see what you get:

relu_layer = ReLU(
        max_value=5,
        negative_slope=0.2,
        threshold=0,
)
print(relu_layer(np.array([-10, -5, 0.0, 5, 10])))

Finally, try downgrading your TF version to 2.14 to see whether the issue persists.

oeddyo commented 8 months ago

Thanks @dugujiujian1999 for getting back so fast.

For the first suggestion:

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 472974225874757572
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
locality {
  bus_id: 1
}
incarnation: 14547657671560620541
physical_device_desc: "device: 0, name: METAL, pci bus id: <undefined>"
xla_global_id: -1
]
tf.Tensor([-2. -1.  0.  5.  5.], shape=(5,), dtype=float32)
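
For comparison, the tensor above can be checked against a plain NumPy version of the piecewise definition documented for the Keras ReLU layer. This is only a sketch: `reference_relu` is a hypothetical helper written for this check, not a Keras function.

```python
import numpy as np

def reference_relu(x, max_value=None, negative_slope=0.0, threshold=0.0):
    """NumPy sketch of the piecewise function documented for keras.layers.ReLU."""
    x = np.asarray(x, dtype=np.float32)
    # Below the threshold: leaky branch, negative_slope * (x - threshold).
    # At or above the threshold: identity.
    out = np.where(x >= threshold, x, negative_slope * (x - threshold))
    # Values at or above max_value are clipped to max_value.
    if max_value is not None:
        out = np.minimum(out, max_value)
    return out

result = reference_relu([-10, -5, 0.0, 5, 10],
                        max_value=5, negative_slope=0.2, threshold=0)
print(result)
```

The values agree with the tensor printed on the Mac, which suggests the standalone layer call computes the right thing there. It does not by itself say anything about what happens inside `model.fit`.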

I tried SGD but the issue persists: the decision boundary still looks linear. I suspect something's off with ReLU for me.
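
A linear boundary is what one would expect if the ReLU layers had no effect: stacked Dense layers without an activation collapse into a single affine map, and a sigmoid on top of an affine map splits the plane along a straight line. The sketch below illustrates this in NumPy with small random weights; the weights, layer width, and helper names are made up for illustration and are not taken from the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small weights standing in for the Dense -> Dense -> Dense(1) stack.
W1, b1 = rng.normal(size=(2, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)
W3, b3 = rng.normal(size=(8, 1)), rng.normal(size=1)

def logits(x, use_relu):
    """Pre-sigmoid output of the stack, with or without ReLU between layers."""
    act = (lambda h: np.maximum(h, 0.0)) if use_relu else (lambda h: h)
    h = act(x @ W1 + b1)
    h = act(h @ W2 + b2)
    return h @ W3 + b3

def is_affine(f, n=100):
    """Midpoint test: an affine f satisfies f((a+b)/2) == (f(a)+f(b))/2."""
    a, b = rng.normal(size=(n, 2)), rng.normal(size=(n, 2))
    return bool(np.allclose(f((a + b) / 2), (f(a) + f(b)) / 2))

affine_without_relu = is_affine(lambda x: logits(x, use_relu=False))
affine_with_relu = is_affine(lambda x: logits(x, use_relu=True))
print("affine without ReLU:", affine_without_relu)
print("affine with ReLU:   ", affine_with_relu)
```

The same midpoint test could in principle be applied to the pre-sigmoid output of the trained model to check whether it behaves like an affine function on the affected machine.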

oeddyo commented 8 months ago

Downgrading to 2.14 indeed fixed the issue

sampathweb commented 8 months ago

From the conversation, this issue seems to be related to the TensorFlow Mac build. Can we move the discussion to the TensorFlow repo?

sachinprasadhs commented 8 months ago

Hi,

I was able to run this successfully on MacOS M1 using Keras 3.

Make sure you install Keras 3 using !pip install -U keras

Below is the code and the output.

import os
os.environ["KERAS_BACKEND"]="tensorflow"
import keras
from keras import layers
from keras import ops
import numpy as np
from sklearn.datasets import make_moons
import tensorflow as tf
import matplotlib.pyplot as plt

print("TensorFlow version:", tf.__version__)
print("Keras version:", keras)

# Prepare the data
X, y = make_moons(n_samples=1000, noise=0.1)

batch_size = 128

# Define the model using Functional API
inputs = layers.Input(shape=(2,))
x = layers.Dense(200)(inputs)
x = layers.ReLU()(x)  # Using ReLU as a layer
x = layers.Dense(200)(x)
x = layers.ReLU()(x)  # Using ReLU as a layer
outputs = layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Compile the model
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=50, batch_size=batch_size)

print(model.predict(X)[:10])

# Create a meshgrid for visualization
x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100), np.linspace(y_min, y_max, 100))

# Predict with the model
mesh = np.c_[xx.ravel(), yy.ravel()]
mesh_predictions = model.predict(mesh).round()
mesh_predictions = mesh_predictions.reshape(xx.shape)

# Plot the decision boundary
plt.contourf(xx, yy, mesh_predictions, alpha=0.3, cmap='jet')
plt.scatter(X[:, 0], X[:, 1], c=y, s=20, cmap='jet')
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("Decision Boundary")
plt.show()

Output:

```
Epoch 1/50
2024-01-03 11:35:02.822966: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
8/8 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6789 - loss: 0.6520
Epoch 2/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.8316 - loss: 0.4897
Epoch 3/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.8579 - loss: 0.3706
Epoch 4/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.8675 - loss: 0.3028
Epoch 5/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.8738 - loss: 0.2705
Epoch 6/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.8729 - loss: 0.2628
Epoch 7/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.8703 - loss: 0.2456
Epoch 8/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.8736 - loss: 0.2378
Epoch 9/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step - accuracy: 0.8871 - loss: 0.2232
Epoch 10/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.8970 - loss: 0.2084
Epoch 11/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.8936 - loss: 0.2040
Epoch 12/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9118 - loss: 0.1913
Epoch 13/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9234 - loss: 0.1636
Epoch 14/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.9354 - loss: 0.1548
Epoch 15/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.9279 - loss: 0.1597
Epoch 16/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9497 - loss: 0.1314
Epoch 17/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9479 - loss: 0.1186
Epoch 18/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step - accuracy: 0.9646 - loss: 0.0905
Epoch 19/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.9578 - loss: 0.0865
Epoch 20/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9743 - loss: 0.0712
Epoch 21/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9810 - loss: 0.0585
Epoch 22/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - accuracy: 0.9853 - loss: 0.0568
Epoch 23/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.9925 - loss: 0.0468
Epoch 24/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9930 - loss: 0.0394
Epoch 25/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9946 - loss: 0.0332
Epoch 26/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 32ms/step - accuracy: 0.9970 - loss: 0.0291
Epoch 27/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9993 - loss: 0.0234
Epoch 28/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9993 - loss: 0.0193
Epoch 29/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.9997 - loss: 0.0197
Epoch 30/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step - accuracy: 0.9997 - loss: 0.0177
Epoch 31/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9995 - loss: 0.0133
Epoch 32/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.9975 - loss: 0.0142
Epoch 33/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9984 - loss: 0.0122
Epoch 34/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step - accuracy: 0.9991 - loss: 0.0098
Epoch 35/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.9993 - loss: 0.0095
Epoch 36/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9991 - loss: 0.0112
Epoch 37/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9995 - loss: 0.0086
Epoch 38/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step - accuracy: 0.9975 - loss: 0.0099
Epoch 39/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.9988 - loss: 0.0079
Epoch 40/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 1.0000 - loss: 0.0071
Epoch 41/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9997 - loss: 0.0070
Epoch 42/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.9993 - loss: 0.0069
Epoch 43/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step - accuracy: 1.0000 - loss: 0.0071
Epoch 44/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 1.0000 - loss: 0.0054
Epoch 45/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 1.0000 - loss: 0.0048
Epoch 46/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 1.0000 - loss: 0.0056
Epoch 47/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - accuracy: 0.9997 - loss: 0.0048
Epoch 48/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 1.0000 - loss: 0.0047
Epoch 49/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 1.0000 - loss: 0.0042
Epoch 50/50
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 23ms/step - accuracy: 1.0000 - loss: 0.0040
32/32 ━━━━━━━━━━━━━━━━━━━━ 1s 14ms/step
[[2.4329200e-04]
 [9.9946028e-01]
 [4.2080835e-02]
 [9.5800601e-07]
 [9.9958545e-01]
 [7.6349257e-05]
 [3.7100435e-07]
 [9.9878830e-01]
 [9.9961358e-01]
 [9.9791068e-01]]
313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
```

dugujiujian1999 commented 8 months ago

I'm not quite sure as I don't have a Mac with M1 to test, but according to the description from pypi.org/project/tensorflow-metal, the currently confirmed latest supported version is only 2.14.
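
Since the outcome seems to depend on which package versions are combined, it may help to print the installed versions in one place when reporting results. A minimal sketch using only the standard library; the package list is an assumption based on the names mentioned in this thread, and `installed_versions` is a hypothetical helper:

```python
from importlib import metadata

# Assumed set of relevant distributions, taken from the names in this thread.
PACKAGES = ["tensorflow", "tensorflow-macos", "tensorflow-metal", "keras"]

def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(PACKAGES).items():
    print(f"{name:20s} {version or 'not installed'}")
```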

oeddyo commented 8 months ago

@sachinprasadhs May I know what TF version you're using?

oeddyo commented 8 months ago

Using TF 2.14.0 with Keras 3.0.0, I'm able to get the expected non-linearity:


```
tensorflow-deps           2.10.0                        0    apple
tensorflow-estimator      2.14.0                   pypi_0    pypi
tensorflow-io-gcs-filesystem 0.34.0                   pypi_0    pypi
tensorflow-macos          2.14.0                   pypi_0    pypi
tensorflow-metal          1.1.0                    pypi_0    pypi
keras                     3.0.0                    pypi_0    pypi
```

sachinprasadhs commented 8 months ago

@oeddyo, use the latest TensorFlow 2.15 with Keras 3.0.2, which are compatible.

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 7 months ago

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

keras-team / keras

Different behaviors on mac M1 #19009