keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Models that don't support XLA should throw an error when compiled with XLA #1752

Closed LukeWood closed 1 year ago

LukeWood commented 1 year ago

An example of this is EfficientNetV2. If you pass jit_compile=True when compiling the model, you'll get:

RuntimeError: `merge_call` called while defining a new graph or a tf.function. This can often happen if the function `fn` passed to `strategy.run()` contains a nested `@tf.function`, and the nested `@tf.function` contains a synchronization point, such as aggregating gradients (e.g, optimizer.apply_gradients), or if the function `fn` uses a control flow statement which contains a synchronization point in the body. Such behaviors are not yet supported. Instead, please avoid nested `tf.function`s or control flow statements that may potentially cross a synchronization boundary, for example, wrap the `fn` passed to `strategy.run` or the entire `strategy.run` inside a `tf.function` or move the control flow out of `fn`. If you are subclassing a `tf.keras.Model`, please avoid decorating overridden methods `test_step` and `train_step` in `tf.function`.

This is not very helpful for a user who doesn't have a deep understanding of TF/Keras/XLA. We can address this in backbones directly, and in ImageClassifier, OD models, etc., by checking some sort of supports_xla attribute that backbones can expose.
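
A rough sketch of what such a check might look like, in plain Python with no TensorFlow dependency. Everything here is hypothetical: `supports_xla` is only the attribute proposed above, and `Backbone`, `NonXLABackbone`, `XLAUnsupportedError`, and `check_xla_support` are illustrative stand-ins, not existing keras-cv APIs.

```python
class XLAUnsupportedError(ValueError):
    """Raised when jit_compile=True is requested for a model that cannot use XLA."""


class Backbone:
    """Hypothetical stand-in for a backbone; not the real keras-cv class."""

    # Proposed flag: backbones are assumed XLA-compatible unless they say otherwise.
    supports_xla = True

    def __init__(self, name):
        self.name = name


class NonXLABackbone(Backbone):
    """Hypothetical backbone that opts out of XLA compilation."""

    supports_xla = False


def check_xla_support(backbone, jit_compile):
    """Fail early with a readable message instead of a deep TF/XLA traceback."""
    # Treat a missing attribute as "supported" so older backbones keep working.
    if jit_compile and not getattr(backbone, "supports_xla", True):
        raise XLAUnsupportedError(
            f"Backbone '{backbone.name}' does not support XLA. "
            "Pass jit_compile=False to compile()."
        )
```

A task model such as ImageClassifier could call a helper like this at the start of its own compile() override, before delegating to the parent class, so the user sees the short message rather than the `merge_call` error.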

innat commented 1 year ago

keras-nlp uses an XLA-compatibility checker.

IMvision12 commented 1 year ago

@innat I am not getting any error. Am I doing something wrong?

import tensorflow as tf
from tensorflow import keras
import keras_cv

# Use a TPU if one is available; otherwise fall back to the default strategy.
tpu = None
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.TPUStrategy(tpu)
except Exception:
    # No TPU could be resolved or initialized.
    strategy = tf.distribute.get_strategy()

def create_model():
    with strategy.scope():
        backbone = keras_cv.models.EfficientNetV2Backbone.from_preset(
            "efficientnetv2_b0_imagenet"
        )
        model = keras_cv.models.ImageClassifier(
            backbone=backbone,
            num_classes=10,
            activation="softmax",
        )

        # Compile the model with XLA enabled (jit_compile=True)
        model.compile(
            loss="sparse_categorical_crossentropy",
            optimizer=tf.keras.optimizers.Adam(),
            metrics=["accuracy"],
            jit_compile=True,
        )

        return model

model = create_model()

ID6109 commented 1 year ago

@IMvision12, I believe @innat was providing a reference to the XLA compatibility checker used by keras-nlp and not implying that keras-cv already has one.

ianstenbit commented 1 year ago

All models should support XLA now that we're on Keras Core.