keras-team / keras-io

Keras documentation, hosted live at keras.io

Updated Named Entity Recognition using Transformers example for Keras 3 #1817

Closed: sitamgithub-MSIT closed this 2 months ago

sitamgithub-MSIT commented 2 months ago

Corresponding Issue

This PR updates the Named Entity Recognition using Transformers example for Keras 3 [TF-only backend]. Many TensorFlow ops are replaced with their `keras.ops` equivalents.
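For instance, the backend-specific calls map almost one-to-one onto `keras.ops`. A minimal sketch of the mapping (the toy array here is just for illustration, not from the PR):

```python
import numpy as np
from keras import ops

x = np.array([[1, 2, 3, 0]])  # dummy token IDs, 0 = padding

# tf.shape(x)[-1]                  -> ops.shape(x)[-1]
maxlen = ops.shape(x)[-1]
# tf.range(0, maxlen, 1)           -> ops.arange(0, maxlen, 1)
positions = ops.arange(0, maxlen, 1)
# tf.cast(cond, tf.float32)        -> ops.cast(cond, "float32")
mask = ops.cast(x > 0, "float32")
# tf.reduce_sum(t)                 -> ops.sum(t)
num_real_tokens = ops.sum(mask)

print(maxlen, positions, num_real_tokens)
```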

For reference, here is a Colab notebook demonstrating the updated example: https://colab.research.google.com/drive/10cfqKFFs0Fy9VilbV7CRGLOcD8tTpqf-?usp=sharing

cc: @fchollet

The Git diff for the changed file follows:

Changes:

```diff
diff --git a/examples/nlp/ner_transformers.py b/examples/nlp/ner_transformers.py
index c887c0bc..02f22c0b 100644
--- a/examples/nlp/ner_transformers.py
+++ b/examples/nlp/ner_transformers.py
@@ -37,8 +37,8 @@ import os

 os.environ["KERAS_BACKEND"] = "tensorflow"

-import os
 import keras
+from keras import ops
 import numpy as np
 import tensorflow as tf
 from keras import layers
@@ -94,8 +94,8 @@ class TokenAndPositionEmbedding(layers.Layer):
         self.pos_emb = keras.layers.Embedding(input_dim=maxlen, output_dim=embed_dim)

     def call(self, inputs):
-        maxlen = tf.shape(inputs)[-1]
-        positions = tf.range(start=0, limit=maxlen, delta=1)
+        maxlen = ops.shape(inputs)[-1]
+        positions = ops.arange(start=0, stop=maxlen, step=1)
         position_embeddings = self.pos_emb(positions)
         token_embeddings = self.token_emb(inputs)
         return token_embeddings + position_embeddings
@@ -270,9 +270,9 @@ class CustomNonPaddingTokenLoss(keras.losses.Loss):
             from_logits=False, reduction=None
         )
         loss = loss_fn(y_true, y_pred)
-        mask = tf.cast((y_true > 0), dtype=tf.float32)
+        mask = ops.cast((y_true > 0), dtype="float32")
         loss = loss * mask
-        return tf.reduce_sum(loss) / tf.reduce_sum(mask)
+        return ops.sum(loss) / ops.sum(mask)


 loss = CustomNonPaddingTokenLoss()
@@ -281,6 +281,7 @@ loss = CustomNonPaddingTokenLoss()
 ## Compile and fit the model
 """

+tf.config.run_functions_eagerly(True)
 ner_model.compile(optimizer="adam", loss=loss)
 ner_model.fit(train_dataset, epochs=10)

@@ -294,7 +295,7 @@ def tokenize_and_convert_to_ids(text):
 sample_input = tokenize_and_convert_to_ids(
     "eu rejects german call to boycott british lamb"
 )
-sample_input = tf.reshape(sample_input, shape=[1, -1])
+sample_input = ops.reshape(sample_input, shape=[1, -1])
 print(sample_input)

 output = ner_model.predict(sample_input)
@@ -317,10 +318,10 @@ def calculate_metrics(dataset):
     for x, y in dataset:
         output = ner_model.predict(x, verbose=0)

-        predictions = np.argmax(output, axis=-1)
-        predictions = np.reshape(predictions, [-1])
+        predictions = ops.argmax(output, axis=-1)
+        predictions = ops.reshape(predictions, [-1])

-        true_tag_ids = np.reshape(y, [-1])
+        true_tag_ids = ops.reshape(y, [-1])

         mask = (true_tag_ids > 0) & (predictions > 0)
         true_tag_ids = true_tag_ids[mask]
```
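The diff also enables `tf.config.run_functions_eagerly(True)` before `compile`, presumably to sidestep graph-tracing issues with the unreduced per-token loss on the TF backend. To illustrate the migrated loss in isolation, here is a standalone sketch of `CustomNonPaddingTokenLoss` after the change; the dummy shapes and the smoke test at the bottom are assumptions for demonstration, not part of the PR:

```python
import os

os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
import numpy as np
from keras import ops


class CustomNonPaddingTokenLoss(keras.losses.Loss):
    """Sparse categorical cross-entropy that ignores padding tokens (label 0)."""

    def call(self, y_true, y_pred):
        # reduction=None keeps the per-token loss so we can mask it ourselves.
        loss_fn = keras.losses.SparseCategoricalCrossentropy(
            from_logits=False, reduction=None
        )
        loss = loss_fn(y_true, y_pred)
        # Zero out positions whose true label is the padding tag (0).
        mask = ops.cast((y_true > 0), dtype="float32")
        loss = loss * mask
        # Average only over the non-padding positions.
        return ops.sum(loss) / ops.sum(mask)


# Smoke test on dummy data: batch of 2 sequences, 4 tokens, 3 tag classes.
y_true = np.array([[1, 2, 0, 0], [2, 1, 1, 0]])
y_pred = np.random.uniform(size=(2, 4, 3)).astype("float32")
y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)  # normalize to probabilities
print(CustomNonPaddingTokenLoss()(y_true, y_pred))
```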
sitamgithub-MSIT commented 2 months ago

> LGTM, thank you! Please update the generated files.

Yes, updated the generated files!