Closed — pranavvp16 closed this 1 year ago
Thanks for the PR. Do you have a link to the git diff?
diff --git a/keras-io/examples/nlp/pretraining_BERT.py b/keras-core/examples/keras_io/nlp/pretraining_BERT.py
index cd2b7bc..8a19225 100644
--- a/keras-io/examples/nlp/pretraining_BERT.py
+++ b/keras-core/examples/keras_io/nlp/pretraining_BERT.py
@@ -84,14 +84,11 @@ import nltk
import random
import logging
-import tensorflow as tf
-from tensorflow import keras
+import keras_core as keras
nltk.download("punkt")
-# Only log error messages
-tf.get_logger().setLevel(logging.ERROR)
# Set random seed
-tf.keras.utils.set_random_seed(42)
+keras.utils.set_random_seed(42)
"""
### Define certain variables
@@ -463,9 +460,9 @@ Now we define our optimizer and compile the model. The loss calculation is handl
internally and so we need not worry about that!
"""
-optimizer = keras.optimizers.Adam(learning_rate=LEARNING_RATE)
+from keras.optimizers import Adam
-model.compile(optimizer=optimizer)
+model.compile(optimizer=Adam(learning_rate=LEARNING_RATE))
"""
Finally all steps are done and now we can start training our model!
@@ -507,4 +504,4 @@ model = TFBertForSequenceClassification.from_pretrained("your-username/my-awesom
In this case, the pretraining head will be dropped and the model will just be initialized with the transformer layers. A new task-specific head will be added with random weights.
-"""
+"""
\ No newline at end of file
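For reference, here is a minimal, self-contained sketch of the backend-agnostic setup the diff moves toward. The tiny Sequential model and the learning rate are hypothetical stand-ins (the example actually compiles a Hugging Face BERT pretraining model with a LEARNING_RATE defined earlier in the script); the point is just the keras_core import, the seeding, and the Adam/compile calls:

import keras_core as keras

# Set random seed (replaces tf.keras.utils.set_random_seed)
keras.utils.set_random_seed(42)

# Hypothetical stand-in model; the example compiles a Hugging Face BERT pretraining model instead
model = keras.Sequential(
    [keras.layers.Input(shape=(16,)), keras.layers.Dense(1)]
)

# Compile with the Keras Core Adam optimizer (the example passes its own LEARNING_RATE)
optimizer = keras.optimizers.Adam(learning_rate=1e-4)
model.compile(optimizer=optimizer, loss="mse")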
Training is slow even in a GPU environment. Is that usual, or am I missing something? I tried different backends and the training time stayed the same.
Well it's a HuggingFace model that's being trained. There's virtually no overlap with Keras Core (just the use of the Adam optimizer). So time is going to be constant across backends (no difference), and it's going to be slow.
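If you want to double-check which backend Keras Core is picking up while you benchmark, a minimal sketch (assuming the standard KERAS_BACKEND environment variable and the keras_core.backend.backend() helper) is below; the Hugging Face model's training itself still runs through TensorFlow, which is why the backend choice doesn't move the numbers:

import os

# Choose the backend before keras_core is imported (assumption: KERAS_BACKEND is read at import time)
os.environ["KERAS_BACKEND"] = "jax"  # or "tensorflow" / "torch"

import keras_core as keras

# Report which backend is active for the Keras Core parts of the script
print("Keras Core backend:", keras.backend.backend())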