Jain-Abhilash closed this issue 1 year ago
These are out-of-memory (OOM) errors. They seem to occur when making predictions with an XLNet model on a single large set of examples, regardless of the batch size set. XLNet has some sort of issue when model.predict is invoked on a large set of examples. Since this happens only with XLNet and not with other models (e.g., BERT, RoBERTa), it may be an issue with either transformers or TensorFlow rather than ktrain.
In any case, the workaround is to batchify the dataset yourself and feed the batches to predict one at a time. Here is a self-contained example of the workaround (where STEP 3 is the actual workaround):
# STEP 1: load text data
categories = ['alt.atheism', 'soc.religion.christian','comp.graphics', 'sci.med']
from sklearn.datasets import fetch_20newsgroups
train_b = fetch_20newsgroups(subset='train', categories=categories, shuffle=True)
test_b = fetch_20newsgroups(subset='test',categories=categories, shuffle=True)
(x_train, y_train) = (train_b.data, train_b.target)
(x_test, y_test) = (test_b.data, test_b.target)
# STEP 2: build and train XLNet
import ktrain
from ktrain import text
MODEL_NAME = 'xlnet-base-cased'
t = text.Transformer(MODEL_NAME, maxlen=500, class_names=train_b.target_names)
trn = t.preprocess_train(x_train, y_train)
val = t.preprocess_test(x_test, y_test)
model = t.get_classifier()
learner = ktrain.get_learner(model, train_data=trn, val_data=None, batch_size=6)
learner.fit_onecycle(5e-5, 1)
# STEP 3: make predictions
p = ktrain.get_predictor(learner.model, t)
from ktrain import utils as U
batches = U.batchify(x_test, 32)
preds = []
for batch in batches:
    preds.extend(p.predict(batch))
# STEP 4: ground truth
ground_truth = [train_b.target_names[y] for y in y_test]
# STEP 5: evaluate
from sklearn.metrics import classification_report
print(classification_report(ground_truth, preds))
# OUTPUT
#                         precision    recall  f1-score   support
#
#            alt.atheism       0.83      0.86      0.85       319
#          comp.graphics       0.98      0.96      0.97       389
#                sci.med       0.94      0.97      0.96       396
# soc.religion.christian       0.92      0.88      0.90       398
#
#               accuracy                           0.92      1502
#              macro avg       0.92      0.92      0.92      1502
#           weighted avg       0.92      0.92      0.92      1502
P.S. ktrain>=0.33.4 will suppress the progress bars automatically when running predict in a for loop.
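For anyone who wants the batching pattern from STEP 3 without relying on ktrain's utils module, here is a minimal sketch in plain Python. The names chunked and predict_in_batches are hypothetical helpers written for illustration, not part of ktrain, and the sketch assumes the predict function accepts a list of inputs and returns one prediction per input.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_batches(predict_fn: Callable[[Sequence], Sequence],
                       items: Sequence,
                       size: int = 32) -> List:
    """Call `predict_fn` once per chunk and concatenate the results.

    Hypothetical helper: assumes `predict_fn` returns one prediction
    per input item, in the same order as the inputs.
    """
    results: List = []
    for batch in chunked(items, size):
        results.extend(predict_fn(batch))
    return results
```

With the predictor from the example above, this would be called as preds = predict_in_batches(p.predict, x_test, size=32). Only one chunk is passed to the model at a time, which is the point of the workaround; the chunk size is a knob to tune against available GPU memory.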
Hey, I'm getting the error below when I try to call the validate function of the learner:
Training went well with no issues. I'm using A100 GPUs with the latest ktrain and TensorFlow. The issue occurs only with XLNet; when I ran this with DistilBERT, there were no issues at all. My code is: