amaiya / ktrain

ktrain is a Python library that makes deep learning and AI more accessible and easier to apply
Apache License 2.0
1.23k stars 269 forks

Can't use community transformer model #324

Closed MeranaTona closed 3 years ago

MeranaTona commented 3 years ago

Hi there!

Thank you for this simplified library. Because my CPU is slow, I am trying to use a very small version of BERT from Google on Hugging Face. Unfortunately, the model can't be located, and I can't figure out how to make it work, perhaps because I'm new.

import ktrain
from ktrain import text

MODEL_NAME = 'google/bert_uncased_L-2_H-128_A-2'  

t = text.Transformer(MODEL_NAME, maxlen=54, class_names=labels_list)  
trn = t.preprocess_train(x_train, y_train)  
val = t.preprocess_test(x_validation, y_validation)  
model = t.get_classifier()  
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=6)  

Which results in:

preprocessing train...
language: en
train sequence lengths:
    mean : 24
    95percentile : 46
    99percentile : 51
Is Multi-Label? True
preprocessing test...
language: en
test sequence lengths:
    mean : 23
    95percentile : 46
    99percentile : 51
404 Client Error: Not Found for url: https://huggingface.co/google/bert_uncased_L-2_H-128_A-2/resolve/main/tf_model.h5

I have also tried loading it like this:

from transformers import AutoConfig, TFAutoModelForSequenceClassification
config = AutoConfig.from_pretrained(MODEL_NAME)
config.num_labels = 9
model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_NAME, from_pt=True, config=config)
model.summary()
Model: "tf_bert_for_sequence_classification_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)       multiple                  4385920   
_________________________________________________________________
dropout_55 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  1161      
=================================================================
Total params: 4,387,081
Trainable params: 4,387,081
Non-trainable params: 0
_________________________________________________________________

But this results in a NoneType error:

learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=6)
learner.lr_find(show_plot=True, max_epochs=2)
simulating training for different learning rates... this may take a few moments...
AttributeError: 'NoneType' object has no attribute 'lr'

My inputs are strings and my targets are multi-label vectors over 9 labels:

x_train[1:3]
['!attention everyone!!! new scammer!!!! roblox user: xjust_natsukixx what she scams: robux i can show proof she scams if needed♡ (i know this because a friend got scammed out of 1000 robux from them!) #scammer #scammeralert #omg  #donttrust #donttrustthem #royalehighscammer',
 '" i do a wonderful work, in a wonderful way, i give wonderful service, for wonderful pay!” what a wonderful mantra!!! #beautiful #mantra  #createyourlife']

y_train[1:3]
array([[0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1, 0]], dtype=int32)

Thank you in advance.

MeranaTona commented 3 years ago

Okay, after adding the following, my second approach using TFAutoModelForSequenceClassification no longer results in the NoneType error. However, the model seems to treat the problem as multi-label regression rather than multi-label classification:

model.compile(loss='categorical_crossentropy',optimizer='adam', metrics=['accuracy'])
predictor = ktrain.get_predictor(model, t)
predictor.predict("What are we doing today?")

Out[105]: [('label1', 0.4994808),
 ('label2', 0.5707391),
 ('label3', 0.4876785),
 ('label4', 0.5431454),
 ('label5', 0.50143594),
 ('label6', 0.45981723),
 ('label7', 0.49624988),
 ('label8', 0.51856476),
 ('label9', 0.4871253)]
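
A note on the scores above: in a multi-label setup each label usually gets its own independent probability, so the values are not expected to sum to 1, and scores close to 0.5 are what an untrained classification head tends to produce. For multi-label targets, a per-label loss such as binary cross-entropy is the usual choice rather than categorical cross-entropy. The sketch below assumes sigmoid outputs and a hypothetical 0.5 decision threshold (neither is confirmed in this thread as ktrain's exact internals) and shows how such scores could be turned into label decisions.

```python
import numpy as np

def sigmoid(logits):
    """Map raw logits to independent per-label probabilities."""
    return 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))

def decide_labels(probs, class_names, threshold=0.5):
    """Keep every label whose probability reaches the (assumed) threshold."""
    return [name for name, p in zip(class_names, probs) if p >= threshold]

class_names = [f"label{i}" for i in range(1, 10)]

# Logits of exactly zero (an idealized untrained head) give 0.5 per label.
untrained_probs = sigmoid(np.zeros(len(class_names)))

# Scores reported in the comment above.
reported = [0.4994808, 0.5707391, 0.4876785, 0.5431454, 0.50143594,
            0.45981723, 0.49624988, 0.51856476, 0.4871253]

picked = decide_labels(reported, class_names)
print(picked)
```

Under these assumptions, near-0.5 scores say little until the model has actually been trained on the labeled data.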

amaiya commented 3 years ago

Hello.

The 404 client error simply means that there is no TensorFlow version of this model (which is true). This is the first time I've seen this warning, so perhaps it comes from a newer version of requests or one of its dependencies. In any case, when this happens, ktrain then tries to load the model as a PyTorch model (which succeeds here). If you run the entire code and ignore the 404 warning, everything should work (and it does for me):

# load text data
categories = ['alt.atheism', 'soc.religion.christian','comp.graphics', 'sci.med']
from sklearn.datasets import fetch_20newsgroups
train_b = fetch_20newsgroups(subset='train', categories=categories, shuffle=True)
test_b = fetch_20newsgroups(subset='test',categories=categories, shuffle=True)
(x_train, y_train) = (train_b.data, train_b.target)
(x_test, y_test) = (test_b.data, test_b.target)

# build and train model
import ktrain
from ktrain import text
MODEL_NAME = 'google/bert_uncased_L-2_H-128_A-2'  
t = text.Transformer(MODEL_NAME, maxlen=500, class_names=train_b.target_names)
trn = t.preprocess_train(x_train, y_train)
model = t.get_classifier()
learner = ktrain.get_learner(model, train_data=trn, batch_size=6)
learner.fit_onecycle(5e-5, 1)
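
The fallback described in this reply (try TensorFlow weights first, then load the PyTorch weights instead) can be sketched roughly as follows. This is an illustration, not ktrain's actual code: `load_with_fallback` and the two stub loaders are hypothetical names, and the sketch assumes that missing weights surface as an `OSError`. In real use, the stubs would be replaced by `TFAutoModelForSequenceClassification.from_pretrained(name)` and the same call with `from_pt=True`, as used earlier in this thread.

```python
def load_with_fallback(model_name, load_tf, load_pt):
    """Try TensorFlow weights first; fall back to PyTorch weights on failure."""
    try:
        return load_tf(model_name), "tf"
    except OSError as err:
        # Assumption: a missing tf_model.h5 is reported as an OSError.
        print(f"TensorFlow weights unavailable ({err}); trying PyTorch weights")
        return load_pt(model_name), "pt"

# Hypothetical stand-ins so the sketch runs without network access.
def fake_load_tf(name):
    raise OSError(f"404 Client Error: no tf_model.h5 for {name}")

def fake_load_pt(name):
    return {"name": name, "converted_from": "pytorch"}

model, source = load_with_fallback(
    "google/bert_uncased_L-2_H-128_A-2", fake_load_tf, fake_load_pt
)
print(source)
```

Passing the loaders in as arguments keeps the sketch testable offline; the 404 message is then just the logged reason for taking the second path.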

MeranaTona commented 3 years ago

Okay thank you very much :)