apple / tensorflow_macos

TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.

Transformer Hugging Face BERT model not working #271

Open bksaini078 opened 3 years ago

bksaini078 commented 3 years ago

While fine-tuning a transformers model, i.e. transformers.TFDistilBertModel.from_pretrained(pretrained_weights), I got this error message (attached as an image). Can someone please help me resolve this issue? Or has anyone been able to run the Transformer BERT models on a Mac M1?

Reference code:

import tensorflow as tf
import transformers
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model


def BERT_model(max_len, pretrained_weights):
    '''BERT model creation with pretrained weights
    max_len: input length '''
    # parameter declaration
    learning_rate = 2e-5
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

    bert = transformers.TFDistilBertModel.from_pretrained(pretrained_weights)

    # declaring inputs; BERT takes input_ids and attention_mask as input
    input_ids = Input(shape=(max_len,), dtype=tf.int32, name='input_ids')
    attention_mask = Input(shape=(max_len,), dtype=tf.int32, name='attention_mask')

    distilbert = bert(input_ids, attention_mask=attention_mask)
    # distilbert[0] is the last hidden state; keep the first token's vector
    x = distilbert[0][:, 0, :]
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.Dense(64)(x)
    x = tf.keras.layers.Dense(32)(x)

    output = tf.keras.layers.Dense(2, activation='sigmoid')(x)

    model = Model(inputs=[input_ids, attention_mask], outputs=[output])
    # compiling model
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model


# max_len, pretrained_weights, x_train and y_train are defined elsewhere
model = BERT_model(max_len, pretrained_weights)
model.fit(x_train, y_train, batch_size=8, epochs=3, validation_split=0.2, verbose=1)

haesookimDev commented 3 years ago

Change the layers to:

x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(64)(x)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(2, activation='sigmoid')(x)
output = tf.keras.layers.Dropout(0)(x)

There is a problem when the last layer has an activation function, so I add a Dropout layer that does nothing as the last layer.
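To illustrate why a Dropout layer with rate 0 should leave the output untouched, here is a minimal pure-Python sketch of inverted dropout. It is not Keras's implementation; the helper name inverted_dropout and the sample values are made up for illustration. It only mirrors the documented behaviour that kept inputs are scaled by 1 / (1 - rate), so with rate 0 nothing is dropped and nothing is rescaled.

```python
import random


def inverted_dropout(values, rate, training=True, rng=random):
    """Pure-Python sketch of inverted dropout (hypothetical helper, not Keras code)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    # Outside training, or with rate 0, nothing is dropped or rescaled.
    if not training or rate == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    # Each value is zeroed with probability `rate`; survivors are scaled up
    # so that the expected sum stays the same.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


# Hypothetical outputs of the Dense(2, activation='sigmoid') layer.
sigmoid_outputs = [0.12, 0.88]
print(inverted_dropout(sigmoid_outputs, rate=0.0))
```

With rate=0.0 the function returns the values as they came in, which is why the extra layer changes the graph structure but should not change the predictions.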

bksaini078 commented 3 years ago

Thank you for your reply, I tried the proposed approach. Unfortunately, it is showing the same error message. Did you run the BERT model successfully on your end?

jaismith commented 3 years ago

hey there @bksaini078, were you able to load the BERT from tensorflow-hub? if so, would you mind showing how you did that? i'm unable to load the BERT model using hub.KerasLayer (see #276)