google-research / albert

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Apache License 2.0

KerasLayer of albert from tensorflow hub has no trainable parameters in tensorflow 2 #112

Closed r-wheeler closed 4 years ago

r-wheeler commented 4 years ago

This was a bit confusing, as I thought this issue would have been fixed by the update in the README:

*************** New January 7, 2019 ***************

V2 TF-Hub models should be working now. See updated TF-Hub links below.

With a very minimal example, the weights of ALBERT (and other modules) do not show as trainable: the model reports 0 trainable variables even though trainable=True is set.

import tensorflow as tf
import tensorflow_hub as hub

input_ids = tf.keras.layers.Input(shape=[None], dtype=tf.int32)
input_mask = tf.keras.layers.Input(shape=[None], dtype=tf.int32)
segment_ids = tf.keras.layers.Input(shape=[None], dtype=tf.int32)

albert = hub.KerasLayer(
    "https://tfhub.dev/google/albert_xlarge/3",
    trainable=True,
    signature="tokens",
    output_key="pooled_output",
)

features = {
    "input_ids": input_ids,
    "input_mask": input_mask,
    "segment_ids": segment_ids,
}
out = albert(features)
model = tf.keras.Model(inputs=[input_ids, input_mask, segment_ids], outputs=out)
model.compile("adam", loss="sparse_categorical_crossentropy")
model.summary()

Produces the following summary:

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_28 (InputLayer)           [(None, None)]       0                                            
__________________________________________________________________________________________________
input_29 (InputLayer)           [(None, None)]       0                                            
__________________________________________________________________________________________________
input_30 (InputLayer)           [(None, None)]       0                                            
__________________________________________________________________________________________________
keras_layer_60 (KerasLayer)     (None, 2048)         59017392    input_28[0][0]                   
                                                                 input_29[0][0]                   
                                                                 input_30[0][0]                   
==================================================================================================
Total params: 59,017,392
Trainable params: 0
Non-trainable params: 59,017,392

Expected behavior: the trainable variables of the ALBERT (or BERT) module are shown as trainable parameters.

Observed behavior: the model summary shows 0 trainable parameters even when trainable is set to True.

After discussing this issue with the TensorFlow Hub developers: the problem is that the model is published in the TF1 hub.Module format rather than as a TF2 SavedModel, so hub.KerasLayer cannot expose its variables as trainable. Here is the linked issue in the tensorflow/hub GitHub repository.
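For anyone hitting the same wall, one way to tell which format a downloaded module uses is to inspect its extracted directory: TF1 hub.Module archives ship a tfhub_module.pb descriptor, while TF2 SavedModels ship a saved_model.pb. A minimal sketch (the helper name and module_dir argument are mine, not part of the TF-Hub API):

```python
import os

def detect_hub_format(module_dir):
    """Best-effort guess at the TF-Hub format of an extracted module directory."""
    # TF1 hub.Module archives contain a tfhub_module.pb descriptor;
    # loading one via hub.KerasLayer yields 0 trainable parameters.
    if os.path.isfile(os.path.join(module_dir, "tfhub_module.pb")):
        return "TF1 hub.Module"
    # TF2 SavedModels contain saved_model.pb; trainable=True works for these.
    if os.path.isfile(os.path.join(module_dir, "saved_model.pb")):
        return "TF2 SavedModel"
    return "unknown"
```

The extracted directory usually lives in the TF-Hub download cache (controlled by the TFHUB_CACHE_DIR environment variable); the exact path varies by setup.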

Can you provide a version on TensorFlow Hub that allows the models to be fine-tuned?

0x0539 commented 4 years ago

Ah, sorry for the confusion. The update in the readme was for users of TF1.15. We haven't really tested ALBERT in TF2.0 and don't have plans to release a TF2.0-compatible version of ALBERT modules (yet).

r-wheeler commented 4 years ago

Cool -- thanks for the prompt response. It would be a huge help to be able to fine-tune in TensorFlow 2.