kamalkraj / ALBERT-TF2.0

ALBERT model Pretraining and Fine Tuning using TF2.0
Apache License 2.0

tf.saved_model.save and predict a single value #25

Closed Bidek56 closed 4 years ago

Bidek56 commented 4 years ago

To save the model, I added `tf.saved_model.save(model, os.path.join(FLAGS.output_dir, "1"))` after the training loop, which produces the `assets`, `saved_model.pb` and `variables` outputs.

From there, I am trying to load the model and predict a single value:

loaded = tf.saved_model.load( os.path.join(model_dir, "1") )

tokenizer = tokenization.FullTokenizer(vocab_file=None,spm_model_file=spm_model_file, do_lower_case=True)

text_a = "the movie was not good"
example = classifier_data_lib.InputExample(guid=0, text_a=text_a, text_b=None, label=0)

labels = [0, 1]
max_seq_length = 128

feature = classifier_data_lib.convert_single_example(ex_index=0, example=example, label_list=labels, max_seq_length=max_seq_length, tokenizer=tokenizer)

test_input_word_ids = tf.convert_to_tensor([feature.input_ids], dtype=tf.int32, name='input_word_ids')
test_input_mask     = tf.convert_to_tensor([feature.input_mask], dtype=tf.int32, name='input_mask')
test_input_type_ids = tf.convert_to_tensor([feature.segment_ids], dtype=tf.int32, name='input_type_ids')

logit = loaded.signatures["serving_default"](input_mask=test_input_mask, input_type_ids=test_input_type_ids, input_word_ids=test_input_word_ids)

pred = tf.argmax(logit['output'], axis=-1, output_type=tf.int32)
prob = tf.nn.softmax(logit['output'], axis=-1)

print(f'Prediction: {pred} Probabilities: {prob}')

This solution works for a single value. Thanks
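
If the input and output names of the exported signature are ever in doubt, they can be checked directly on the loaded concrete function (a small sketch using the `loaded` object from above; the key names are just what this particular export happens to use):

```python
# Sketch: inspect the exported signature to confirm the tensor names used above.
infer = loaded.signatures["serving_default"]
print(infer.structured_input_signature)  # should list input_word_ids, input_mask, input_type_ids
print(infer.structured_outputs)          # should contain the 'output' key used for the logits
```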

Bidek56 commented 4 years ago

It turns out that, due to a TF 2.0 bug, loading the model from an HDF5 file is not possible.

birdmw commented 4 years ago

> It turns out that, due to a TF 2.0 bug, loading the model from an HDF5 file is not possible.

Well... we are loading from an init checkpoint stored in HDF5 when we pass the `--init_checkpoint=${ALBERT_DIR}/tf2_model.h5` parameter, so I suppose there is some support for it?

Bidek56 commented 4 years ago

Loading `--init_checkpoint=${ALBERT_DIR}/tf2_model.h5` works; what does not work is saving a trained model to a `.h5` file and using it for predictions.
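
For what it's worth, the asymmetry seems to be that `--init_checkpoint` only restores weight values into an already-built model, while the failing case is serializing the whole trained model to HDF5. A rough sketch of the difference, assuming the `model` and `FLAGS` objects from the training script (`albert_dir` is a placeholder path):

```python
# Restoring weights from the HDF5 checkpoint into an already-built model works,
# because only the weight values are read:
model.load_weights(albert_dir + "/tf2_model.h5")   # albert_dir: placeholder path

# Saving the whole trained model to HDF5 is the part that breaks here
# (the thread attributes it to a TF 2.0 bug):
# model.save(os.path.join(FLAGS.output_dir, "trained_model.h5"))  # fails

# Exporting to the SavedModel format works, as in the first comment:
tf.saved_model.save(model, os.path.join(FLAGS.output_dir, "1"))
```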

birdmw commented 4 years ago

> Loading `--init_checkpoint=${ALBERT_DIR}/tf2_model.h5` works; what does not work is saving a trained model to a `.h5` file and using it for predictions.

I see what you are saying. Are you trying to make a frozen inference graph, like this? https://datascience.stackexchange.com/questions/33975/what-is-the-difference-between-tensorflow-saved-model-pb-and-frozen-inference-gr

Bidek56 commented 4 years ago

Just trying to save the trained model for subsequent serving with Flask.
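
For reference, a minimal sketch of that kind of Flask setup, assuming the SavedModel exported above and a hypothetical `to_features()` helper that wraps the tokenizer and `convert_single_example` code from the first comment:

```python
# Minimal Flask serving sketch (hypothetical, not from this repo). Assumes the
# SavedModel exported above and a to_features() helper that returns three
# int32 tensors of shape [1, max_seq_length].
import os
import tensorflow as tf
from flask import Flask, request, jsonify

app = Flask(__name__)
loaded = tf.saved_model.load(os.path.join(model_dir, "1"))  # model_dir as above
infer = loaded.signatures["serving_default"]

@app.route("/predict", methods=["POST"])
def predict():
    text = request.json["text"]
    word_ids, mask, type_ids = to_features(text)  # hypothetical helper
    logits = infer(input_word_ids=word_ids,
                   input_mask=mask,
                   input_type_ids=type_ids)["output"]
    probs = tf.nn.softmax(logits, axis=-1)
    return jsonify({"label": int(tf.argmax(probs, axis=-1)[0]),
                    "probs": probs.numpy().tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```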

birdmw commented 4 years ago

We're in the same boat. We are doing it on Databricks. We had some extra errors with the callbacks, so we commented them out, borrowed your code, put it in a custom callback, and it works now. So cheers.

```python
class MyCustomCallback(tf.keras.callbacks.Callback):

    def on_train_batch_begin(self, batch, logs=None):
        print('Training: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))

    def on_train_batch_end(self, batch, logs=None):
        print('Training: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))
        print("saving model as per callback to:", os.path.join(FLAGS.output_dir, "1"))
        tf.saved_model.save(model, os.path.join(FLAGS.output_dir, "1"))
        print("model saved")

    def on_test_batch_begin(self, batch, logs=None):
        print('Evaluating: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))

    def on_test_batch_end(self, batch, logs=None):
        print('Evaluating: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))
```

a la https://www.tensorflow.org/guide/keras/custom_callback
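
For reference, a callback like this is normally passed in through the `callbacks` argument (a generic Keras sketch; `train_dataset` and `num_epochs` are placeholders, and the wiring inside this repo's own training loop may differ):

```python
# Generic Keras usage of the callback above; placeholders, not repo-specific code.
model.fit(train_dataset, epochs=num_epochs, callbacks=[MyCustomCallback()])
```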

birdmw commented 4 years ago

@Bidek56 do you mind sharing a little more information? After exactly which line, in which file, did you add `tf.saved_model.save(model, os.path.join(FLAGS.output_dir, "1"))`? Also, why are you doing an os.path.join? Are you trying to make a subfolder called "1" to save to?

Bidek56 commented 4 years ago

I have added the following line of code after this line:

tf.saved_model.save(model, os.path.join(FLAGS.output_dir, "1"))

os.path.join just adds a /1 subfolder to output_dir; it's plain Python.
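
For example (a trivial illustration with a placeholder path; the numbered subfolder also happens to match the versioned directory layout that TensorFlow Serving expects):

```python
import os

output_dir = "/tmp/albert_output"     # placeholder for FLAGS.output_dir
print(os.path.join(output_dir, "1"))  # -> /tmp/albert_output/1
```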

008karan commented 4 years ago

@birdmw Can you tell me what changes you made to add your custom callbacks? When I run training, it only prints out results at the end of each epoch, but I want to see them at every step. I am not able to figure it out. Can you help?
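
One way to get that, sketched along the lines of the MyCustomCallback above and assuming a Keras-style callback hook is available in your training path (not tested against this repo's training loop), is to print the reported metrics in `on_train_batch_end`:

```python
import tensorflow as tf

# Sketch: print whatever metrics are reported after every batch instead of
# waiting for the end-of-epoch summary. Assumes a Keras-style callback hook.
class PerStepLogger(tf.keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        logs = logs or {}
        print("step {}: {}".format(batch, {k: float(v) for k, v in logs.items()}))
```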