rstudio / tfhub

R interface to TensorFlow Hub

Loading Bert using layer_hub #19

Closed. horlar1 closed this issue 2 years ago.

horlar1 commented 4 years ago

Hi @skeydan

I'm having issues with the input shape while loading BERT with layer_hub. I'm using the example on this page: https://tensorflow.rstudio.com/guide/tfhub/examples/text_classification/

Here is my code, and the error raised when I try to pass the input to the embeddings:

library(keras)
library(tfhub)
library(readr)

# Build the model ---------------------------------------------------------
# We use the token-based text embedding trained on English Google News (130GB).
# The model is available at:
# https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1

embeddings <- layer_hub(
  handle = "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/2",
  trainable = TRUE
)

input <- layer_input(shape = shape(), dtype = "string", name = "input")

output <- input %>%
  embeddings() %>%
  layer_dense(units = 1, activation = "sigmoid")

model <- keras_model(input, output)

Error message:

WARNING:tensorflow:AutoGraph could not transform <tensorflow.python.saved_model.function_deserialization.RestoredFunction object at 0x000001D5C3EEAB70> and will run it as-is. Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: Could not find matching function to call loaded from the SavedModel.
Got: Positional arguments (3 total):
Expected these arguments to match one of the following 4 option(s):
Option 1: Positional arguments (3 total):
Option 2: Positional arguments (3 total):
Option 3: Positional arguments (3 total):
Option 4: Positional arguments (3 total):

Kindly assist.

jonathanbratt commented 4 years ago

Hi @horlar1, fellow package user here. I'm no expert, but maybe my experience can help a little. The code below worked for me.

library(keras)
library(tensorflow)
library(tfhub)

max_seq_length <- 128  # maximum number of tokens per sequence
input_word_ids <- keras::layer_input(shape=c(max_seq_length), dtype="int32",
                                     name="input_word_ids")
input_mask <- keras::layer_input(shape=c(max_seq_length), dtype="int32",
                                 name="input_mask")
segment_ids <- keras::layer_input(shape=c(max_seq_length), dtype="int32",
                                  name="segment_ids")

bert_url <- "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/2"
bert_layer <- tfhub::layer_hub(handle = bert_url,
                               trainable = TRUE)

# The outputs will actually be a list: pooled_output, sequence_output
outputs <- bert_layer(list(input_word_ids, input_mask, segment_ids))
# You could feed these outputs through other layer functions at this point
# (see the classifier sketch after the example output below).

# Make a model defined by inputs and outputs.
bert_model <- keras::keras_model(
  inputs = list(input_word_ids, input_mask, segment_ids),
  outputs = outputs
)

# Running the model:

# Some example text, tokenized and indexed. 
# [CLS]    hi    my  name    is  bert [SEP]
# 101  7632  2026  2171  2003 14324   102

# Use the above token indices as example input
this_seq <- c(101,7632,2026,2171,2003,14324,102)
this_seq_length <- length(this_seq)
input_word_ids <- matrix(c(this_seq,
                           rep(0, max_seq_length - this_seq_length)),
                         nrow = 1, ncol = max_seq_length)
# mask should be 1 for "real" tokens, then pad with zeroes
input_mask <- matrix(c(rep(1, this_seq_length),
                       rep(0, max_seq_length-this_seq_length)),
                     nrow = 1, ncol = max_seq_length)
# Only one segment here; BERT uses segment id 0 for the first segment
# (and 1 for a second segment), so it is 0 everywhere
segment_ids <- matrix(rep(0, max_seq_length),
                      nrow = 1, ncol = max_seq_length)
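
# The three inputs above follow a fixed recipe, so they could be wrapped in a
# small helper. This function is just a sketch of mine (its name and the
# single-segment assumption are not from the module):
encode_single_segment <- function(token_ids, max_seq_length) {
  n <- length(token_ids)
  stopifnot(n <= max_seq_length)
  pad <- rep(0, max_seq_length - n)
  list(
    input_word_ids = matrix(c(token_ids, pad), nrow = 1),
    input_mask     = matrix(c(rep(1, n), pad), nrow = 1),
    segment_ids    = matrix(rep(0, max_seq_length), nrow = 1)
  )
}
# e.g. encode_single_segment(this_seq, max_seq_length) reproduces the
# matrices built above.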

# Copy the example to fill out a batch.
batch_size <- 1
input_word_ids <- t(replicate(batch_size, input_word_ids, simplify = "matrix"))
input_mask <- t(replicate(batch_size, input_mask, simplify = "matrix"))
segment_ids <- t(replicate(batch_size, segment_ids, simplify = "matrix"))

bert_output <- bert_model %>% 
  predict(list(input_word_ids,
               input_mask,
               segment_ids),
          batch_size = batch_size)

str(bert_output)
# List of 2
# $ : num [1, 1:768] -0.5351 -0.2612 -0.2459 0.4087 -0.0803 ...
# $ : num [1, 1:128, 1:768] 0.17 0.593 -0.427 -0.573 0.347 ...
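
To pick up the earlier comment about feeding the outputs through other layers: here is a minimal sketch of a binary classification head on the pooled output. The dropout rate, variable names, and compile settings are my own choices, not anything prescribed by the module:

# Sketch: binary classifier on top of BERT's pooled output.
# outputs[[1]] is pooled_output; outputs[[2]] is sequence_output.
# bert_model$inputs is used because the input_* variables were
# overwritten with example matrices above.
pred <- outputs[[1]] %>%
  layer_dropout(rate = 0.1) %>%   # rate is an arbitrary choice
  layer_dense(units = 1, activation = "sigmoid")

classifier <- keras_model(inputs = bert_model$inputs,
                          outputs = pred)

classifier %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",
  metrics = "accuracy"
)

You could then fit this with labeled token-id matrices in the same three-input format as the predict() call above.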

I'm still working on understanding the output. The embeddings I get this way don't match the embeddings I get when running BERT in other ways, though they do look correlated.

Hope this helps!