rstudio / tfhub

R interface to TensorFlow Hub

How to add layer_gru to tf_hub models? #4

Closed · turgut090 closed this issue 5 years ago

turgut090 commented 5 years ago

Hi @dfalbel, this package is extremely helpful! Thank you for the enormous contribution!

I have two questions about the examples:

1) The first relates to https://github.com/rstudio/tfhub/blob/master/vignettes/examples/feature_column.R:

# Build the feature spec --------------------------------------------------

spec <- dataset_train %>%
  feature_spec(AdoptionSpeed ~ .) %>%
  step_text_embedding_column(
    Description,
    module_spec = "https://tfhub.dev/google/universal-sentence-encoder/2"
    ) %>%
  step_image_embedding_column(
    img,
    module_spec = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/3"
  ) %>%
  step_numeric_column(Age, Fee, Quantity, normalizer_fn = scaler_standard()) %>%
  step_categorical_column_with_vocabulary_list(
    has_type("string"), -Description, -RescuerID, -img_path, -PetID, -Name
  ) %>%
  step_embedding_column(Breed1:Health, State)

I could not find step_text_embedding_column() in tfdatasets. Did you mean hub_text_embedding_column() from tfhub?
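
(For reference, a minimal sketch of that tfhub alternative, assuming hub_text_embedding_column() mirrors the Python hub.text_embedding_column(key, module_spec) API; the column name used here is from the vignette, the rest is an assumption:)

library(tfhub)

# Hypothetical usage: build the text embedding feature column directly
# with tfhub, bypassing the not-yet-available step_text_embedding_column().
description_col <- hub_text_embedding_column(
  key = "Description",
  module_spec = "https://tfhub.dev/google/universal-sentence-encoder/2"
)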

2) How can I reshape the output of TF Hub models? For example, an LSTM requires 3D input, so for text classification I need to reshape the TF Hub module's output into 3D before feeding it to the LSTM. Is k_reshape the key?

embeddings <- layer_hub(
  handle = "https://tfhub.dev/google/tf2-preview/nnlm-en-dim128-with-normalization/1",
  trainable = FALSE
)

input <- layer_input(shape = shape(), dtype = "string")

output <- input %>%
  embeddings() %>%
  # keras::k_reshape(shape = c(1, 1, 4096)) %>%
  bidirectional(layer_lstm(units = 80, return_sequences = TRUE)) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = 6, activation = "sigmoid")

dfalbel commented 5 years ago

Hi @henry090, thanks very much!

For 1): it should live in tfdatasets; I think it's still in a PR.

For 2): you can't call k_reshape() directly, but you should be able to use layer_lambda() to wrap it.
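
(A minimal sketch of that suggestion, not from the thread, assuming `embeddings` is the layer_hub() module defined above, which returns a [batch, 128] embedding:)

library(keras)

# Wrap k_reshape() in layer_lambda() so the reshape runs inside the model
# graph. Adding a time axis of length 1 turns [batch, 128] into
# [batch, 1, 128], the 3D shape layer_lstm() expects.
input <- layer_input(shape = shape(), dtype = "string")

output <- input %>%
  embeddings() %>%
  layer_lambda(f = function(x) k_reshape(x, shape = c(-1, 1, 128))) %>%
  bidirectional(layer_lstm(units = 80, return_sequences = TRUE))

Note that tf.reshape() (and hence k_reshape()) allows at most one -1 dimension, since only one size can be inferred.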

turgut090 commented 5 years ago

@dfalbel, according to @vbardiovskyg, this step should help reshape the TF Hub output into 3D, but I still experience some issues:

Assuming that after preprocessing, your string tensor is a dense tensor (this will be needed to feed into LSTM anyway), you can reshape to [None] before passing to the module, then reshape back:

import tensorflow as tf  # `embed` is assumed to be a loaded TF Hub module

words = tf.constant(["cat is on the mat".split(), "dog is in the fog".split()])
words = tf.reshape(words, [-1])
result = embed(words)
result = tf.reshape(result, [2, 5, 128])  # the target shape can also be built dynamically with tf.concat, tf.shape(words) and [-1]
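
(The same idea in R, as a sketch, assuming `embed` was loaded with tfhub::hub_load() and maps a 1-D string tensor to [n, 128]:)

library(tensorflow)
library(tfhub)

# embed <- hub_load("https://tfhub.dev/google/tf2-preview/nnlm-en-dim128-with-normalization/1")
words <- tf$constant(rbind(c("cat", "is", "on", "the", "mat"),
                           c("dog", "is", "in", "the", "fog")))   # shape [2, 5]
flat <- tf$reshape(words, list(-1L))               # flatten to [10]
result <- embed(flat)                              # [10, 128]
result <- tf$reshape(result, list(2L, 5L, 128L))   # back to [2, 5, 128]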

But in Keras, should it be in the following form?

input <- layer_input(shape = shape(), dtype = "string")

output <- embeddings(input)

output <- layer_lambda(f = function(x) k_reshape(x, shape = list(-1, -1, 128)))(output)

output <- output %>%
  bidirectional(layer_lstm(units = 80, return_sequences = TRUE)) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = 6, activation = "sigmoid")

Output:

2019-08-23 23:17:54.283133: W tensorflow/core/framework/op_kernel.cc:1546] OP_REQUIRES failed at reshape_op.h:53 : Invalid argument: Only one input size may be -1, not both 0 and 1
2019-08-23 23:17:54.283327: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Invalid argument: Only one input size may be -1, not both 0 and 1
     [[{{node lambda_17/Reshape}}]]
 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  InvalidArgumentError:  Only one input size may be -1, not both 0 and 1
     [[node lambda_17/Reshape (defined at /keras/engine/training.py:643) ]] [Op:__inference_keras_scratch_graph_1243]

Function call stack:
keras_scratch_graph 

UPDATE: I changed the shape to the following form and it works. Is that OK?

input <- layer_input(shape = shape(), dtype = "string")

output <- embeddings(input)

output <- layer_lambda(f = function(x) k_reshape(x, shape = list(-1, 1, 128)))(output)

output <- output %>%
  bidirectional(layer_lstm(units = 80, return_sequences = TRUE)) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = 6, activation = "sigmoid")
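
(For completeness, a sketch of assembling and compiling the model from the layers above; the optimizer and loss are assumptions, not from the thread:)

library(keras)

model <- keras_model(input, output)

model %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",   # assumed: 6 sigmoid units suggest a multi-label objective
  metrics = "accuracy"
)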