rstudio / tfdatasets

R interface to TensorFlow Datasets API
https://tensorflow.rstudio.com/tools/tfdatasets/

Unable to save and load models defined via the feature spec interface #53

Closed by burrisk 3 years ago

burrisk commented 5 years ago

When following along with the feature spec vignette provided here, I had trouble loading the model for re-use after saving it. Below is a reproducible example and the error that I'm getting. I think I may need to specify the custom_objects argument in the load_model_hdf5 function, but proper use is unclear for this example. Any help would be much appreciated!

library(tfdatasets)
library(rsample)
library(keras)

data(hearts)

split <- initial_split(hearts)
dataset_train <- training(split)
dataset_test <- testing(split)

spec <- feature_spec(dataset_train, target ~ .) %>% 
  step_numeric_column(
    all_numeric(), -cp, -restecg, -exang, -sex, -fbs,
    normalizer_fn = scaler_standard()
  ) %>% 
  step_categorical_column_with_vocabulary_list(thal)  %>% 
  step_bucketized_column(age, boundaries = c(18, 25, 30, 35, 40, 45, 50, 55, 60, 65))  %>% 
  step_indicator_column(thal)  %>% 
  step_embedding_column(thal, dimension = 2L)  %>% 
  step_crossed_column(c(thal, bucketized_age), hash_bucket_size = 10) %>%
  step_indicator_column(crossed_thal_bucketized_age)

spec_prep <- fit(spec)

input <- layer_input_from_dataset(dataset_train %>% select(-target))
feature_layer <-  dense_features(spec_prep)

output <- input %>% 
  layer_dense_features(feature_layer) %>% 
  layer_dense(units = 32, activation = "relu") %>% 
  layer_dense(units = 1, activation = "sigmoid")

model <- keras_model(input, output)

model %>% compile(
  loss = loss_binary_crossentropy, 
  optimizer = optimizer_adam(), 
  metrics = "binary_accuracy"
)

history <- model %>% 
  fit(dataset_train %>% select(-target), dataset_train$target, epochs = 20, validation_split = 0.2)

save_model_hdf5(model, "test.h5")
load_model_hdf5("test.h5")

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  AttributeError: 'NoneType' object has no attribute 'get'

Detailed traceback: 
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/save.py", line 143, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 162, in load_model_from_hdf5
    custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/model_config.py", line 55, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/keras/layers/serialization.py", line 98, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/generic_utils.py", line 191, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local

mfiorina commented 4 years ago

Hello @burrisk, did you ever find a solution to this issue? I've just run into exactly the same problem and am getting the same AttributeError message.

burrisk commented 4 years ago

I never found a solution to the issue. I ended up not using tfdatasets in the example because of it.

mfiorina commented 4 years ago

Thank you for your answer @burrisk!

@dfalbel (please let me know if I should tag someone else), would it be possible to get some help with this? I've run into the same issue. Reproducible code below:

library(tfdatasets)
library(rsample)
library(reticulate)
library(tensorflow)
library(tidyverse)
library(keras)

data(hearts)

# 90/10 train/test split
sample <- sample.int(n = nrow(hearts), size = floor(.9 * nrow(hearts)), replace = FALSE)

hearts_train <- hearts[sample, ]
hearts_test  <- hearts[-sample, ]

hearts_train_labels <- hearts_train$target
hearts_test_labels  <- hearts_test$target

hearts_train <- hearts_train %>%
  select(-target)
hearts_test  <- hearts_test %>%
  select(-target)

column_names <- names(hearts_train)

hearts_train <- hearts_train %>%
  as_tibble(.name_repair = "minimal") %>%
  setNames(column_names) %>%
  mutate(label = hearts_train_labels)

hearts_test <- hearts_test %>%
  as_tibble(.name_repair = "minimal") %>%
  setNames(column_names) %>%
  mutate(label = hearts_test_labels)

# Fitted feature spec: every numeric column, standardized
hearts_spec <- feature_spec(hearts_train, label ~ .) %>%
  step_numeric_column(all_numeric(), normalizer_fn = scaler_standard()) %>%
  fit()

hearts_spec

build_model <- function(train_ds, spec) {
  input <- layer_input_from_dataset(train_ds %>% select(-label))

  output <- input %>%
    layer_dense_features(dense_features(spec), dtype = tf$float32) %>%
    layer_dense(units = 64, activation = "relu") %>%
    layer_dropout(.4) %>%
    layer_dense(units = 64, activation = "relu") %>%
    layer_dropout(.4) %>%
    layer_dense(units = 64, activation = "relu") %>%
    layer_dropout(.4) %>%
    layer_dense(units = 1)

  model <- keras_model(input, output)

  model %>%
    compile(loss      = "mse",
            optimizer = "adam",
            metrics   = list("mean_absolute_error"))

  model
}

model <- build_model(hearts_train, hearts_spec)

summary(model)

history <- model %>% fit(
  x = hearts_train %>% select(-label),
  y = hearts_train$label,
  epochs = 100,
  batch_size = 64,  # was misspelled as `batchsize`
  validation_split = 0.2
)

# Saving succeeds; loading raises the error shown below
model %>% save_model_tf("~/Desktop/model")

new_model <- load_model_tf("~/Desktop/model")

Console:

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  AttributeError: 'NoneType' object has no attribute 'get'

Detailed traceback: 
  File "/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 190, in load_model
    return saved_model_load.load(filepath, compile)
  File "/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 116, in load
    model = tf_load.load_internal(path, loader_cls=KerasObjectLoader)
  File "/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 604, in load_internal
    export_dir)
  File "/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 188, in __init__
    super(KerasObjectLoader, self).__init__(*args, **kwargs)
  File "/Users/marc-andreafiorina/

> tf_config()
TensorFlow v2.2.0 ()
Python v3.6 (~/Library/r-miniconda/envs/r-reticulate/bin/python)

This was run on a MacBook. The keras, reticulate, and tensorflow packages were installed using remotes::install_github().

mfiorina commented 4 years ago

Extended traceback for those more Python-savvy than I am:

6: stop(structure(list(message = "AttributeError: 'NoneType' object has no attribute 'get'
Detailed traceback: 
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py\", line 190, in load_model
    return saved_model_load.load(filepath, compile)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 116, in load
    model = tf_load.load_internal(path, loader_cls=KerasObjectLoader)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py\", line 604, in load_internal
    export_dir)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 188, in __init__
    super(KerasObjectLoader, self).__init__(*args, **kwargs)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py\", line 123, in __init__
    self._load_all()
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 209, in _load_all
    self._layer_nodes = self._load_layers()
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 309, in _load_layers
    layers[node_id] = self._load_layer(proto.user_object, node_id)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 335, in _load_layer
    obj, setter = self._revive_from_config(proto.identifier, metadata, node_id)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 353, in _revive_from_config
    self._revive_layer_from_config(metadata, node_id))
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py\", line 408, in _revive_layer_from_config
    generic_utils.serialize_keras_class_and_config(class_name, config))
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py\", line 109, in deserialize
    printable_module_name='layer')
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py\", line 373, in deserialize_keras_object
    list(custom_objects.items())))
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py\", line 471, in from_config
    config['feature_columns'], custom_objects=custom_objects)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/feature_column/serialization.py\", line 190, in deserialize_feature_columns
    for c in configs
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/feature_column/serialization.py\", line 190, in <listcomp>
    for c in configs
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/feature_column/serialization.py\", line 142, in deserialize_feature_column
    columns_by_name=columns_by_name)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py\", line 2944, in from_config
    config['normalizer_fn'], custom_objects=custom_objects)
  File \"/Users/marc-andreafiorina/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py\", line 390, in deserialize_keras_object
    obj = module_objects.get(object_name)
", 
       call = py_call_impl(callable, dots$args, dots$keywords), 
       cppstack = structure(list(file = "", line = -1L, stack = c("1   reticulate.so                       0x000000010518aace _ZN4Rcpp9exceptionC2EPKcb + 222", 
       "2   reticulate.so                       0x0000000105192ba5 _ZN4Rcpp4stopERKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE + 53", 
       "3   reticulate.so                       0x000000010519fffb _Z12py_call_impl11PyObjectRefN4Rcpp6VectorILi19ENS0_15PreserveStorageEEES3_ + 795", 
       "4   reticulate.so                       0x0000000105180ab4 _reticulate_py_call_impl + 132", 
       "5   libR.dylib                          0x0000000100d22922 R_doDotCall + 1458", 
       "6   libR.dylib                          0x0000000100d6e7fa bcEval + 105338", 
       "7   libR.dylib                          0x0000000100d54541 Rf_eval + 385", 
       "8   libR.dylib                          0x0000000100d74a71 R_execClosure + 2193", 
       "9   libR.dylib                          0x0000000100d73849 Rf_applyClosure + 473", 
       "10  libR.dylib                          0x0000000100d5b728 bcEval + 27304", 
       "11  libR.dylib                          0x0000000100d54541 Rf_eval + 385", 
       "12  libR.dylib                          0x0000000100d74a71 R_execClosure + 2193", 
       "13  libR.dylib                          0x0000000100d73849 Rf_applyClosure + 473", 
       "14  libR.dylib                          0x0000000100d54a16 Rf_eval + 1622", 
       "15  libR.dylib                          0x0000000100cf06cf do_docall + 639", 
       "16  libR.dylib                          0x0000000100d5be78 bcEval + 29176", 
       "17  libR.dylib                          0x0000000100d54541 Rf_eval + 385", 
       "18  libR.dylib                          0x0000000100d74a71 R_execClosure + 2193", 
       "19  libR.dylib                          0x0000000100d73849 Rf_applyClosure + 473", 
       "20  libR.dylib                          0x0000000100d5b728 bcEval + 27304", 
       "21  libR.dylib                          0x0000000100d54541 Rf_eval + 385", 
       "22  libR.dylib                          0x0000000100d74a71 R_execClosure + 2193", 
       "23  libR.dylib                          0x0000000100d73849 Rf_applyClosure + 473", 
       "24  libR.dylib                          0x0000000100d5b728 bcEval + 27304", 
       "25  libR.dylib                          0x0000000100d54541 Rf_eval + 385", 
       "26  libR.dylib                          0x0000000100d74a71 R_execClosure + 2193", 
       "27  libR.dylib                          0x0000000100d73849 Rf_applyClosure + 473", 
       "28  libR.dylib                          0x0000000100d54a16 Rf_eval + 1622", 
       "29  libR.dylib                          0x0000000100d7829d do_set + 2749", 
       "30  libR.dylib                          0x0000000100d54739 Rf_eval + 889", 
       "31  libR.dylib                          0x0000000100da9cba Rf_ReplIteration + 810", 
       "32  libR.dylib                          0x0000000100dab1df run_Rmainloop + 207", 
       "33  rsession                            0x0000000100906960 _ZN13rstudio_boost4asio6detail24descriptor_write_op_baseINS0_15const_buffers_1EE10do_performEPNS1_10reactor_opE + 643200", 
       "34  rsession                            0x00000001008df3d9 _ZN13rstudio_boost4asio6detail24descriptor_write_op_baseINS0_15const_buffers_1EE10do_performEPNS1_10reactor_opE + 482041", 
       "35  rsession                            0x0000000100153472 _ZN13rstudio_boost4asio6detail30reactive_socket_accept_op_baseINS0_12basic_socketINS0_2ip3tcpEEES5_E10do_performEPNS1_10reactor_opE + 399250", 
       "36  libdyld.dylib                       0x00007fff598b9015 start + 1"
       )), class = "Rcpp_stack_trace")), class = c("Rcpp::exception", 
   "C++Error", "error", "condition")))
5: py_call_impl(callable, dots$args, dots$keywords)
4: (structure(function (...) 
   {
       dots <- py_resolve_dots(list(...))
       result <- py_call_impl(callable, dots$args, dots$keywords)
       if (convert) 
           result <- py_to_r(result)
       if (is.null(result)) 
           invisible(result)
       else result
   }, class = c("python.builtin.function", "python.builtin.object"
   ), py_object = <environment>))(filepath = "/Users/marc-andreafiorina/Desktop/model", 
       custom_objects = NULL, compile = TRUE)
3: do.call(keras$models$load_model, args)
2: load_model(filepath, custom_objects, compile)
1: load_model_tf("model")

sgvignali commented 3 years ago

I ran into exactly the same problem. The same error is raised when loading a model saved with either save_model_tf() or save_model_hdf5():

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  AttributeError: 'NoneType' object has no attribute 'get' 

I'm not providing an example because the ones posted above already reproduce the problem well. Could someone please take a look at this?

sgvignali commented 3 years ago

Digging a bit more into this problem, it seems to happen when using the normalizer_fn argument in step_numeric_column.

Here is a simple reproducible example. The following code raises the error:

library(tfdatasets)
library(keras)
library(dplyr)

data(hearts)
data <- hearts %>% select(age, target)

spec <- feature_spec(data, target ~ .) %>% 
  step_numeric_column(age, normalizer_fn = scaler_standard()) %>% 
  fit()

input <- layer_input_from_dataset(data %>% select(-target))
output <- input %>% 
  layer_dense_features(dense_features(spec)) %>% 
  layer_dense(units = 1, activation = "sigmoid")
model <- keras_model(input, output)

model %>% compile(
  loss = "binary_crossentropy", 
  optimizer = "adam", 
  metrics = "binary_accuracy"
)

# All the following fail with the same error
save_model_tf(model, "my_model")
new_model <- load_model_tf("my_model")

save_model_hdf5(model, "my_model.h5")
new_model <- load_model_hdf5("my_model.h5")

old_model <- serialize_model(model)
new_model <- unserialize_model(old_model)

The same happens with scaler_min_max(). Without the normalizer_fn argument, everything works smoothly:

data(hearts)
data <- hearts %>% select(age, target)

spec <- feature_spec(data, target ~ .) %>% 
  step_numeric_column(age) %>% 
  fit()

input <- layer_input_from_dataset(data %>% select(-target))
output <- input %>% 
  layer_dense_features(dense_features(spec)) %>% 
  layer_dense(units = 1, activation = "sigmoid")
model <- keras_model(input, output)

model %>% compile(
  loss = "binary_crossentropy", 
  optimizer = "adam", 
  metrics = "binary_accuracy"
)

# All of the following work fine
save_model_tf(model, "my_model")
new_model <- load_model_tf("my_model")

save_model_hdf5(model, "my_model.h5")
new_model <- load_model_hdf5("my_model.h5")

old_model <- serialize_model(model)
new_model <- unserialize_model(old_model)

I've also tested the following specification, which includes other column types, and the problem really does seem to be caused by the normalizer function. Here too, with the normalizer_fn argument removed, the error doesn't occur:

spec <- feature_spec(hearts, target ~ .) %>% 
  step_numeric_column(
    all_numeric(), -cp, -restecg, -exang, -sex, -fbs
  ) %>% 
  step_categorical_column_with_vocabulary_list(thal) %>% 
  step_bucketized_column(age, boundaries = c(18, 25, 30, 35, 40, 45, 50, 55, 60, 65)) %>% 
  step_indicator_column(thal) %>% 
  step_embedding_column(thal, dimension = 2) %>% 
  step_crossed_column(c(thal, bucketized_age), hash_bucket_size = 10) %>%
  step_indicator_column(crossed_thal_bucketized_age) %>% 
  fit()
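
A possible workaround, offered only as a sketch and not as a confirmed fix: since the failure appears tied to normalizer_fn, the numeric columns could be standardized in plain R before the feature spec is built, so that no normalizer function has to be stored with the feature columns. The helper names fit_scaler() and apply_scaler() are hypothetical (they are not part of tfdatasets), and the toy data frame only stands in for the numeric columns of hearts.

```r
# Hypothetical helpers, base R only: learn mean/sd on the training data,
# then apply the same constants to any data frame.
fit_scaler <- function(df, cols) {
  list(
    cols = cols,
    mean = vapply(df[cols], mean, numeric(1), na.rm = TRUE),
    sd   = vapply(df[cols], sd,   numeric(1), na.rm = TRUE)
  )
}

apply_scaler <- function(df, scaler) {
  for (col in scaler$cols) {
    df[[col]] <- (df[[col]] - scaler$mean[[col]]) / scaler$sd[[col]]
  }
  df
}

# Toy data standing in for numeric columns of `hearts`
train <- data.frame(age = c(40, 50, 60), trestbps = c(120, 130, 140))

scaler       <- fit_scaler(train, c("age", "trestbps"))
train_scaled <- apply_scaler(train, scaler)
```

The scaled data frame would then go through feature_spec() with step_numeric_column() and no normalizer_fn, which is the configuration that saved and loaded without error in the tests above. The same scaler object has to be applied to any test or new data before prediction.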

I hope this helps to solve the issue. It is very important to be able to save the model. It would also be nice to be able to save the fitted specification somehow. Is there a way to do that?
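
On saving fitted state: I don't know of a confirmed way to serialize the fitted spec itself. As a rough sketch, assuming the only fitted state that matters is a set of normalization constants computed in R, those can be stored as an ordinary R object with saveRDS() and restored with readRDS(). The numbers below are made-up placeholders, not statistics from hearts.

```r
# Normalization constants computed on the training data (placeholder values)
scaler <- list(mean = c(age = 54.4), sd = c(age = 9.1))

# Persist the constants next to the saved model
path <- tempfile(fileext = ".rds")
saveRDS(scaler, path)

# Later, possibly in a new R session: restore and apply to new data
restored <- readRDS(path)
new_data <- data.frame(age = c(45.3, 63.5))
new_data$age <- (new_data$age - restored$mean[["age"]]) / restored$sd[["age"]]
```

This only covers plain R values. Whether the objects inside a fitted feature spec survive saveRDS() is something I have not verified.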