rstudio / tfprobability

R interface to TensorFlow Probability
https://rstudio.github.io/tfprobability/

Prediction is not possible after tf$compat$v1$disable_v2_behavior() #156

Open TomasVrzal opened 2 years ago

TomasVrzal commented 2 years ago

I have the following problem:

I have a pretrained model. To load it from disk, I must first run the following code to disable v2 behavior:

tensorflow::tf$compat$v1$disable_v2_behavior()

Otherwise, the following error occurs:

Error in py_call_impl(callable, dots$args, dots$keywords) :
  TypeError: Expected `trainable` argument to be a boolean, but got: None

That is not the main problem, though. After training a model with tfprobability (with layer_distribution_lambda as the last layer), I am not able to run prediction:

pr <- model %>% predict(test_images[1:10,,,])
Error in py_call_impl(callable, dots$args, dots$keywords) :
  ValueError: Error when checking input: expected conv2d_8_input to have 4 dimensions, but got array with shape (10, 28, 28)

The error says that the data is in the wrong shape (NO IT IS NOT!!). This error occurs only when v2 behaviour is disabled.

Can you please help me solve this problem?

t-kalinowski commented 2 years ago

Hi, thanks for filing.

Can you help me reproduce the error on my machine? I would need a minimal code snippet that defines a model, saves it, loads it, and calls predict() to generate the error.

TomasVrzal commented 2 years ago

Thank you for your reply. Here is a toy example that shows my problem (MNIST dataset; the model is turned into a regression just for learning purposes). The code runs without problems when I do not call disable_v2_behavior(). However, I need to call it to load one of my pretrained models (github/DeepReI).

tensorflow::tf$compat$v1$disable_v2_behavior()
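library(keras)
library(tensorflow)     # provides tf, used for tf$math$softplus below
library(tfprobability)  # provides layer_distribution_lambda, tfd_normal, tfd_log_prob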

mnist <- dataset_mnist()
train_images <- mnist$train$x
train_labels <- mnist$train$y
test_images <- mnist$test$x
test_labels <- mnist$test$y

train_images <- array_reshape(train_images, c(60000, 28, 28, 1))
train_images <- train_images / 255

test_images <- array_reshape(test_images, c(10000, 28, 28, 1))
test_images <- test_images / 255

train_labels <- array_reshape(train_labels, c(60000, 1))
test_labels <- array_reshape(test_labels, c(10000, 1))

model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(28, 28, 1)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu")

model <- model %>%
  layer_flatten() %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 2, activation = "linear") %>%
  layer_distribution_lambda(function(x)
    tfd_normal(loc = x[, 1, drop = FALSE],
               # ignore on first read, we'll come back to this
               # scale = 1e-3 + 0.05 * tf$math$softplus(x[, 2, drop = FALSE])
               scale = 1e-3 + tf$math$softplus(x[, 2, drop = FALSE])
    )
  )

negloglik <- function(y, model) - (model %>% tfd_log_prob(y))

model %>% compile(
  optimizer = "adam",
  loss = negloglik,
  metrics = c("mae")
)
model %>% fit(
  train_images, train_labels,
  epochs = 200, batch_size=64, validation_split = 0.2
)

pr <- model %>% predict(test_images[1:10,,,])

t-kalinowski commented 2 years ago

There are two issues in the reprex:

  1. In the predict call, you're passing an array of the wrong shape. You need to slice with drop = FALSE to prevent dropping the last size-1 axis (test_images[1:10,,,,drop=FALSE])

  2. In your fit call, you need to explicitly cast train_labels to double with as.double(); otherwise it is passed in as an integer array and the call fails.
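
Both points can be checked in base R without keras. The following is a minimal sketch that uses small made-up arrays in place of the MNIST data (the array sizes and label values are assumptions for illustration only). It also shows one detail that may matter: as.double() strips the dim attribute, so if the shape of the labels needs to be preserved, setting storage.mode is one way to convert them.

```r
# Stand-in for the MNIST test images: 20 "images" of 28 x 28 with 1 channel.
# (Hypothetical data; the real array in this thread has 10000 images.)
test_images <- array(runif(20 * 28 * 28), dim = c(20, 28, 28, 1))

# Default slicing drops every axis of length 1, including the channel axis.
dropped <- test_images[1:10, , , ]
dim(dropped)   # 10 28 28

# drop = FALSE keeps all four axes, which is what the conv layer expects.
kept <- test_images[1:10, , , , drop = FALSE]
dim(kept)      # 10 28 28 1

# Stand-in for integer labels, reshaped to one column as in the reprex.
train_labels <- array(c(5L, 0L, 4L, 1L), dim = c(4, 1))
typeof(train_labels)   # "integer"

# as.double() converts the values but removes the dim attribute.
labels_vec <- as.double(train_labels)
dim(labels_vec)        # NULL

# Changing the storage mode converts the values and keeps the shape.
labels_mat <- train_labels
storage.mode(labels_mat) <- "double"
dim(labels_mat)        # 4 1
```

Whether fit() needs the labels as a column matrix or accepts a plain vector is not verified here; the sketch only shows what each conversion does to the array.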

With those two fixes applied, the remaining error is:

TypeError: Can not convert a Normal into a Tensor or Operation.

Unfortunately, this error is introduced by the Python tensorflow_probability package.

One thing you can do is re-save your model so that it no longer requires tensorflow::tf$compat$v1$disable_v2_behavior() to work. To do this, call:

tensorflow::tf$compat$v1$disable_v2_behavior()
model <- ... # load_model
save_model_weights_tf(model, "model_weights.tf")

Then, in a fresh R session without the disable_v2_behavior() call, run the code that defines the model and load the saved weights into the new model:

model <- ... # Run the R code that defines the same model, but doesn't fit it.
model <- keras::load_model_weights_tf(model, "model_weights.tf")

# now predict works
pr <- model %>% predict(test_images[1:10,,,,drop = FALSE])

# resave the model with the latest TF/TFP so you don't have to do this dance every time:
save_model_tf(model, "same_model_new_format.tf")

TomasVrzal commented 2 years ago

Thank you very much, it helps.