t-kalinowski / deep-learning-with-R-2nd-edition-code

Code from the book "Deep Learning with R, 2nd Edition"
https://blogs.rstudio.com/ai/posts/2022-05-31-deep-learning-with-r-2e/

"object 'optimizer' not found" error when fit() custom model #6

Open ggeeoorrgg opened 1 year ago

ggeeoorrgg commented 1 year ago

Hi! It looks like compile() ignores the optimizer argument when compiling and training a custom model. When I run this code: model %>% compile(optimizer = optimizer_rmsprop()) (766th row in the book's code), it fails with an error: "Error in py_call_impl(callable, call_args$unnamed, call_args$named) : RuntimeError: in user code: .... RuntimeError: object 'optimizer' not found". Instead of using the passed argument, it looks up a variable named optimizer in the parent (global) environment. In other words, I have to define optimizer <- optimizer_rmsprop() in advance, and then the model trains as it should. Is this OK? Any thoughts?
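For context, the lookup behaviour described above is consistent with ordinary R scoping: a function body that names optimizer without receiving or defining it resolves the name through its enclosing environments when it is called. The following is a minimal base-R sketch of that difference, not the keras internals; step_free and new_toy_model are hypothetical stand-ins invented for illustration.

```r
# Hypothetical stand-in for a train_step() whose body names `optimizer`
# without taking it as an argument or defining it locally.
step_free <- function() {
  optimizer$name  # free variable: R searches the enclosing environments
}

# With no `optimizer` defined anywhere on the search path, the call errors.
# The error message is captured as a string instead of stopping the script.
res_free <- tryCatch(step_free(), error = function(e) conditionMessage(e))

# Hypothetical stand-in for a model object that keeps what compile() receives.
new_toy_model <- function() {
  self <- new.env()
  self$compile <- function(optimizer) {
    self$optimizer <- optimizer  # store the argument on the object itself
    invisible(self)
  }
  self$step <- function() self$optimizer$name  # no free variable involved
  self
}

toy <- new_toy_model()
toy$compile(optimizer = list(name = "rmsprop"))
res_self <- toy$step()

print(res_free)  # the captured "not found" error message
print(res_self)  # "rmsprop"
```

Storing the argument on the object means the step no longer depends on what happens to exist in the global environment.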

t-kalinowski commented 1 year ago

Hmm, that sounds like a bug. I will investigate. Thanks for reporting!

t-kalinowski commented 1 year ago

766th row in the book's code

Which chapter?

ggeeoorrgg commented 1 year ago

Which chapter?

ch07

t-kalinowski commented 1 year ago

It looks like, starting with TensorFlow 2.11, custom train_step() methods are compiled differently, such that the optimizer must know in advance which variables it will be modifying. Essentially, optimizer$build(model$variables) needs to be called before fit() is called for the first time.

We will need to investigate further (and update the corresponding guides on tensorflow.rstudio.com).

Updating the CustomModel definition on line 736 of chapter 7 to this works:

## -------------------------------------------------------------------------
library(tensorflow)  # provides `tf`, used for tf$GradientTape() below
library(keras)
loss_fn <- loss_sparse_categorical_crossentropy()
loss_tracker <- metric_mean(name = "loss")

CustomModel <- new_model_class(
  classname = "CustomModel",

  compile = function(optimizer, loss_fn, ...) {
    super$compile(...)
    optimizer$build(self$variables)
    self$optimizer <- optimizer
    self$loss_fn <- loss_fn
  },

  train_step = function(data) {
    c(inputs, targets) %<-% data
    with(tf$GradientTape() %as% tape, {
      predictions <- self(inputs, training = TRUE)
      loss <- self$loss_fn(targets, predictions)
    })
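    # Use self rather than a global `model`, so the method does not depend
    # on an object defined in the calling environment.
    gradients <- tape$gradient(loss, self$trainable_weights)
    self$optimizer$apply_gradients(zip_lists(gradients, self$trainable_weights))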

    loss_tracker$update_state(loss)
    list(loss = loss_tracker$result())
  },

  metrics = mark_active(function() list(loss_tracker))
)

## -------------------------------------------------------------------------
inputs <- layer_input(shape=c(28 * 28))
features <- inputs %>%
  layer_dense(512, activation="relu") %>%
  layer_dropout(0.5)
outputs <- features %>%
  layer_dense(10, activation="softmax")

model <- CustomModel(inputs = inputs, outputs = outputs)

model %>% compile(optimizer = optimizer_rmsprop(), loss = loss_fn)
model %>% fit(train_images, train_labels, epochs = 3)