rstudio / tfdatasets

R interface to TensorFlow Datasets API
https://tensorflow.rstudio.com/tools/tfdatasets/

Access normalizer_fn #80

Open sgvignali opened 3 years ago

sgvignali commented 3 years ago

Is there a way to access the values learned by a fitted feature specification? For example, the mean and standard deviation learned by scaler_standard(), or the vocabulary values of categorical variables?

Thanks, Sergio

sgvignali commented 3 years ago

Replying to myself: assuming the first step is a step_numeric_column, the normalizer function can be accessed like this:

spec$steps[[1]]$normalizer_fn

and the mean and sd as follows:

as.list(environment(spec$steps[[1]]$normalizer_fn))
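To illustrate why this works, here is a minimal base R sketch, not the tfdatasets implementation. The helper `make_standard_scaler()` and the variable names `mean` and `sd` are hypothetical stand-ins. It only shows that values captured by a closure can be listed through the function's enclosing environment:

```r
# Toy stand-in for a fitted standard scaler (hypothetical, base R only;
# this is NOT how tfdatasets defines its scalers).
make_standard_scaler <- function(x) {
  mean <- base::mean(x)
  sd <- stats::sd(x)
  # The returned function closes over `mean` and `sd`.
  function(value) (value - mean) / sd
}

normalizer_fn <- make_standard_scaler(c(2, 4, 6))

# List everything stored in the closure's enclosing environment,
# then pick out the learned statistics.
learned <- as.list(environment(normalizer_fn))
learned$mean
learned$sd
```

In this toy version the environment also contains the input `x`, so the list may hold more than the two statistics.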

The problem is that the mean and the sd are defined in the StandardScaler object, which is not stored in the StepNumericColumn (only the function is stored after calling the fit_resume method).

With this setup, the object cannot be saved: in a new R session the pointer to the normalizer function is lost. It is important to be able to save the specification, so that the object does not have to be re-fitted.

The same applies to the scaler_min_max() function. A possible solution might be to store the whole StandardScaler object in the step and call its function only in the feature method.
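A rough sketch of that idea in base R, under the assumption that the learned statistics can be kept as plain numbers. The helpers `fit_standard_scaler()` and `as_normalizer_fn()` are hypothetical and are not part of tfdatasets. The statistics are stored as data, and the function is rebuilt only when it is needed:

```r
# Hypothetical sketch: keep the learned statistics as plain data
# instead of storing only a function.
fit_standard_scaler <- function(x) {
  list(mean = mean(x), sd = stats::sd(x))
}

# Rebuild the normalizer from the stored statistics whenever it is needed.
as_normalizer_fn <- function(scaler) {
  function(value) (value - scaler$mean) / scaler$sd
}

scaler <- fit_standard_scaler(c(2, 4, 6))

# A plain list of numbers round-trips through saveRDS()/readRDS().
path <- tempfile(fileext = ".rds")
saveRDS(scaler, path)
restored <- readRDS(path)

# In a new session, the function can be recreated from the restored values.
normalizer_fn <- as_normalizer_fn(restored)
normalizer_fn(6)
```

With this layout, the saved file holds only numbers, so nothing in it depends on objects that exist only in the original session.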

Could someone please have a look at it?