google / yggdrasil-decision-forests

A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
https://ydf.readthedocs.io/
Apache License 2.0

AttributeError: 'RandomForestLearner' object has no attribute 'to_tensorflow_saved_model' #135

Closed gmrparcerao closed 1 month ago

gmrparcerao commented 1 month ago

I've trained a model using RandomForestLearner and now need to save it as a TensorFlow SavedModel so that I can convert it to a TFLite model. When I try to do this, the code raises an AttributeError: 'RandomForestLearner' object has no attribute 'to_tensorflow_saved_model'. I checked the YDF documentation and various forums, and YDF is supposed to have this attribute. Before starting, I installed YDF using the command:

pip install ydf -U

I'm running the code in the cloud, using a Jupyter Notebook environment with Python, via Kaggle Notebooks. Here are the machine specifications according to Kaggle documentation:

Here's the code block that trained the model, which executed without errors:

rf_model = ydf.RandomForestLearner(label='class')
rf_eval = rf_model.cross_validation(df_fft_general)
rf_eval

Here's the code that's supposed to save the model as a TensorFlow saved model and convert it to a TFLite model next, which is generating the error:

rf_tf_model = rf_model.to_tensorflow_saved_model("/kaggle/working/rf_model")
converter = tf.lite.TFLiteConverter(rf_tf_model)
rf_tflite_model = converter.convert()
open("kaggle/working/rf_model.tflite", "wb").write(rf_tflite_model)
!cd /kaggle/working && xxd -i rf_model.tflite > rf_model.cc

Here's the error:


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[33], line 1
----> 1 rf_tf_model = rf_model.to_tensorflow_saved_model("/kaggle/working/rf_model")
      2 converter = tf.lite.TFLiteConverter(rf_tf_model)
      3 rf_tflite_model = converter.convert()

AttributeError: 'RandomForestLearner' object has no attribute 'to_tensorflow_saved_model'

YDF version: 0.8.0
TensorFlow version: 2.15.0

rstz commented 1 month ago

YDF distinguishes between a Learner and a Model: Fundamentally, a Learner is an object that takes a set of hyperparameters (in the constructor) and a dataset (in the train() method) to produce a Model. A Model takes a dataset (in the predict() method) and computes a prediction for every row of the dataset. You can also do other things with Learner and Model (e.g. cross_validation() of a learner or evaluate() on a Model), but this is the basic idea.

cross_validation() is a method of a Learner that returns an evaluation report by training and evaluating many models on a dataset according to the cross-validation technique. The evaluation report informs you about the quality of a model trained with this learner. Importantly, cross_validation() does not return a model and it does not modify the Learner in any way. This is why to_tensorflow_saved_model() does not exist on the learner: there's simply no model yet.
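To make the distinction concrete, here is a minimal toy sketch in plain Python. It does not use YDF: MajorityLearner, MajorityModel, and export() are hypothetical names invented for this illustration, and the "training" is just picking the most frequent label. It only mirrors the pattern described above, where cross_validation() returns a score and leaves the learner unchanged, while train() returns a separate model object that carries the export method.

```python
# Toy illustration only: hypothetical stand-ins, not YDF code.
from collections import Counter


class MajorityModel:
    """A trained artifact: it can predict and be exported."""

    def __init__(self, majority_label):
        self.majority_label = majority_label

    def predict(self, rows):
        # Always predicts the label seen most often during training.
        return [self.majority_label for _ in rows]

    def export(self):
        # Stand-in for an export method that exists only on models.
        return {"majority_label": self.majority_label}


class MajorityLearner:
    """Holds hyperparameters only; train() returns a new model."""

    def __init__(self, label):
        self.label = label

    def train(self, rows):
        counts = Counter(row[self.label] for row in rows)
        return MajorityModel(counts.most_common(1)[0][0])

    def cross_validation(self, rows, folds=2):
        """Trains throwaway models and returns only an accuracy score."""
        correct = 0
        for i in range(folds):
            test = rows[i::folds]
            train = [r for j, r in enumerate(rows) if j % folds != i]
            model = self.train(train)  # discarded after this fold
            preds = model.predict(test)
            correct += sum(p == r[self.label] for p, r in zip(preds, test))
        return correct / len(rows)


rows = [{"x": i, "class": c} for i, c in enumerate("aabaab")]

learner = MajorityLearner(label="class")
accuracy = learner.cross_validation(rows)        # a number, not a model
learner_can_export = hasattr(learner, "export")  # False: no model yet

model = learner.train(rows)                      # train() returns the model
model_can_export = hasattr(model, "export")      # True
```

Calling learner.export() in this sketch would raise an AttributeError of the same shape as the one in the question, because only the object returned by train() has that method.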

If you want to train a model with your RandomForestLearner and export it to TensorFlow, call

rf_learner = ydf.RandomForestLearner(label='class')
rf_model = rf_learner.train(df_fft_general)
rf_model.to_tensorflow_saved_model("/kaggle/working/rf_model")

This will store a Keras 2 model at /kaggle/working/rf_model. Note that the function returns None. In order to use the TensorFlow model, you have to load it from disk.

===================

From the rest of your code, it looks like you want to export this model to the TFLite format. Unfortunately, this will not work with Random Forest models - the TFLiteConverter is missing the required "op" to deal with these models.

For GradientBoostedTreesModels, there is a workaround, if you're happy to play around a bit: You can export the model to a pure JAX function that uses the subset of ops supported by TFLite. This model can then be converted to a TFLite model:

import tensorflow as tf
import ydf
from jax.experimental import jax2tf

# Train a Gradient Boosted Trees model
gbt_learner = ydf.GradientBoostedTreesLearner(label='class')
gbt_model = gbt_learner.train(df_fft_general)

# Convert the model to a JAX function restricted to TFLite-compatible ops
jax_model = gbt_model.to_jax_function(compatibility="TFL")

# Convert the JAX model to a TensorFlow model
tf_model = tf.Module()
tf_model.predict = tf.function(
    jax2tf.convert(jax_model.predict, with_gradient=False),
    jit_compile=True,
    autograph=False,
)

# Convert the TensorFlow model to a TFLite model.
# One example row (without the label) is encoded to derive the input signature.
# The training DataFrame is reused here; any DataFrame with the same feature columns works.
selected_examples = df_fft_general[:1].drop(gbt_model.label(), axis=1)
input_values = jax_model.encoder(selected_examples)
tf_input_specs = {
    k: tf.TensorSpec.from_tensor(tf.constant(v), name=k)
    for k, v in input_values.items()
}
concrete_predict = tf_model.predict.get_concrete_function(tf_input_specs)
converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [concrete_predict], tf_model
)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS,  # enable TensorFlow ops.
]
tflite_model = converter.convert()

We are exploring