tensorflow / decision-forests

A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
Apache License 2.0

model.predict() method returning only 0 when unpickling models and using TFDF 1.2.0 #169

Open waral opened 1 year ago

waral commented 1 year ago

Hello,

I have the following problem: with the new version (1.2.0) of TensorFlow Decision Forests, if I pickle a trained model (gradient boosted trees), then load it and run the predict method, I always get 0.0, regardless of the input. Running the exact same code with an older version of the package (1.0.1) does not show the problem.

I also noticed that TFDF 1.0.1 gets installed with TensorFlow 2.10.1, while TFDF 1.2.0 gets installed with TensorFlow 2.11.1. I'm mentioning this in case the issue is more generally related to the TensorFlow version.
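
For anyone trying to reproduce this, it may help to record the exact package versions in each environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for illustration, and the distribution names passed to it are assumed to match the names used with pip.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Distribution names are assumptions based on the pip commands in this thread.
print(installed_versions(["tensorflow", "tensorflow_decision_forests"]))
```

Running this in both the 1.0.1 and the 1.2.0 environment gives a record of which TensorFlow version each one actually uses.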

Are you aware of this issue? Is pickling/unpickling supposed to work with TFDF models?

I understand that pickle is not the usual way to save TensorFlow models. However, I'm working on a project that uses several frameworks, so I need a general way of saving models regardless of the framework. That's why I'm exploring this.
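
One possible framework-agnostic alternative to pickling the model object directly is to save the model with its framework's native save function and carry the resulting directory around as a single bytes blob. The sketch below illustrates that idea with the standard library only. The names `model_to_bytes`, `model_from_bytes`, `save_fn` and `load_fn` are hypothetical, and the sketch assumes the loader reads everything it needs into memory before returning.

```python
import io
import os
import tempfile
import zipfile


def model_to_bytes(model, save_fn):
    """Save `model` via `save_fn(model, directory)` and return the directory as zip bytes."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        save_fn(model, tmp_dir)
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            for root, dirs, files in os.walk(tmp_dir):
                # Include directories too, so empty folders survive the round trip.
                for name in dirs + files:
                    full_path = os.path.join(root, name)
                    archive.write(full_path, os.path.relpath(full_path, tmp_dir))
        return buffer.getvalue()


def model_from_bytes(data, load_fn):
    """Unpack zip bytes into a temporary directory and return `load_fn(directory)`."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            archive.extractall(tmp_dir)
        # Assumption: load_fn no longer needs the files once it has returned.
        return load_fn(tmp_dir)
```

For a TF-DF model, `save_fn` could plausibly be `lambda m, d: m.save(d)` and `load_fn` could be `tf.keras.models.load_model`. That combination has not been tested against TF-DF 1.2.0 here. The returned bytes can be stored in the same place the project stores pickled models from other frameworks.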

Thank you so much and let me know if you have any additional questions.

Best, Michal

rstz commented 1 year ago

Hi Michal,

I'm not really familiar with how pickle works (in fact, I've never used it), but I'm happy to take a closer look if you can give me a minimal working example. So if this is my model:

!pip install tensorflow_decision_forests -U -qq
import tensorflow as tf
import tensorflow_decision_forests as tfdf
import pandas as pd

# Download the dataset, load it into a pandas dataframe and convert it to TensorFlow format.
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv
dataset_df = pd.read_csv("/tmp/penguins.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df, label="species")

# Create and train the model
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)

What steps do I need to do to pickle the model and unpickle it to get the predictions?

waral commented 1 year ago

Hi,

thanks so much for the quick reply!

The full minimal example looks like the following. I tested it locally and get the same problem: with TFDF 1.2.0 the output is an array of 0's, while with 1.0.1 the output seems correct. Note that in this specific example the shapes don't match either (it's multi-class classification, and I'm not sure whether TFDF works properly with multi-class anyway): there are three numbers per input when using 1.0.1 and only one number (0) when using 1.2.0:

!pip install tensorflow_decision_forests -U -qq
import tensorflow as tf
import tensorflow_decision_forests as tfdf
import pandas as pd

# Download the dataset, load it into a pandas dataframe and convert it to TensorFlow format.
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv
dataset_df = pd.read_csv("/tmp/penguins.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df, label="species")

# Create and train the model
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)

import pickle

with open("model.pkl", "wb") as model_file:
    pickle.dump(model, model_file)

with open("model.pkl", "rb") as file:
    model = pickle.load(file)

val_ds = tfdf.keras.pd_dataframe_to_tf_dataset(dataset_df.drop(columns=["species"]))

print(model.predict(val_ds))
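
To make the difference easier to report, the predictions from before and after the pickle round trip can be compared programmatically. This is a minimal sketch with a hypothetical helper, `compare_predictions`, that works on nested lists of floats; it assumes the prediction arrays are converted first, for example with NumPy's `.tolist()`.

```python
import math


def compare_predictions(before, after, abs_tol=1e-6):
    """Return human-readable mismatches between two nested lists of predictions."""
    problems = []
    if len(before) != len(after):
        problems.append(f"row count differs: {len(before)} vs {len(after)}")
        return problems
    for i, (row_a, row_b) in enumerate(zip(before, after)):
        if len(row_a) != len(row_b):
            # Mirrors the symptom above: three values per input vs. a single one.
            problems.append(f"row {i}: width differs: {len(row_a)} vs {len(row_b)}")
            continue
        for j, (a, b) in enumerate(zip(row_a, row_b)):
            if not math.isclose(a, b, abs_tol=abs_tol):
                problems.append(f"row {i}, column {j}: {a} vs {b}")
    return problems
```

With the example above, this would be called as `compare_predictions(preds_before.tolist(), preds_after.tolist())`, where `preds_before` is computed before `pickle.dump` and `preds_after` after `pickle.load`. An empty list means the round trip preserved the predictions.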

Best, Michal

rstz commented 1 year ago

Hi Michal,

thank you for the example. Pickling a model is not supported by TF-DF, so proceed at your own risk :) I'll open an internal bug to discuss this further, but the use case looks a bit niche to me (anyone reading this, feel free to express support for pickling via the emojis).

~On the positive side, I was able to get your example to work by calling model.compile() on the model before pickling - without looking into it much further, it seems like this solves the problem :)~ EDIT: This was unfortunately not a solution :(

TF-DF supports multi-class classification; if the model is loaded correctly, this shouldn't cause issues.

Best, Richard

waral commented 1 year ago

Hi Richard,

thanks for your help. Unfortunately, compiling the model before pickling doesn't resolve the issue on my side; I still get the same result (I just added model.compile() right before pickling). In any case, I'll be careful when saving/loading models like that. Please let me know if this gets resolved internally, though.

Thanks, Michal

Arnold1 commented 1 year ago

Hi, I also pickle TF-DF models, same as @waral, and would like to increase the priority of that internal ticket...