tensorflow / decision-forests

A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
Apache License 2.0

Getting value error for model.save() #11

Closed sibyjackgrove closed 3 years ago

sibyjackgrove commented 3 years ago

I trained a model successfully. I was also able to use model.evaluate, model.summary, and tfdf.model_plotter.plot_model_in_colab(model, tree_idx=0, max_depth=4). But when I tried to save it using: model.save("hypermodels/model")

I am getting the following error:

ValueError: Got non-flat/non-unique argument names for SavedModel signature 'serving_default': more than one argument to '__inference_signature_wrapper_12650' was named 'build_existing_model.geometry_foundation_type_Heated Basement'. Signatures have one Tensor per named input, so to have predictable names Python functions used to generate these signatures should avoid *args and Tensors in nested structures unless unique names are specified for each. Use tf.TensorSpec(..., name=...) to provide a name for a Tensor input.

abeusher commented 3 years ago

I'm experiencing the same error when I make the call to model.save()

abeusher commented 3 years ago

Hey @sibyjackgrove

I found this related issue: https://github.com/tensorflow/tensorflow/issues/44984

After looking at my input data (train and test) I found that one of the columns was missing a header name. It was a column with an observation ID that uniquely identified each item in my input data.

After updating my CSV files by adding the column header 'id' (no quotes in file) to those two columns, the decision-forests library worked correctly.

I recommend you double-check your input data and ensure that (1) all of the columns have a name in the header, and (2) none of the header names have a space or tab embedded in them.
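
The two checks above can be sketched with the standard library alone. This is only an illustration, not part of TF-DF: the helper name find_bad_headers and the sample CSV content are hypothetical, and it only looks for the two problems named here (a missing header name, or a space or tab in a name).

```python
import csv
import io


def find_bad_headers(csv_file):
    """Return (index, name, reason) for each suspicious header in a CSV file object.

    Hypothetical helper: flags headers that are empty, or that contain
    a space or a tab character.
    """
    header = next(csv.reader(csv_file))
    problems = []
    for idx, name in enumerate(header):
        if name.strip() == "":
            problems.append((idx, name, "missing name"))
        elif " " in name or "\t" in name:
            problems.append((idx, name, "contains space or tab"))
    return problems


# Made-up sample: the first column has no header, the third contains a space.
sample = io.StringIO(",age,home type,label\n0,31,Heated Basement,1\n")
for idx, name, reason in find_bad_headers(sample):
    print(f"column {idx} ({name!r}): {reason}")
```

For a real file, the same function could be called on open("train.csv", newline=""), where train.csv stands in for your own path.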

abeusher commented 3 years ago

@googlebot I recommend you update the exception handling in decision-forests to have a more intuitive error message to the engineer who experiences this issue in the future.

sibyjackgrove commented 3 years ago

@abeusher Thanks for the tip. I removed the spaces from the column names and tried again, but I got the same error. I then removed other special characters such as '.', ',', and '%' from the column names. Now it is saving.
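
For anyone who wants to automate this kind of cleanup, here is a minimal sketch. It assumes, conservatively, that only letters, digits and underscores are safe; the thread only confirms that spaces, '.' and '%' caused trouble, so the allowed set is a guess rather than the library's actual rule. The helper name sanitize_feature_names is hypothetical.

```python
import re


def sanitize_feature_names(names):
    """Map each original name to a cleaned, unique name.

    Assumption: every run of characters other than letters, digits and
    underscores is replaced by a single underscore. Collisions between
    cleaned names are resolved with a numeric suffix.
    """
    mapping = {}
    used = set()
    for name in names:
        clean = re.sub(r"[^0-9A-Za-z_]+", "_", name).strip("_") or "feature"
        candidate = clean
        suffix = 1
        while candidate in used:
            suffix += 1
            candidate = f"{clean}_{suffix}"
        used.add(candidate)
        mapping[name] = candidate
    return mapping


columns = [
    "build_existing_model.geometry_foundation_type_Heated Basement",
    "savings %",
]
for old, new in sanitize_feature_names(columns).items():
    print(f"{old!r} -> {new!r}")
```

With a pandas DataFrame, the resulting dictionary can be passed to df.rename(columns=mapping) before building the dataset. Keeping the mapping around makes it possible to translate feature names back when reading the model summary.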

achoum commented 3 years ago

Thanks both for the report and the debugging! :)

It seems the default signature created by the Keras model serialization does not support all characters in feature names.

Starting with version 0.1.6, TF-DF will raise an explicit error message if a feature name contains one of those characters. The error can be turned into a warning for users who write custom export signatures. I'll also look into an alternative solution that does not forbid any character in the feature name.

New error message

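import pandas as pd
import tensorflow_decision_forests as tfdf

# The feature name "f 1" contains a space, which triggers the new error.
train_df = pd.DataFrame({"label":[0,1,0,1],"f 1":[0,1,0,1]})
model = tfdf.keras.RandomForestModel()
model.fit(tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="label"))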

ValueError: One or more feature names are not compatible with the Keras API: The feature name "f 1" contains a space or a tab character. This problem can be solved in one of two ways: (1; Recommended) Update the feature name(s) to be compatible. (2) Disable this error message (fail_on_non_keras_compatible_feature_name=False) and only use part of the compatible Keras API.

sibyjackgrove commented 3 years ago

Thanks for the update. I was able to make model.save work by removing spaces as well as characters like '%' and '.' from the column names.