tensorflow / decision-forests

A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
Apache License 2.0

INVALID_ARGUMENT: No defined default loss for this combination of label type and task #100

Closed: AlirezaSadeghi closed this issue 2 years ago

AlirezaSadeghi commented 2 years ago

I'm trying to use GradientBoostedTreesModel in a TFX pipeline; the code is roughly as follows:

import tensorflow as tf
import tensorflow_decision_forests as tfdf

model = tfdf.keras.GradientBoostedTreesModel(
    task=tfdf.keras.Task.CLASSIFICATION,
    num_trees=200,
    max_depth=6,
    verbose=True,
    hyperparameter_template="better_default",
    name="classifier",
)
model.compile(metrics=[tf.keras.metrics.AUC(), "accuracy"])
model.fit(_input_fn(fn_args.train_files, fn_args.schema_path))

This unfortunately raises an INVALID_ARGUMENT: No defined default loss for this combination of label type and task exception, and model training fails.

Definition of _input_fn is as follows:

def _input_fn(files, schema_path):
    # The schema is parsed into feature specs (details elided in the original).
    specs = ...
    return (
        tf.data.TFRecordDataset(
            tf.data.Dataset.list_files(files), compression_type="GZIP"
        )
        .batch(1024)
        .map(
            lambda batch: tf.io.parse_example(batch, specs),
            num_parallel_calls=tf.data.AUTOTUNE,
        )
        .map(lambda batch: (batch, batch.pop(FeatureManager.LABEL_KEY)))
        .cache()
        .prefetch(tf.data.AUTOTUNE)
    )

It basically parses the schema into feature specs, parses each batch of TF examples, and finally maps them to a tuple of (Dict[feature_name, Tensor], Tensor). The resulting dataset looks like this:

<PrefetchDataset 
 element_spec=(
   {'feature1': TensorSpec(shape=(None, 1), dtype=tf.float32, name=None), 'feature2': ...}, 
   TensorSpec(shape=(None, 1), dtype=tf.int64, name=None)
  )
>

Labels can be 0 or 1 and the task is a binary classification task.

Any idea what I might be doing wrong here?

macOS Monterey, tfdf 0.2.4, Python 3.8, TFX 1.7

Cheril311 commented 2 years ago

@AlirezaSadeghi can you specify your label type?

AlirezaSadeghi commented 2 years ago

@Cheril311 If I'm understanding you correctly, I've already specified it in the original message: it's the 2nd entry in the PrefetchDataset tuple (namely TensorSpec(shape=(None, 1), dtype=tf.int64, name=None)).

It's an integer with values of either 0 or 1, but since we're reading it in batches, it has shape (None, 1).

So the dataset that's being passed to model.fit is a tuple of (Dict {feature name -> Tensor(None, 1)}, Label Tensor(None, 1)).

Did I answer your question? If not, please elaborate.

Cheril311 commented 2 years ago

@AlirezaSadeghi my bad

achoum commented 2 years ago

Hi AlirezaSadeghi,

If the loss argument of the gradient boosted trees model is not specified, it is selected automatically from the label type, label values, and task. The error you reported indicates that there is no loss matching your label.

Looking at your example, a likely situation is that your int64 label only contains zeros. Can you check it?
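
For example, something like this should print the label values that actually occur (assuming dataset is the tf.data dataset you pass to model.fit):

import numpy as np

# Assumes "dataset" yields (features_dict, label) batches, as in _input_fn.
label_values = np.unique(
    np.concatenate([labels.numpy() for _, labels in dataset])
)
print(label_values)  # binary classification needs both 0 and 1 here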

Alternatively, you can specify the loss to be "BINOMIAL_LOG_LIKELIHOOD", i.e. the binary classification loss.
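
A minimal sketch, reusing the constructor arguments from your snippet (only the loss line is new):

model = tfdf.keras.GradientBoostedTreesModel(
    task=tfdf.keras.Task.CLASSIFICATION,
    loss="BINOMIAL_LOG_LIKELIHOOD",  # explicit binary classification loss
    num_trees=200,
    max_depth=6,
    verbose=True,
    hyperparameter_template="better_default",
    name="classifier",
)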

On my side, I'll improve the error message for this particular situation.

AlirezaSadeghi commented 2 years ago

Hi @achoum,

Yup, your assumption is actually right: I'm just testing the pipeline and running the model on a part of the training set, which includes all zeros for starters. I didn't know that might become an issue.

I'll try with BINOMIAL_LOG_LIKELIHOOD and get back to you.

AlirezaSadeghi commented 2 years ago

Okay, doing that, it tells me this:

INVALID_ARGUMENT: Binomial log likelihood loss is only compatible with a BINARY classification task

It's somehow assuming the task is not "binary classification"?

AlirezaSadeghi commented 2 years ago

@achoum Just an FYI, have you seen my last comment? Wondering if you've got any further insights.

rstz commented 2 years ago

If your task is not a binary classification task, you can try setting the loss to MULTINOMIAL_LOG_LIKELIHOOD.
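
For example, a sketch reusing the constructor from the first message, with only the loss changed:

model = tfdf.keras.GradientBoostedTreesModel(
    task=tfdf.keras.Task.CLASSIFICATION,
    loss="MULTINOMIAL_LOG_LIKELIHOOD",  # multi-class classification loss
)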

AlirezaSadeghi commented 2 years ago

My task "is" binary classification, and the labels are all 0s; I don't know how it's deciding the task is not binary classification (as I've already mentioned before).

rstz commented 2 years ago

Oh, apologies, I overlooked that part in your first message

AlirezaSadeghi commented 2 years ago

@achoum No new updates/insights on this? 😔

achoum commented 2 years ago

If your labels are all 0, the framework detects that this is not binary classification and fails. If you want to test binary classification, can you create a synthetic dataset with both 0s and 1s?

While training on a dataset where all the labels have the same value could make sense for unit testing, this error/failure helps to catch errors in datasets.
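
For example, a minimal synthetic dataset along these lines (the feature name and sizes are made up):

import numpy as np
import tensorflow as tf
import tensorflow_decision_forests as tfdf

# Hypothetical smoke-test data: random features, labels covering both classes.
n = 1024
features = {"feature1": np.random.rand(n, 1).astype(np.float32)}
labels = (np.arange(n) % 2).reshape(-1, 1).astype(np.int64)  # alternating 0s and 1s

dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(256)

model = tfdf.keras.GradientBoostedTreesModel(task=tfdf.keras.Task.CLASSIFICATION)
model.fit(dataset)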