keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

FeatureSpace multiple output from one input #19697

Open zippeurfou opened 3 weeks ago

zippeurfou commented 3 weeks ago

This is more of a request to add something to a tutorial than a feature request, as I believe this might already be doable today. The FeatureSpace tutorials assume that you create one output per feature. I am basing this on this tutorial, where the assumption seems to be that from one input you create one output. For example:

feature_space = FeatureSpace(
    features={
        ...
        # Numerical features to normalize and bin
        "age": FeatureSpace.float_discretized(num_bins=4),
        ...
    },
    ...
    output_mode="concat",
)

In this example there is an assumption that you only create one float_discretized output for the input age. However, in practice (e.g. the YouTube recommendations paper) multiple outputs can be created from one input.

[Screenshot from the YouTube recommendations paper showing several transformed outputs derived from a single input feature]

It would be nice to add an example of how to do this to the tutorial or somewhere in the docs. I have tried to replicate the paper's handling of numerical features using FeatureSpace and found it difficult without doing the extra transformation in the model itself, which I think defeats the original purpose of this functionality. Please also note that not all preprocessing requires an adapt method in practice. For example, in Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees?, the authors encourage the transformation x = log(1 + |x|) * sign(x), which does not need an adapt step to implement. I think it would also be beneficial to consider opening FeatureSpace up to more than preprocessing-layer implementations, but that is a different topic and I can open a separate ticket for it if that is better.
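
As an aside, a minimal sketch of that stateless transform as a Keras layer (assuming Keras 3 and keras.ops; the layer name is just illustrative) could be:

import keras
from keras import layers

# Stateless signed-log transform: log(1 + |x|) * sign(x); no adapt() needed
signed_log = layers.Lambda(
    lambda x: keras.ops.log1p(keras.ops.abs(x)) * keras.ops.sign(x),
    name="signed_log1p",
)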

fchollet commented 3 weeks ago

This does not appear to require any extra features:

  1. Use FeatureSpace to get x.
  2. Call keras.ops.square(x) and keras.ops.sqrt(x) to get your other features.

I found it difficult to do so without doing the extra transformation in the model itself

You can do the above either inside the model or in a data pipeline. Inside the model, it would look like this:

# Retrieve a dict of Keras Input objects
inputs = feature_space.get_inputs()
# Retrieve the corresponding encoded Keras tensors
encoded_features = feature_space.get_encoded_features()
# Derive extra features from one encoded feature
x = encoded_features["x"]
x_sqrt = keras.ops.sqrt(x)
x_square = keras.ops.square(x)
output = ...
model = keras.Model(inputs, output)
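
A sketch of the data-pipeline alternative (assuming a tf.data.Dataset of feature dicts; the "x" column and derived key names are illustrative) could look like:

import tensorflow as tf

def add_derived_features(features, label):
    # Add squared and sqrt versions of a numeric column before preprocessing;
    # the new keys then need their own entries in the FeatureSpace features dict
    features = dict(features)
    features["x_square"] = tf.square(features["x"])
    features["x_sqrt"] = tf.sqrt(features["x"])
    return features, label

ds = ds.map(add_derived_features)
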
zippeurfou commented 3 weeks ago

Thank you @fchollet for the quick answer. The method you mentioned is how I already do it today. I was hoping to be able to do it as part of the FeatureSpace creation (maybe wrongly?), just because it felt cleaner in terms of code organization and still part of the feature creation to me. It also raises the question of combining transformations within the FeatureSpace (e.g. Normalization followed by x²). So I guess the question is whether it would make sense to extend FeatureSpace to allow more flexibility than a single preprocessing transformation. In practice, if you send it a preprocessor that does Normalization and then x², it would apply adapt to the Normalization and then compute x². In pseudocode it could look as follows:

custom_layer = keras.Sequential([keras.layers.Normalization(), keras.layers.Lambda(lambda x: x ** 2)])
feature_space = FeatureSpace(
    features={
        ...
        # Numerical features to normalize and bin
        "age":  FeatureSpace.feature(
            preprocessor=custom_layer, dtype="float", output_mode="float"
        ),
        ...
    },
    ...
    output_mode="concat",
)

This could be extended to more advanced transformations with "multiple" outputs from one input, e.g.:

import tensorflow as tf
from keras import layers

class DiscretizeAndParallelLambda(layers.Layer):
    def __init__(self, num_bins, **kwargs):
        super().__init__(**kwargs)
        self.num_bins = num_bins

    def build(self, input_shape):
        # Discretization takes bin_boundaries as its first argument, so pass num_bins by keyword
        self.discretize = layers.Discretization(num_bins=self.num_bins, name="discretize")
        self.lambda_square = layers.Lambda(lambda x: tf.math.square(x), name="square")
        self.lambda_sqrt = layers.Lambda(lambda x: tf.math.sqrt(x), name="sqrt")
        self.concat = layers.Concatenate(name="concat")

    def call(self, inputs):
        # Cast the integer bin indices to float so square/sqrt are valid
        discretized = tf.cast(self.discretize(inputs), "float32")
        squared = self.lambda_square(discretized)
        sqrt = self.lambda_sqrt(discretized)
        return self.concat([squared, sqrt])

    def get_config(self):
        config = super().get_config()
        ...
        return config

feature_space = FeatureSpace(
    features={
        ...
        # Numerical features to normalize and bin
        "age":  FeatureSpace.feature(
            preprocessor=DiscretizeAndParallelLambda(...), dtype="float", output_mode="float"
        ),
        ...
    },
    ...
    output_mode="concat",
)

In these examples the first layer does require adapt, but with the log1p example it does not have to. Then, when you call adapt on the feature space, "behind the scenes" it would look for preprocessing layers and adapt them when the layer implements the preprocessing interface. Edit: Looking at the source code, this might already work out of the box, since FeatureSpace checks whether adapt exists. So in my previous layer I might just be able to do:

    def adapt(self, data):
        self.discretize.adapt(data)

and since it checks whether adapt exists before calling it, the log1p case would work out of the box. So maybe this just needs an example in the docs?
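
For completeness, a minimal sketch of wiring the layer above (with this adapt method added) into a FeatureSpace (the feature name, num_bins, and train_ds are illustrative, and this has not been validated against the FeatureSpace internals) might look like:

from keras.utils import FeatureSpace

feature_space = FeatureSpace(
    features={
        # one input ("age"), several outputs produced by the custom preprocessor
        "age": FeatureSpace.feature(
            preprocessor=DiscretizeAndParallelLambda(num_bins=4),
            dtype="float32",
            output_mode="float",
        ),
    },
    output_mode="concat",
)
# If FeatureSpace only calls adapt() on preprocessors that define it, the
# Discretization sub-layer gets adapted here, while stateless transforms
# (e.g. the log1p variant) are simply applied as-is.
feature_space.adapt(train_ds)  # train_ds: a tf.data.Dataset of feature dicts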

zippeurfou commented 2 weeks ago

Another, more trivial, example: say you want to transform one feature (e.g. age) into:

  1. A discretized version that, once discretized, is crossed with another feature.
  2. A normalized version that is not crossed.

So in this scenario, from one input you create two preprocessed features, one of which is used in a feature cross. I don't think the current architecture allows you to do this, but maybe I am missing something.
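
For reference, the closest workaround I can think of (column names and bin count are illustrative, and this sidesteps rather than solves the one-input case) is to duplicate the raw column upstream so that FeatureSpace sees two separate inputs, one of which is then crossed:

import tensorflow as tf
from keras.utils import FeatureSpace

# Duplicate the raw "age" column so it can be preprocessed in two different ways
train_ds = train_ds.map(
    lambda features, label: ({**features, "age_binned": features["age"]}, label)
)

feature_space = FeatureSpace(
    features={
        "age": FeatureSpace.float_normalized(),  # normalized version, not crossed
        "age_binned": FeatureSpace.float_discretized(num_bins=4),
        "occupation": FeatureSpace.string_categorical(),
    },
    crosses=[
        FeatureSpace.cross(feature_names=("age_binned", "occupation"), crossing_dim=32)
    ],
    output_mode="concat",
)
# adapt() expects a dataset of feature dicts without labels
feature_space.adapt(train_ds.map(lambda features, label: features))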