BloodD closed this issue 5 years ago
tf-ranking has a loss named sigmoid_cross_entropy_loss: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/python/losses.py#L47. When this loss is used, tf-ranking reduces to standard pointwise regression: applying a sigmoid to the PREDICT output yields a CTR estimate. This is not the case for pairwise or listwise losses.
Yes. One of the advantages of neural networks is that they can handle high-dimensional sparse features. This is done by learning a dense representation (embedding) for each sparse feature.
TF-Ranking uses feature columns (see tf.feature_column) to represent features. Look at this unittest for an example that combines embedding columns and categorical columns to handle sparse data. Categorical columns take a vocabulary as input, or can alternatively use hash buckets to build an internal vocabulary.
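The two lookup strategies can be illustrated without TensorFlow. Below is a pure-Python sketch of what a vocabulary-based categorical column, a hash-bucket column, and an embedding column do conceptually (the real API functions are tf.feature_column.categorical_column_with_vocabulary_list, categorical_column_with_hash_bucket, and embedding_column; the vocabulary, bucket count, and dimension here are made up for illustration):

```python
import random

VOCAB = ["red", "green", "blue"]  # example vocabulary
NUM_HASH_BUCKETS = 8              # example bucket count
EMBEDDING_DIM = 4                 # example embedding dimension

random.seed(0)

def vocab_index(value, vocab=VOCAB):
    """Vocabulary lookup, like a categorical column with a vocabulary list.
    Out-of-vocabulary values map to a reserved index (here: len(vocab))."""
    return vocab.index(value) if value in vocab else len(vocab)

def hash_index(value, num_buckets=NUM_HASH_BUCKETS):
    """Hash-bucket lookup, like a hash-bucket categorical column:
    no explicit vocabulary is needed, but distinct values may collide."""
    return hash(value) % num_buckets

# One trainable embedding row per possible index (vocab entries + OOV slot).
embedding_table = [
    [random.uniform(-0.1, 0.1) for _ in range(EMBEDDING_DIM)]
    for _ in range(len(VOCAB) + 1)
]

def embed(value):
    """Dense representation of a sparse categorical value, like an
    embedding column stacked on top of a categorical column."""
    return embedding_table[vocab_index(value)]
```

In TensorFlow the embedding table is a trainable variable updated by backprop; here it is just random numbers standing in for learned weights.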
For very high-dimensional sparse data, it is common to prune the vocabulary down to the top N most frequently occurring feature values.
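Such frequency-based pruning is a one-liner with the standard library; a small sketch (the feature values here are invented for illustration):

```python
from collections import Counter

def prune_vocabulary(values, top_n):
    """Keep only the top_n most frequent feature values; everything else
    falls into the out-of-vocabulary slot (or a hash bucket) at lookup time."""
    counts = Counter(values)
    return [value for value, _ in counts.most_common(top_n)]

tokens = ["ads", "news", "ads", "video", "ads", "news", "mail"]
vocab = prune_vocabulary(tokens, top_n=2)  # ["ads", "news"]
```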