Closed avnish-wynk closed 1 year ago
All sparse features are handled in the same way and embedded into the same dimension by default.
This does not have to be the case, though. This and related questions are discussed in the following references:
- A. Ginart et al., "Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems"
- M. Naumov, "On the Dimensionality of Embeddings for Sparse Features and Data"
Will it not be redundant to encode boolean features into higher dimensions? Do we do it just to be able to calculate the pairwise feature interactions?
Can we even calculate pairwise feature interactions (dot products) on mixed-dimension embeddings?
There are multiple techniques and reasons for encoding boolean features. First, you can combine several of them and transform them into n-grams. Also, by encoding them you give them a meaning in the abstract embedding space and can later interact them with other features.
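As a rough illustration of the combining idea above (the feature names and sizes are made up for this sketch): several boolean features can be crossed into a single categorical feature whose vocabulary is the set of all their combinations, which then gets one embedding row instead of several tiny embeddings.

```python
import numpy as np

# Three hypothetical boolean features, e.g. is_mobile, is_weekend, is_returning.
bools = np.array([1, 0, 1])

# Cross them into one categorical index in [0, 2**3): treat the booleans
# as bits of a single integer.
index = int(sum(int(b) << i for i, b in enumerate(bools)))

# One embedding table for the combined feature (randomly initialized here;
# it would be learned in practice).
vocab_size, dim = 2 ** len(bools), 16
rng = np.random.default_rng(0)
table = rng.normal(size=(vocab_size, dim))

embedding = table[index]  # a single 16-dim vector for the crossed feature
```

The crossed feature can capture interactions among the booleans that separate per-boolean embeddings would have to learn through the interaction layer.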
Notice that you can always pass the mixed-dimension embeddings through an appropriately sized matrix multiplication and then interact the results using a dot product.
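A minimal sketch of that projection step (dimensions and names are illustrative, and the projection matrix would be learned rather than random): a low-dimensional embedding is mapped into the common dimension by a matrix multiply, after which pairwise dot products are well defined.

```python
import numpy as np

rng = np.random.default_rng(0)
d_common = 16

# A low-cardinality feature embedded in 4 dims, a high-cardinality one in 16.
e_small = rng.normal(size=4)
e_large = rng.normal(size=d_common)

# Per-feature projection into the common space (learned in practice).
W_small = rng.normal(size=(4, d_common))

projected = e_small @ W_small            # now a 16-dim vector
interaction = float(projected @ e_large)  # scalar pairwise interaction
```

This is how mixed-dimension schemes keep the dot-product interaction layer intact while letting each feature use an embedding size matched to its cardinality.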
Thanks for the information @mnaumovfb. Closing the issue.
How does DLRM handle boolean features or low-cardinality features?
Do we embed them to the same dimensionality as all other sparse features? But won't that be redundant, since we would be embedding features with low cardinality (e.g. 5) into a much larger space, say 16 dimensions?
I am assuming we need to embed all features to the same dimensionality to calculate the pairwise interactions. How should features with cardinality varying from 2 to 1M be handled?