hexiangnan / neural_collaborative_filtering

Neural Collaborative Filtering
Apache License 2.0

One hot encode vs embedding #37

Closed: tmacraft closed this issue 5 years ago

tmacraft commented 5 years ago

I noticed that in the paper the input is described as one-hot encoded user / item vectors, which are fed to a fully connected layer to obtain the user / item latent vectors. But in your Keras code you apply an Embedding layer directly to the user / item IDs to get their latent vectors. Can you please explain what the difference is? Thanks in advance!

cakirmuha commented 5 years ago

If you look at how the Embedding layer is used in the Keras code (https://github.com/hexiangnan/neural_collaborative_filtering/blob/4aab159e81c44b062c091bdaed0ab54ac632371f/MLP.py#L66), you will see three arguments: input_dim, output_dim, and input_length. For example, suppose you have 5 users in total, your current userID is 3, and the embedding size is 2. Then input_dim=5, output_dim=2, and input_length=1 (the userID is the only feature, so the input shape is (None, 1)). Conceptually, the one-hot vector [0 0 0 1 0] (IDs are 0-indexed) is the input to the Embedding layer, and the output is something like [0.1231 -0.12414]. In other words, the Embedding layer combines the one-hot encoding (of size input_dim) and the fully connected layer in one step: it looks up row 3 of its weight matrix, which gives the same result as multiplying the one-hot vector by that matrix.
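To make the equivalence concrete, here is a minimal sketch in plain Python (no Keras) using the same toy numbers: 5 users, latent size 2, userID 3. The helper names `one_hot`, `dense_no_bias`, and `embedding_lookup` are made up for illustration and are not part of the repository or of Keras; the weights are random placeholders, and the dense layer is assumed to have no bias and no activation.

```python
import random


def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dense_no_bias(vector, weights):
    """Multiply a row vector of length n by an n x d weight matrix."""
    n_out = len(weights[0])
    return [
        sum(v * row[j] for v, row in zip(vector, weights))
        for j in range(n_out)
    ]


def embedding_lookup(index, weights):
    """Return row `index` of the weight matrix, as an embedding layer would."""
    return list(weights[index])


random.seed(0)
num_users, latent_dim = 5, 2  # plays the role of input_dim=5, output_dim=2

# One shared weight matrix of shape (num_users, latent_dim); values are arbitrary.
weights = [
    [random.uniform(-0.5, 0.5) for _ in range(latent_dim)]
    for _ in range(num_users)
]

user_id = 3  # 0-indexed, so the one-hot vector is [0, 0, 0, 1, 0]

via_dense = dense_no_bias(one_hot(user_id, num_users), weights)
via_lookup = embedding_lookup(user_id, weights)

print("one-hot + dense :", via_dense)
print("embedding lookup:", via_lookup)
```

Both printed vectors should match, because multiplying by a one-hot vector just selects one row of the matrix. The lookup avoids building the one-hot vector and doing the full multiplication, which matters when the number of users or items is large.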

tmacraft commented 5 years ago

Thanks for clarifying! Closing this issue now.