I guess we could try that too.
The intuition here is that you want to assign a vector to each movie (the embedding), and we do this by using that embedding to predict something (the links). Movies with similar links end up with similar vectors, and that's how the embedding should work.
Concatenating would probably create a better model to predict the link, but it might not create as good an embedding.
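For reference, a minimal sketch of the dot-product setup being discussed (not the notebook's exact code; the vocabulary sizes and the 50-dimensional embeddings are placeholder assumptions, chosen so that concatenating the two vectors would give the `[None, 100]` shape mentioned in the question below):

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, Dot, Reshape
from tensorflow.keras.models import Model

n_movies, n_links, embedding_size = 10000, 50000, 50  # placeholder sizes

movie = Input(shape=(1,), name='movie')
link = Input(shape=(1,), name='link')
movie_emb = Embedding(n_movies, embedding_size, name='movie_embedding')(movie)
link_emb = Embedding(n_links, embedding_size, name='link_embedding')(link)

# Cosine similarity between the movie and link vectors; regression targets are 1 / -1
similarity = Reshape((1,))(Dot(axes=2, normalize=True)([movie_emb, link_emb]))

model = Model(inputs=[movie, link], outputs=similarity)
model.compile(optimizer='adam', loss='mse')

# After training, the movie embedding matrix is what we actually keep:
# normalize the rows and compare movies by cosine similarity.
movie_vectors = model.get_layer('movie_embedding').get_weights()[0]
movie_vectors /= np.linalg.norm(movie_vectors, axis=1, keepdims=True)
```

The point of the reply above is that the `Dot` layer forces all of the predictive work into those two embedding matrices, whereas a dense head on top of a concatenation can learn part of the mapping itself and may leave the embeddings themselves weaker.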
In notebook 4.2 Build a Recommender System, the model is trained using `mse` as the loss function, treating the problem as regression with positive labels 1 and negative labels -1. Would labeling the negative examples 0 and training with a `binary_crossentropy` loss function (treating the problem as classification) be a valid approach?
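Concretely, the change I have in mind is just a sigmoid output, 0/1 labels, and the `binary_crossentropy` loss (a sketch only; the vocabulary sizes and embedding size are placeholders, not the notebook's values):

```python
from tensorflow.keras.layers import Input, Embedding, Dot, Reshape, Activation
from tensorflow.keras.models import Model

n_movies, n_links, embedding_size = 10000, 50000, 50  # placeholders

movie = Input(shape=(1,), name='movie')
link = Input(shape=(1,), name='link')
movie_emb = Embedding(n_movies, embedding_size, name='movie_embedding')(movie)
link_emb = Embedding(n_links, embedding_size, name='link_embedding')(link)

# Same dot product, but treated as a logit: squash to (0, 1) and train on 0/1 labels
logit = Reshape((1,))(Dot(axes=2)([movie_emb, link_emb]))
prob = Activation('sigmoid')(logit)

model = Model(inputs=[movie, link], outputs=prob)
model.compile(optimizer='adam', loss='binary_crossentropy')
```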
Also, do we have to merge the "link" and "movie" embeddings using a dot product (the `Dot` layer)? Could we instead concatenate the embeddings, so the shape would be `[None, 100]`, and then add an additional fully connected layer to make a prediction? Something like the sketch below.

I understand that we're trying to create embeddings for movies so that "similar" movies have closer embeddings in terms of cosine distance. Embeddings make sense, but I guess my question is: are there different ways to train the network to produce the embeddings?
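Here is roughly what I mean by the concatenation version (again just a sketch; the hidden layer size and vocabulary sizes are placeholders I made up):

```python
from tensorflow.keras.layers import Input, Embedding, Reshape, Concatenate, Dense
from tensorflow.keras.models import Model

n_movies, n_links, embedding_size = 10000, 50000, 50  # placeholders

movie = Input(shape=(1,), name='movie')
link = Input(shape=(1,), name='link')
movie_emb = Reshape((embedding_size,))(
    Embedding(n_movies, embedding_size, name='movie_embedding')(movie))
link_emb = Reshape((embedding_size,))(
    Embedding(n_links, embedding_size, name='link_embedding')(link))

# Concatenate the two 50-d vectors instead of taking a dot product -> shape (None, 100)
merged = Concatenate()([movie_emb, link_emb])
hidden = Dense(32, activation='relu')(merged)   # the extra fully connected layer
prob = Dense(1, activation='sigmoid')(hidden)   # or a linear output with mse, as in the notebook

model = Model(inputs=[movie, link], outputs=prob)
model.compile(optimizer='adam', loss='binary_crossentropy')
```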