Open tmozgach opened 6 years ago
BiRNN consists of forward and backward RNN structure (GRU cell). In the forward RNN, the input sequence is sorted in ascending order by timestamps for each user, and the model calculates a sequence of forward hidden states. The backward RNN takes the input sequence in reverse order, resulting in a sequence of backward hidden states. To compute the final prediction, we average the output from RNNsin both direction and then apply linear transformation. We recommend the N-movies that have the highest values in the output layer. Learning Rate 0.001
class RNN(nn.Module): def init(self, num_items, hidden_size, n_layers):
super(RNN, self).__init__()
self.hidden_size = hidden_size
self.n_layers = n_layers
self.num_items = num_items
self.embed = nn.Embedding(num_items, hidden_size)
self.gru = nn.GRU(hidden_size, hidden_size, n_layers, bidirectional=True)
self.fc = nn.Linear(hidden_size * 2, num_items)
self.init_weights()
def forward(self, x, hidden):
seq_len = len(x)
embedded = self.embed(x).view(seq_len, 1, -1)
# Forward propagate RNN
out, hidden = self.gru(embedded, hidden)
out = out.contiguous().view(out.size(0)*out.size(1), out.size(2))
out = self.fc(out)
return out, hidden
Random Randomly recommend 10 movies for each user
KNN With Cosine Similarity First compute the Term Frequency-Inverse Document Frequency (which is frequency of a word occurring in a movie, down-weighted by the number of movies in which it occurs) for each movie. We then compute the Cosine Similarity score between all pairs of movies based on the TF-IDF Score. Take the last watched movie of a user in training set, recommend 10 new movies based on the last watched movie
Collaborative Filtering (rating based) Compute a similarity score between all pairs of users in the dataset, based on each user's ratings. For a user U. we then top k most similar users to U, and U's rating for movie M will be a weighted combination of all the K similar users' ratings for movie M.
Takes in the training matrix where each row i contains all the ratings from user i (some may be blank due to train/test split), and output a matrix where row i contains all ratings for user i, without any blank ratings (all movie ratings predicted).
Predicted 10 unwatched movies for each user based on top 10 ratings
Predicted 10 unwatched movies for each user based on top 10 ratings
This part of the project tries to extract features of movies based on the genres and titles while using Convolutional Neural Networks for Sentence Classification to process the titles.
We train a simple CNN with one layer of convolution on top of pre-trained word vectors obtained from an unsupervised neural language model for sentence-level classification tasks.