JPBP22 / engine


SVD Modelling problem #11

Closed HenningGC closed 1 year ago

HenningGC commented 1 year ago

When modelling with Singular Value Decomposition (SVD), I always get a Precision@K of 0, for every K I try.

Here's the code:

```python
import numpy as np
from scipy.sparse.linalg import svds

# Number of latent factors
k = 100

# Perform SVD
U, sigma, Vt = svds(train_sparse, k=k)

# Convert sigma into a diagonal matrix
sigma = np.diag(sigma)

# Reconstruct the interaction matrix
predicted_interactions = np.dot(np.dot(U, sigma), Vt)

# Create mappings between game indices and game IDs (both directions)
idx_to_game_id = {idx: game_id for idx, game_id in enumerate(games_dict.keys())}
game_id_to_idx = {game_id: idx for idx, game_id in enumerate(games_dict.keys())}

# Create a mapping from user IDs to user indices
user_id_to_idx = {user_id: idx for idx, user_id in enumerate(user_dict.keys())}
```

```python
# Evaluation function (Precision@K)
def precision_at_k(predicted_interactions, test_interactions, k):
    precisions = []
    for user_id in user_dict.keys():
        # Get the user index for the current user ID
        user_idx = user_id_to_idx[user_id]

        # Get the top K predicted game indices for each user
        top_k_indices = np.argsort(predicted_interactions[user_idx])[-k:]

        # Get the game IDs for the top K predicted games using the inverse mapping
        top_k_games = [idx_to_game_id[i] for i in top_k_indices]

        # Count how many of these games are in the test set
        relevant_games = 0
        for game_id in top_k_games:
            game_idx = game_id_to_idx[game_id]
            if test_interactions[user_idx, game_idx] > 0:
                relevant_games += 1

        # Calculate the precision for the current user and store it
        precision = relevant_games / k
        precisions.append(precision)

    # Calculate the average precision across all users
    avg_precision = np.mean(precisions)
    return avg_precision


# Evaluate the model using Precision@K
for k in range(1, 21):
    precision_at_k_value = precision_at_k(predicted_interactions, test_sparse, k)
    print(f"Precision@{k}: {precision_at_k_value}")
```

HenningGC commented 1 year ago

Output:

Precision@1: 0.0
Precision@2: 0.0
Precision@3: 0.0
Precision@4: 0.0
Precision@5: 0.0
Precision@6: 0.0
Precision@7: 0.0
Precision@8: 0.0
Precision@9: 0.0
Precision@10: 0.0
Precision@11: 0.0
Precision@12: 0.0
Precision@13: 0.0
Precision@14: 0.0
Precision@15: 0.0
Precision@16: 0.0
Precision@17: 0.0
Precision@18: 0.0
Precision@19: 0.0
Precision@20: 0.0
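The thread does not say what caused the zeros. One common cause in this kind of setup, offered here as an assumption and not as a diagnosis of the code above, is that an SVD reconstruction of the training matrix scores each user's training items highest. If the test set holds only interactions that were removed from training, those top-ranked items can never be test hits. A minimal sketch of a Precision@K that masks training items before ranking is below; `precision_at_k_masked`, its `mask_train` flag, and the toy arrays are all hypothetical and do not come from the repository.

```python
import numpy as np
from scipy.sparse import csr_matrix


def precision_at_k_masked(scores, train, test, k, mask_train=True):
    """Mean Precision@K over all users (hypothetical helper, not from the thread)."""
    scores = np.array(scores, dtype=float)  # copy, so the caller's array is not modified
    if mask_train:
        # Items seen in training cannot be test hits; push them to the bottom of the ranking.
        scores[csr_matrix(train).nonzero()] = -np.inf
    # Column indices of the k highest scores in each row.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    is_test_item = csr_matrix(test).toarray() > 0
    hits = np.take_along_axis(is_test_item, top_k, axis=1).sum(axis=1)
    return float(np.mean(hits / k))


# Toy data (made up): 2 users x 4 games, train and test interactions are disjoint.
train = csr_matrix(np.array([[1, 1, 0, 0], [0, 1, 1, 0]]))
test = csr_matrix(np.array([[0, 0, 1, 0], [0, 0, 0, 1]]))
# Scores shaped like a reconstruction of train: highest on the training items.
scores = np.array([[0.9, 0.8, 0.3, 0.1], [0.1, 0.9, 0.8, 0.4]])

print(precision_at_k_masked(scores, train, test, k=1, mask_train=False))  # 0.0
print(precision_at_k_masked(scores, train, test, k=1))                    # 1.0
```

On the toy data, the unmasked ranking picks only training items and scores 0.0, while the masked ranking recovers the held-out item for both users. Whether the same effect explains the output above depends on how `train_sparse` and `test_sparse` were split, which the thread does not show.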

HenningGC commented 1 year ago

Solved after refactoring the code to the following:

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds
from sklearn.metrics import mean_squared_error

# Factorize the full user x game interaction matrix
interactions_sparse = csr_matrix(interactions.values)
U, sigma, Vt = svds(interactions_sparse, k=50)
sigma = np.diag(sigma)
svd_predictions = np.dot(np.dot(U, sigma), Vt)

# Reconstruction error against the same matrix the model was fit on
rmse = np.sqrt(mean_squared_error(interactions_sparse.toarray(), svd_predictions))
print(f'RMSE: {rmse}')


def recommend_games(user_id, interactions_matrix, predictions, games_dict, num_recommendations):
    # predictions is a NumPy array indexed by row position, so map the user label to its row first
    user_row = interactions_matrix.index.get_loc(user_id)
    user_predictions = pd.Series(predictions[user_row], index=interactions_matrix.columns)
    sorted_user_predictions = user_predictions.sort_values(ascending=False)
    recommendations = sorted_user_predictions.head(num_recommendations).index
    recommended_games = [games_dict[game_id] for game_id in recommendations]
    return recommended_games


user_id = 0
num_recommendations = 10
recommended_games = recommend_games(user_id, interactions, svd_predictions, games_dict, num_recommendations)
print(f"Recommended games for user {user_id}: {recommended_games}")
```
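The snippet above depends on `interactions` and `games_dict`, neither of which is shown in the thread. For anyone trying to reproduce the approach, here is a small self-contained sketch under stated assumptions: `interactions` is taken to be a users × games pandas DataFrame of floats, and `games_dict` a mapping from game ID to title. The data, the user labels, and the helper name `recommend_titles` are invented for illustration.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical toy data: rows are users, columns are game IDs, values are interaction strengths.
interactions = pd.DataFrame(
    [[5.0, 4.0, 0.0, 0.0, 1.0],
     [4.0, 5.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 5.0, 4.0, 0.0],
     [0.0, 1.0, 4.0, 5.0, 0.0]],
    index=["u1", "u2", "u3", "u4"],
    columns=[10, 20, 30, 40, 50],
)
# Hypothetical mapping from game ID to title.
games_dict = {10: "Game A", 20: "Game B", 30: "Game C", 40: "Game D", 50: "Game E"}

# svds requires 1 <= k < min(n_users, n_games); here min is 4, so k=2 is valid.
U, sigma, Vt = svds(csr_matrix(interactions.values), k=2)
predictions = U @ np.diag(sigma) @ Vt  # rank-2 approximation, same shape as the input


def recommend_titles(user_id, n):
    """Return the titles of the n highest-scoring games for one user."""
    row = interactions.index.get_loc(user_id)  # label -> row position in the NumPy array
    scores = pd.Series(predictions[row], index=interactions.columns)
    top_ids = scores.sort_values(ascending=False).head(n).index
    return [games_dict[game_id] for game_id in top_ids]


print(recommend_titles("u1", 2))
```

Like the snippet above, this sketch ranks every game, including ones the user has already interacted with, and it fits and scores on the same matrix, so it says nothing about held-out performance.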