Pairwise Comparison - Top k

jacobmunson / RecommenderSystems

This repository is intended to house efforts from a Fall 2019 independent study focused on Similarity Computation and Clustering Structure for Recommender Systems.


Pairwise Comparison - Top k #11

Closed jacobmunson closed 4 years ago

jacobmunson commented 5 years ago

Upon recommendation: instead of computing all pairwise comparisons and carrying them around, maybe compute the pairwise comparisons per user with the similarity measure of choice, keep only the top k per user (the highest values under the measure of interest), and carry just those around.

New: For the 100k dataset and k = 15, 610 users * 15 similarities/user = 9,150 similarities carried around.

Old: For the 100k dataset with 610 users, 164,054 similarities to be computed.
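A minimal sketch of the top-k idea above, in Python rather than the R used in this repository. The names `cosine` and `top_k_neighbors`, the toy `ratings` dictionary, and the choice of cosine similarity over co-rated items are all assumptions for illustration, not the repository's code.

```python
import heapq
import math

def cosine(u, v):
    """Cosine similarity over items both users rated; 0.0 if they share none."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def top_k_neighbors(ratings, k, sim=cosine):
    """Return {user: [(similarity, neighbor), ...]} with only the k largest kept."""
    top = {}
    for user, profile in ratings.items():
        scored = ((sim(profile, other), name)
                  for name, other in ratings.items() if name != user)
        top[user] = heapq.nlargest(k, scored)  # sorted, highest similarity first
    return top

# Hypothetical toy data: user -> {item: rating}, ratings on a 1-5 scale.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 3, "c": 5},
    "u3": {"a": 1, "b": 5},
    "u4": {"c": 2, "d": 4},
}

k = 2
top = top_k_neighbors(ratings, k)
carried = sum(len(neighbors) for neighbors in top.values())
```

Every pairwise similarity is still computed once per user here; the saving is in what gets stored and passed along afterwards (`carried` is the number of users times k).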

Definitely time this, since computing the similarity measure will itself take time.
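One way to time a similarity pass, sketched in Python under assumptions: `timed` is a hypothetical helper, and `all_pairs_overlap` is a stand-in workload, not one of the similarity measures from this study.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def all_pairs_overlap(profiles):
    """Stand-in workload: count shared items for every unordered user pair."""
    users = list(profiles)
    return {(a, b): len(profiles[a] & profiles[b])
            for i, a in enumerate(users) for b in users[i + 1:]}

# Hypothetical data: 50 users, each with a set of 10 item ids.
profiles = {u: set(range(u, u + 10)) for u in range(50)}
pairs, seconds = timed(all_pairs_overlap, profiles)
```

A single run is noisy, so it is probably worth repeating the measurement a few times per similarity measure before comparing them.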

jacobmunson commented 5 years ago

Edit: simply do a prediction on each line item in the 80/20 split
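A rough Python sketch of predicting each line of the held-out 20%. The user-mean predictor is only a placeholder for whichever neighbor-based prediction is actually used, and `split_80_20`, `user_mean_predictor`, `mean_absolute_error`, and the synthetic `rows` are hypothetical names and data.

```python
import random

def split_80_20(rows, seed=0):
    """Shuffle (user, item, rating) rows and split them 80/20."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 4 // 5
    return rows[:cut], rows[cut:]

def user_mean_predictor(train):
    """Placeholder predictor: a user's mean training rating, else the global mean."""
    totals = {}
    for user, _item, rating in train:
        s, n = totals.get(user, (0.0, 0))
        totals[user] = (s + rating, n + 1)
    global_mean = sum(r for _, _, r in train) / len(train)

    def predict(user, item):
        if user in totals:
            s, n = totals[user]
            return s / n
        return global_mean

    return predict

def mean_absolute_error(test, predict):
    """Predict every held-out line and average the absolute errors."""
    return sum(abs(predict(u, i) - r) for u, i, r in test) / len(test)

# Synthetic ratings in 1..5 for 10 users x 10 items (illustration only).
rows = [(f"u{u}", f"i{i}", 1 + (u * i) % 5) for u in range(10) for i in range(10)]
train, test = split_80_20(rows)
predict = user_mean_predictor(train)
mae = mean_absolute_error(test, predict)
```

Swapping `predict` for a top-k neighbor predictor would leave the split and the evaluation loop unchanged.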

jacobmunson commented 5 years ago

Prediction on each line of an 80/20 split is up and running. It's still brutally slow in R on the 1M dataset, and takes something like 1.4 hours (with 3 similarity measures computed "at once") on the 100k dataset.

jacobmunson commented 5 years ago

Runtimes for the 100k dataset are now about 57 minutes (a considerable speed increase). Running on several variants of the similarity measures.