da-analysis / asac_4_dataanalysis

ASAC 4th Cohort Data Analysis Project

Experiment with undersampled data (LG) #19

Open syleeie2310 opened 3 months ago

syleeie2310 commented 3 months ago

Experiment with undersampled data (LG)

We need to decide which dataset works best, using AUC as the criterion.
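As a rough sketch of that comparison, the snippet below computes AUC in plain Python and picks the best-scoring candidate. The dataset names, labels, and scores are hypothetical placeholders, not project results; it assumes each candidate model is scored on the same held-out labels so the AUC values are comparable.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels, shared by every candidate.
labels = [1, 0, 1, 0, 0, 1]

# Hypothetical scores from models trained on two undersampling ratios.
candidates = {
    "ratio_1_to_1": [0.9, 0.2, 0.7, 0.4, 0.1, 0.8],
    "ratio_1_to_3": [0.6, 0.5, 0.4, 0.7, 0.1, 0.8],
}

auc_by_dataset = {name: roc_auc(labels, s) for name, s in candidates.items()}
best = max(auc_by_dataset, key=auc_by_dataset.get)
print(auc_by_dataset, best)
```

The pairwise loop is quadratic, so for real data a library routine would be the practical choice; the point here is only the selection logic.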

syleeie2310 commented 3 months ago

from recommenders.evaluation.spark_evaluation import SparkRankingEvaluation, SparkRatingEvaluation

evaluations = SparkRankingEvaluation(
    dfs_test,                       # test data
    dfs_pred_final,                 # actual prediction data
    col_user=COL_USER,              # asin1
    col_item=COL_ITEM,              # asin2
    col_rating=COL_RATING,          # co-review counts
    col_prediction=COL_PREDICTION,  # prob
    k=10,                           # number of items (k)
)

print(
    "Precision@k = {}".format(evaluations.precision_at_k()),
    "Recall@k = {}".format(evaluations.recall_at_k()),
    "NDCG@k = {}".format(evaluations.ndcg_at_k()),
    "Mean average precision = {}".format(evaluations.map_at_k()),
    sep="\n",
)
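For intuition about the first two numbers printed above, here is a simplified single-user sketch of precision@k and recall@k in plain Python. It is not the library's implementation: the library aggregates over all users and its edge-case handling may differ. The item IDs are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user.

    recommended: items ranked best-first; relevant: ground-truth items.
    """
    top = recommended[:k]
    hits = len(set(top) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked asin2 recommendations for a single asin1.
recommended = ["B01", "B02", "B03", "B04"]
# Hypothetical items that actually appear for that asin1 in the test data.
relevant = {"B02", "B04", "B09"}

p, r = precision_recall_at_k(recommended, relevant, k=2)
print(p, r)
```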

syleeie2310 commented 3 months ago

https://github.com/recommenders-team/recommenders/blob/main/recommenders/evaluation/spark_evaluation.py

syleeie2310 commented 3 months ago

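With relevancy_method = top_k, 10 items are recommended per user (k=10).

With relevancy_method = by_threshold and threshold = 3, the test data is filtered out automatically.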
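The difference between the two relevancy methods can be illustrated with a small plain-Python sketch. This is a simplified illustration of the selection rules, not the library's code; which DataFrame each rule is applied to should be checked against the linked spark_evaluation.py. The rows and helper names are hypothetical.

```python
from collections import defaultdict


def _rank(pairs):
    """Return item IDs sorted by score, highest first."""
    return [item for item, _ in sorted(pairs, key=lambda p: p[1], reverse=True)]


def top_k_items(rows, k):
    """Keep the k highest-scored items for each user."""
    by_user = defaultdict(list)
    for user, item, score in rows:
        by_user[user].append((item, score))
    return {user: _rank(pairs)[:k] for user, pairs in by_user.items()}


def items_by_threshold(rows, threshold):
    """Keep every item whose score is at least the threshold."""
    by_user = defaultdict(list)
    for user, item, score in rows:
        if score >= threshold:
            by_user[user].append((item, score))
    return {user: _rank(pairs) for user, pairs in by_user.items()}


# Hypothetical (user, item, score) rows.
rows = [("A", "x", 5), ("A", "y", 2), ("A", "z", 4), ("B", "x", 1), ("B", "w", 3)]

top2 = top_k_items(rows, k=2)
kept = items_by_threshold(rows, threshold=3)
print(top2, kept)
```

With top-k every user keeps a fixed number of items, while the threshold rule lets the list length vary per user, and a user with no score at or above the threshold disappears from the result entirely.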

syleeie2310 commented 2 months ago