"F”
Must be enclosed in double quotation marks.
A blank may be given as " " or ""; both are accepted.
For README, please include the following in this milestone:
A short description of how your recommendation algorithm works. In case you are using an existing algorithm, describe what algorithm you used and how it works.
A description of how to run your program. An example of java command line will be good.
A description of supported inputs and expected outputs when inputs are not supported.
Roles of each member (i.e. who did what?)
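Cite the source of the external code used for the JUnit System.exit handling.
Error code numbering is now dropped.
I'm editing on my local branch right now, so in case you're curious I'll paste my progress here from time to time.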
ㅇㅅㅇ 💻
CSE364 Group11
Table of Contents
About the Project
This project is the capstone project for SW Engineering (CSE364) at UNIST, developed by Yeongjun Kwak, Sanghun Lee and Yujin Lee. It will be further implemented as a movie recommendation system. It is currently under development···
Built with
Repository Structure
Getting Started
Prerequisites
To `git clone`, the user should be registered as a Contributor or Collaborator of this git project (unless you have the OAuth access token).
Installation
`dockerfile` and `run.sh` in the same directory.
`. run.sh`
Milestone 1
Explanation of the Algorithm
In Milestone 1, the code calculates and returns the average rating, taken from the ratings data, for a specified occupation and genre.
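As a rough illustration of this calculation, here is a minimal Java sketch. It is not the project's actual code: the class `AverageRatingSketch`, the `RatedMovie` row type and the `average` method are hypothetical names, and the sketch assumes that ratings have already been joined with each rater's occupation and each movie's genres. It also assumes case-insensitive occupation matching and uses the AND semantics this README describes for Milestone 1 genre combinations.

```java
import java.util.List;
import java.util.OptionalDouble;
import java.util.Set;

final class AverageRatingSketch {
    /** Hypothetical row: one rating joined with the rater's occupation and the movie's genres. */
    static final class RatedMovie {
        final String occupation;
        final Set<String> genres;
        final int stars;

        RatedMovie(String occupation, Set<String> genres, int stars) {
            this.occupation = occupation;
            this.genres = genres;
            this.stars = stars;
        }
    }

    /**
     * Averages the ratings whose occupation matches (case-insensitively, an assumption)
     * and whose movie carries every requested genre (AND semantics).
     * Returns an empty OptionalDouble when no rating matches.
     */
    static OptionalDouble average(List<RatedMovie> rows, String occupation, Set<String> wantedGenres) {
        return rows.stream()
                .filter(r -> r.occupation.equalsIgnoreCase(occupation))
                .filter(r -> r.genres.containsAll(wantedGenres))
                .mapToInt(r -> r.stars)
                .average();
    }

    public static void main(String[] args) {
        List<RatedMovie> rows = List.of(
                new RatedMovie("retired", Set.of("Action", "Adventure"), 4),
                new RatedMovie("retired", Set.of("Action"), 2),
                new RatedMovie("academic", Set.of("Action", "Adventure"), 5),
                new RatedMovie("retired", Set.of("Action", "Adventure", "Sci-Fi"), 5));
        System.out.println(average(rows, "retired", Set.of("Action", "Adventure")));
    }
}
```

An empty result is a natural place to report the "no data" situation instead of printing an average.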
Running the Test
Continued from Installation steps.
Examples
When valid inputs are passed, the output message looks like this:
Supported Inputs
Rules for the Inputs
Combination of Multiple Genres as an Input
For an input such as "Action|Adventure", the movies that fall into both the Action and Adventure categories are used for the calculation.
Error Codes
Possible errors thrown by invalid user input.
Table 1 Invalid input errors
InputEmptyError
InputNumError
InputNumError
InputInvalidError
InputInvalidError
Table 2 Invalid input warning
InputInvalidWarning
Table 3 No data exist error
NoDBError
`1`.
Examples for the Error Codes
Error code: 1~3
Error code: 4
Error code: 5
Error code: 6
Error code: 7
"War|Crime" Academic: example where no movie data matches the entered genre (combination).
"Action|Animation|Children's|Sci-Fi|Thriller|War" retired: example where no rating data is available for the genre-occupation input pair.
About JUnit Test
The JUnit tests (and the related CSV test resources) for Milestone 1 have been moved to `scripts/`. For more information, please refer to issue #20.
Contribution by Area
Yujin Lee
👑 Yeongjun Kwak
Exception Handling
👑 Sanghun Lee, Yeongjun Kwak
Unit Test Building
Yujin Lee
Final Reviewer
Sanghun Lee, Yeongjun Kwak
👑 Yujin Lee, Sanghun Lee
Milestone 2
Explanation of the Algorithm
In Milestone 2, the code returns a recommendation of the Top 10 movies for a specified gender, age, occupation and, optionally, genre(s). First, to recommend 'relevant' movies, the code uses 1) a Bayesian Estimate, which IMDb also uses to calculate its Top 250 movies, when calculating and comparing the ratings of movies. Second, to include 'similar' users when there are not enough ratings that match the gender, age and occupation, we have set 2) a Priority rule for including similar users.
1) Bayesian Estimate
A Bayes estimator is an estimator that minimizes the posterior expected value of a loss function. By making use of the Bayesian Estimate, the algorithm calculates a Weighted Rating (`W`) and ranks movies by `W`. In this way, movies with very few ratings or below-average ratings carry comparatively little weight.
In detail, the calculation of the Weighted Rating (`W`) is implemented on `classified_table`, which is made by selecting the objects with more than `m` votes from `movie_rating_table`.
The original reference for the Bayesian Estimate can be found here. However, in this project, the estimation method and variables have been set differently to accommodate the different requirements.
W = (vR+mC)/(v+m)
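To make the formula concrete, here is a minimal Java sketch. It assumes the usual reading of the IMDb-style formula, in which `v` is the number of ratings a movie received and `R` is that movie's average rating. The names `WeightedRatingSketch`, `weightedRating` and `minimumVotes` are hypothetical and are not the project's actual code. In particular, `minimumVotes` is only a guess at how a percentile-based `m` could be computed, assuming movies are ranked by their number of ratings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class WeightedRatingSketch {
    /** W = (v*R + m*C) / (v + m): pulls movies with few ratings toward the global average C. */
    static double weightedRating(double v, double r, double m, double c) {
        return (v * r + m * c) / (v + m);
    }

    /**
     * Hypothetical stand-in for a percentile-based m: the vote count of the movie
     * sitting at the top (1 - p) * 100 % when movies are sorted by number of ratings.
     * Example: 1000 movies and p = 0.8 select the 200th-highest vote count.
     */
    static int minimumVotes(List<Integer> voteCounts, double p) {
        if (voteCounts.isEmpty()) {
            throw new IllegalArgumentException("voteCounts must not be empty");
        }
        List<Integer> sorted = new ArrayList<>(voteCounts);
        sorted.sort(Comparator.reverseOrder());
        // Round to avoid floating-point drift, then clamp the rank to [1, size].
        int rank = (int) Math.round((1.0 - p) * sorted.size());
        rank = Math.max(1, Math.min(rank, sorted.size()));
        return sorted.get(rank - 1);
    }

    public static void main(String[] args) {
        double m = 50;  // assumed minimum number of ratings
        double c = 3.0; // assumed average rating across all movies
        System.out.println("few ratings:  " + weightedRating(2, 5.0, m, c));
        System.out.println("many ratings: " + weightedRating(500, 4.2, m, c));
    }
}
```

With `m = 50` and `C = 3.0`, a movie with only 2 ratings averaging 5.0 ends up with a lower `W` than a movie with 500 ratings averaging 4.2, which is the damping effect the weighted rating is meant to provide.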
`v` and `R`
`v` and `R` are obtained by making use of `Movie_data_node`.
`m`: Minimum Number of Ratings
`m` is obtained by the `Percentile` function. The function returns the number of votes of the movie that corresponds to (1-`p`)*100 %. `p` is set differently depending on the number of movies by the `set_p` function, so that the validity of the weighted rating can be enhanced.
e.g. If there are 1000 movies with average ratings, the function will return the number of ratings of the 200th-highest movie.
`C`: Average rating across all the movies
`C` is obtained by the `total_average_rating` function.
2) Priority rule for including similar users
The algorithm first builds an ArrayList (`valid_user_list`) of the users in `users.dat` that match the inputs. This list is then used to extract the rating information and movie data from `ratings.dat` and `movies.dat`.
However, when there are not enough movie candidates to be ranked for the specified user data (here, the threshold is set to `100 movies`), the 'similar' users are also added to `valid_user_list` in order of precedence (priority) by the function `make_intersection_list_macro`, until the number of movie candidates exceeds `100`.
The similar users, in order of priority, are the users with:
The priority has been set as above to give more weight to Occupation, and less to Gender and Age Range.
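The widening step can be sketched roughly as follows in Java. This is an illustration under stated assumptions, not the project's `make_intersection_list_macro`: the class `SimilarUsersSketch`, its `User` and `Rating` types and the `validUsers` method are hypothetical, and the order of the relaxation levels is an assumed one that merely keeps Occupation fixed the longest, in line with the weighting described above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class SimilarUsersSketch {
    static final class User {
        final int id; final String gender; final int age; final int occupation;
        User(int id, String gender, int age, int occupation) {
            this.id = id; this.gender = gender; this.age = age; this.occupation = occupation;
        }
    }

    static final class Rating {
        final int userId; final int movieId;
        Rating(int userId, int movieId) { this.userId = userId; this.movieId = movieId; }
    }

    /** Number of distinct movies rated by the given users. */
    static long candidateMovies(Set<Integer> userIds, List<Rating> ratings) {
        return ratings.stream()
                .filter(r -> userIds.contains(r.userId))
                .map(r -> r.movieId)
                .distinct()
                .count();
    }

    /** Adds users level by level until more than minMovies candidate movies are available. */
    static Set<Integer> validUsers(List<User> users, List<Rating> ratings,
                                   String gender, int age, int occupation, int minMovies) {
        // Assumed priority order (hypothetical): exact match first, occupation-only last.
        List<Predicate<User>> levels = List.of(
                u -> u.occupation == occupation && u.gender.equals(gender) && u.age == age,
                u -> u.occupation == occupation && u.gender.equals(gender),
                u -> u.occupation == occupation && u.age == age,
                u -> u.occupation == occupation);
        Set<Integer> valid = new LinkedHashSet<>();
        for (Predicate<User> level : levels) {
            users.stream().filter(level).forEach(u -> valid.add(u.id));
            if (candidateMovies(valid, ratings) > minMovies) {
                break; // enough candidates, stop widening
            }
        }
        return valid;
    }
}
```

Calling `validUsers(users, ratings, "F", 25, 1, 100)` would mirror the threshold of 100 movies mentioned above; each level is a superset of the exact match, so users already in the list are simply kept.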
Running the Test
Continued from Installation steps.
Examples
When valid inputs are passed, the output message looks like this:
Testing with 3 inputs
Testing with 4 inputs
Supported Inputs
Common Rules for the inputs
`""`).
Gender
"F", "f", "M", "m".
`""`
Age
""
Occupation
""
Genre (When testing with 4 inputs)
However, the pipe (`|`) here is used to link OR conditions, not AND. `Adventure|Animation` includes all the movies that are categorized as Adventure OR Animation as candidates for the Top 10 movies.
Testing with 3 inputs
Testing with 4 inputs
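As a small illustration of the Genre rule for the fourth input, where `|` links OR conditions, the following Java sketch checks whether a movie qualifies as a candidate. The class `GenreOrSketch` and its methods are hypothetical names, and the sketch assumes that each movie's genres are themselves stored as a `|`-separated string and compared case-sensitively.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class GenreOrSketch {
    /** Splits a "|"-separated genre string such as "Adventure|Animation" into a set. */
    static Set<String> parseGenres(String input) {
        return new HashSet<>(Arrays.asList(input.split("\\|")));
    }

    /** True when the movie shares at least one genre with the request (OR semantics). */
    static boolean matchesAny(String movieGenres, String requestedGenres) {
        return !Collections.disjoint(parseGenres(movieGenres), parseGenres(requestedGenres));
    }

    public static void main(String[] args) {
        System.out.println(matchesAny("Animation|Children's", "Adventure|Animation"));
        System.out.println(matchesAny("Drama", "Adventure|Animation"));
    }
}
```

Swapping `Collections.disjoint` for a `containsAll` check would give AND semantics instead, so the difference between the two behaviours is confined to a single line.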
Error Codes
Possible errors thrown by invalid input.
Table 1 Invalid input errors
InputNumError
InputInvalidError
InputInvalidError
InputInvalidError
InputInvalidError
InputInvalidError
InputEmptyError
`""` is passed for the genre input.
Table 2 No data exist error
NoDBError
NoDBError
`1`.
Examples for the Error Codes
TODO: revise this part!
Error code: 1~3
TODO: add JUnit tests!
Contribution by Area
👑 Yeongjun Kwak
Exception Handling
👑 Sanghun Lee, Yeongjun Kwak
Unit Test Building
Yujin Lee
Yujin Lee
👑 Yujin Lee
Milestone 3 (Upcoming)
Milestone 4 (Upcoming)
Team Members
License & Acknowledgements