Overshooting of Predicted Rating by Model at Lower Codeforces Rating Due to 3 Degree Polynomial

re4lvanshsingh / Codeforces_Codechef_Converter

Converts ratings between two popular competitive programming platforms: Codeforces and Codechef


Overshooting of Predicted Rating by Model at Lower Codeforces Rating Due to 3 Degree Polynomial #1

Closed HavokSahil closed 7 months ago

HavokSahil commented 7 months ago

Description

I have observed that the model's predicted Codechef ratings tend to overshoot, especially at lower Codeforces ratings, because the model is a degree-3 polynomial. This reduces the accuracy of the predictions.

Expected Behavior

The model should give consistent predictions for both low and high rating inputs, staying within a threshold value.

HavokSahil commented 7 months ago

I am interested in addressing this issue. After carefully analyzing the problem, I believe that using a higher-order polynomial together with regularization would be an effective way to handle the overfitting.
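
To make the proposal concrete, here is a minimal sketch of a higher-order polynomial fit with an L2 (ridge) penalty, written with NumPy only. It is not the repository's code: the function name `fit_ridge_polynomial`, the defaults `degree=5` and `alpha=1.0`, and the synthetic ratings are all assumptions made for illustration.

```python
import numpy as np


def fit_ridge_polynomial(x, y, degree=5, alpha=1.0):
    """Fit a polynomial of the given degree with an L2 (ridge) penalty.

    Inputs are standardized first so that high powers stay well conditioned.
    Returns a function that predicts y for new x values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    center = x.mean()
    scale = x.std() or 1.0  # avoid dividing by zero for constant input

    def features(values):
        z = (np.atleast_1d(np.asarray(values, dtype=float)) - center) / scale
        # Columns are z**0, z**1, ..., z**degree.
        return np.vander(z, degree + 1, increasing=True)

    X = features(x)
    penalty = alpha * np.eye(degree + 1)
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    # Closed-form ridge solution: (X^T X + penalty) w = X^T y
    weights = np.linalg.solve(X.T @ X + penalty, X.T @ y)

    def predict(values):
        return features(values) @ weights

    return predict


# Synthetic, made-up ratings purely for illustration; not real user data.
rng = np.random.default_rng(0)
codeforces = np.linspace(800, 2400, 40)
codechef = 0.9 * codeforces + 600 + rng.normal(0, 30, size=codeforces.size)

predict = fit_ridge_polynomial(codeforces, codechef, degree=5, alpha=1.0)
print(predict([900, 1500, 2100]))
```

A larger `alpha` shrinks the higher-order coefficients more strongly, which is what limits wild swings near the edges of the data; a suitable value would have to be chosen on held-out data.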

Additionally, I propose augmenting the dataset with more data collected through web scraping, which would make the model more robust and improve its performance.

@re4lvanshsingh I am enthusiastic about contributing to the project and would appreciate it if you could assign me this particular issue under Codepeak'23.

re4lvanshsingh commented 7 months ago

Sure thing. You've got some good insights and I was kind of planning to change the underlying ML Model as well.

I have assigned this issue to you. Here's what I want:

1) Using web-scraping tools like BeautifulSoup in Python:

Extract the username, past 5 to 10 ratings, past 5 to 10 contest ranks, number of contests participated in and the number of accepted solutions (solved problems basically).

Prepare a comma-separated values (CSV) or Excel file of the same.

2) Train various ML models (Polynomial Regression, Neural Networks, etc.) on the dataset, split 70:30 into training and testing sets, and employ the highest-performing model.

3) Using matplotlib, plot the points for visualisation.
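
The 70:30 split and model comparison in step 2 could be sketched roughly as follows. This is only an illustration under assumptions: it compares plain least-squares polynomial fits of a few degrees (neural networks are left out), the helper names are hypothetical, and the data is synthetic rather than scraped.

```python
import numpy as np


def train_test_split_70_30(x, y, seed=0):
    """Shuffle the samples and split them 70:30 into train and test sets."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(x))
    cut = int(round(0.7 * len(x)))
    train, test = order[:cut], order[cut:]
    return x[train], y[train], x[test], y[test]


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally sized sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def compare_polynomial_degrees(x, y, degrees=(1, 2, 3, 5), seed=0):
    """Return {degree: test RMSE} for least-squares polynomial fits."""
    x_train, y_train, x_test, y_test = train_test_split_70_30(x, y, seed)
    # Rescale with training statistics only, so the test set stays unseen.
    center = x_train.mean()
    scale = x_train.std() or 1.0
    scores = {}
    for degree in degrees:
        coeffs = np.polyfit((x_train - center) / scale, y_train, degree)
        predictions = np.polyval(coeffs, (x_test - center) / scale)
        scores[degree] = rmse(y_test, predictions)
    return scores


# Synthetic, made-up ratings purely for illustration; not real user data.
rng = np.random.default_rng(1)
cf = rng.uniform(800, 2400, size=100)
cc = 0.9 * cf + 600 + rng.normal(0, 30, size=100)

scores = compare_polynomial_degrees(cf, cc)
best_degree = min(scores, key=scores.get)
print(scores, "best degree:", best_degree)
```

Picking the model by test RMSE is one reasonable reading of "highest performing"; another metric or a separate validation split could be used instead.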

Right now, I have assigned the web-scraping task to you as a medium task.
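
For the scraping task, the CSV layout could look something like the sketch below. It covers only the file-writing half and assumes the scraper has already produced one record per user; the column names and the semicolon-joined history cells are hypothetical choices, and the sample user is made up.

```python
import csv
import io

# Hypothetical column names that mirror the fields listed in step 1.
FIELDNAMES = [
    "username",
    "ratings",
    "ranks",
    "contests_participated",
    "problems_solved",
]


def write_user_rows(rows, out_file, max_history=10):
    """Write one CSV row per user to an open text file object.

    Rating and rank histories are trimmed to the most recent `max_history`
    entries and stored as semicolon-separated strings in a single cell.
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "username": row["username"],
            "ratings": ";".join(str(r) for r in row["ratings"][-max_history:]),
            "ranks": ";".join(str(r) for r in row["ranks"][-max_history:]),
            "contests_participated": row["contests_participated"],
            "problems_solved": row["problems_solved"],
        })


# Made-up record purely for illustration; not a real user.
sample = [{
    "username": "example_user",
    "ratings": [1400, 1432, 1398],
    "ranks": [2100, 1800, 2500],
    "contests_participated": 3,
    "problems_solved": 57,
}]

buffer = io.StringIO()
write_user_rows(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file instead of a `StringIO` buffer, open it with `newline=""` as the `csv` module documentation recommends.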

re4lvanshsingh commented 7 months ago

@HavokSahil please comment on the web-scraping issue to get it assigned. I will close this thread afterwards.

HavokSahil commented 7 months ago

@re4lvanshsingh I have sent the pull request. I have added a new folder for Web-Scrapper. #6