Inaccurate ratings of Professors with legacy floating point ratings.

Nobelz / RateMyProfessorAPI

Python web scraper to get professor ratings from ratemyprofessor.com website.

Apache License 2.0

39 stars 11 forks

The professor.get_ratings() method returns inaccurate, rounded-down ratings for professors who have floating-point review values.

Details: These floating-point reviews come back with wrong values when queried through the API. In the picture below, the actual review for ECT584 is a 2.5, but the API reports it as 1. This is a breaking issue when trying to obtain the rating distribution.

What the code produces:

[('ALLCLASES', [3]), ('CS521', [3]), ('CSC200', [5]), ('CSC210', [4]), ('CSC478', [3, 2, 2]), ('CSC480', [2, 4]), ('CSC575', [1, 5]), ('DS575', [4]), ('DSC478', [1, 4]), ('ECT584', [5, 1, 5, 5, 5, 5]), ('HON207', [3]), ('IT130', [3]), ('LSP110', [3, 4])]
Bamshad Mobasher {'5 stars': 7, '4 stars': 5, '3 stars': 6, '2 stars': 3, '1 star': 3}

I even went primitive with the code to make sure I wasn't being dumb: my dictionary implementation produced incorrect ratings, so I swapped to regular if statements, which still produced the wrong distribution. Then I looked at the API code, only to find that the rating is of type int and that it rounds things differently than the website does.

import ratemyprofessor as rmp  # assumed import; the snippet refers to the library as rmp

prof = rmp.Professor(582550)

# Per-course view: (course name, list of integer ratings returned by the API)
course_ratings = [(course.name, [rating.rating for rating in prof.get_ratings(course.name)]) for course in prof.courses]
print(course_ratings)

# Counters must start at zero before tallying
one_count = two_count = three_count = four_count = five_count = 0

course_ratings = [[rating.rating for rating in prof.get_ratings(course.name)] for course in prof.courses]
for ratings in course_ratings:
    for rating in ratings:
        if rating == 1:
            one_count += 1
        elif rating == 2:
            two_count += 1
        elif rating == 3:
            three_count += 1
        elif rating == 4:
            four_count += 1
        elif rating == 5:
            five_count += 1

print(prof.name, {"5 stars": five_count, "4 stars": four_count, "3 stars": three_count, "2 stars": two_count, "1 star": one_count})
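The if/elif tally above only counts values that are exactly 1 through 5, so it cannot say anything useful about a 2.5 once the library has already turned it into an int. As a rough sketch of what a float-aware tally could look like, assuming the ratings were available as floats and that half-star values round up (both are assumptions, and I don't know how the website actually buckets them), something like the following would work. The names `star_bucket` and `tally` are made up for illustration and are not part of the library.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_UP


def star_bucket(value):
    """Map a numeric rating to a whole-star bucket from 1 to 5.

    Assumption: halves round up (2.5 -> 3). The site's real rule is unknown.
    """
    rounded = int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    # Clamp so out-of-range values still land in a valid bucket
    return min(5, max(1, rounded))


def tally(ratings):
    """Count ratings per star bucket, in the same dict shape as the snippet above."""
    counts = Counter(star_bucket(r) for r in ratings)
    return {
        "5 stars": counts[5],
        "4 stars": counts[4],
        "3 stars": counts[3],
        "2 stars": counts[2],
        "1 star": counts[1],
    }


# Plain int() truncates instead of rounding, which is one way a float gets lost
print(int(2.5), star_bucket(2.5))
print(tally([5, 1, 2.5, 4.5, 3]))
```

Going through Decimal avoids Python's built-in round(), which rounds halves to the nearest even number and would turn 2.5 into 2.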

import base64
import json

import requests


def _get_distro(self, professor_id: int):
    url = "https://www.ratemyprofessors.com/graphql"
    headers = {
        "Content-Type": "application/json",
        "Referer": f"https://www.ratemyprofessors.com/ShowRatings.jsp?tid={professor_id}",
        "Authorization": "lol",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36",
    }
    query = """
    query GetTeacherDetails($id: ID!) {
        node(id: $id) {
            ... on Teacher {
                firstName
                lastName
                numRatings
                ratingsDistribution {
                    r1
                    r2
                    r3
                    r4
                    r5
                    total
                }
            }
        }
    }
    """
    # The GraphQL node ID is the base64 encoding of "Teacher-<professor_id>"
    encoded_id = base64.b64encode(f"Teacher-{professor_id}".encode()).decode()
    variables = {'id': encoded_id}
    response = requests.post(url, json={'query': query, 'variables': variables}, headers=headers)
    if response.status_code != 200:
        print(f"Failed to fetch data: {response.status_code}, {response.text}")
        return
    try:
        data = response.json()
        print(json.dumps(data, indent=4))
    except json.JSONDecodeError:
        print("Failed to decode JSON from response:", response.text)
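The function above only prints the raw JSON. A possible next step, sketched here under the assumption that the response mirrors the shape of the query (data, then node, then ratingsDistribution with r1 through r5), is to convert it into the same dict format as the manual tally. `distribution_from_response` is a hypothetical helper, and the sample payload below uses made-up placeholder numbers, not real data for any professor.

```python
def distribution_from_response(data):
    """Turn a GetTeacherDetails response dict into a star-count dict.

    Returns None when the expected keys are missing (for example, on an
    error response or an unknown professor ID).
    """
    node = (data.get("data") or {}).get("node") or {}
    dist = node.get("ratingsDistribution")
    if dist is None:
        return None
    return {
        "5 stars": dist["r5"],
        "4 stars": dist["r4"],
        "3 stars": dist["r3"],
        "2 stars": dist["r2"],
        "1 star": dist["r1"],
    }


# Placeholder payload shaped like the query; the values are invented
sample = {
    "data": {
        "node": {
            "firstName": "Example",
            "lastName": "Professor",
            "numRatings": 6,
            "ratingsDistribution": {"r1": 1, "r2": 0, "r3": 1, "r4": 1, "r5": 3, "total": 6},
        }
    }
}

print(distribution_from_response(sample))
print(distribution_from_response({"data": {"node": None}}))
```

Reading the distribution straight from the site's own counts sidesteps the int conversion in get_ratings() entirely, at the cost of depending on a GraphQL field the library does not expose.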


Inaccurate ratings of Professors with legacy floating point ratings. #19