openedx / platform-roadmap

Tracking the maintenance, enhancement, and advancement of the Open edX project.
11 stars 1 forks source link

[Product Pull Request] fix: increase grading rounding precision to avoid incorrect grades #238

Open jmakowski1123 opened 1 year ago

jmakowski1123 commented 1 year ago

For Contributing Author:

This is the Primary Product Ticket for the following community contribution: increase grading rounding precision to avoid incorrect grades

Checklist prior to undergoing Product Review:

The following information is required in order for Product Managers to be able to review your pull request:

Only if necessary:

Related PRs

For Product Manager doing the review:

What criteria should be analyzed from Product to approve a PR?

github-actions[bot] commented 1 year ago

Thanks for your submission, @openedx/open-edx-project-managers will review shortly.

jmakowski1123 commented 1 year ago

Information from the original PR

Enabling the rounding in #16837 has been causing noticeable (up to 1 percentage point) differences between non-rounded subsection grades and a total grade for a course. This increases the grade precision to reduce the negative implications of double rounding.

Jira

OSPR-5819

Sandbox

https://pr27788.sandbox.opencraft.hosting/ Direct link to the progress page (you can log in as staff).

Testing instructions

  1. Import this course in Studio.
  2. Complete two units from there.
  3. Go to the Progress page and see that the score without this PR is 35%. After this change, it should be 34%.

Explanation

This course contains two subsections graded as Homework (weight: 75%):

  1. In the first subsection, users can get 2 out of 3 points (66.67%).
  2. In the second one, users can get 1 out of 4 points (25%).

In the current approach, the grades of subsections are rounded. Therefore, in this case, we're getting (67% + 25%) / 2 / 4 * 3 (/ 4 * 3 is because of the weight (75%)), which gives us 34.5. This is rounded up to 35%.

This PR changes these calculations to (66.67% + 25%) / 2 / 4 * 3, which returns 34.37625, which is then rounded down to 34%.

Deadline

None.

Reviewers

Other information

Private-ref: BB-4210

jmakowski1123 commented 1 year ago

Additional information from the author:

Hi @ProductRyan, thank you for checking. That's correct - the subsection grades calculated by the platform are inaccurate because of the double rounding. Therefore, we want to increase the rounding precision to produce correct results.

jmakowski1123 commented 1 year ago

Pending edX platform wide review as of 2/23/23 @ProductRyan When that platform review is complete, please resume that thread in this ticket. If there are platform-wide specs that need to be taken into consideration, we can widen the scope of the initiative and consider for additional funding.

Agrendalath commented 1 year ago

@jmakowski1123, just checking on the status of this. Is there anything we can do to help moving this forward?

mphilbrick211 commented 1 year ago

@jmakowski1123, just checking on the status of this. Is there anything we can do to help moving this forward?

CC: @ProductRyan @Daniel-hershel

ProductRyan commented 1 year ago

@Agrendalath sorry for the long wait here, but we want to make sure we nail this cross-platform once and for all. The Aurora team is starting to open up some capacity and @Daniel-hershel is working to build out a roadmap now that he's been on board for a few months.

I've given Daniel all the details and he's working to incorporate it into his plans.

ghassanmas commented 1 year ago

MY 2Cent,

This change can probably lead to performance issue, in a nutshell to be comre accurate you need more compute time, thus more cost more relevant to high scale

Below is a script that it shows it would add around 30% time when grades where 100K in a row.

Other un related issue is what would happen if instructor run compute grade by mistake of a certfication that has been already issued... this probably might not be related depening on the platform certifcation policy.

Lastly the pattern it seems to get something in which is controversial, is just to wrap it around a feature flag, which might not be alway the prefect solution but it be comers more releavnt path if the end goal in the frist place is just to get it to be in as quickly as possible

from datetime import datetime
from numpy import around

def do_100000_round(decimals):
    for i in range(100000):
        around(34.37625 / 100 ,decimals)

# without decimal
t_befre_2 = datetime.now()
do_100000_round(decimals=2)
t_after_2 = datetime.now()

t_befre_4 = datetime.now()
do_100000_round(decimals=4)
t_after_4 = datetime.now()
result_2 = t_after_2 - t_befre_2
result_4 = t_after_4 - t_befre_4
print(f"result for 2 round {result_2 }\n")
print(f"result for 4 round {result_4 }\n")
print(f"Increase of run time { (1 -  around(  result_4 / result_2, decimals=4) ) * 100} % ")
Agrendalath commented 1 year ago

@ghassanmas, I ran this locally, and the result was 10.51%. I also tried this with 100000 iterations, and the result was 0.99%. For 10000000 iterations, the result was 0.09%. The percentage you provided means a change of 0.03477s. For a bigger scale (10M rows), the difference is even smaller (0.0249s). Are you sure this is relevant to high scale?

Other un related issue is what would happen if instructor run compute grade by mistake of a certfication that has been already issued... this probably might not be related depening on the platform certifcation policy.

Yes, I already mentioned this in the PR:

This change will not affect existing persistent grades as long as they are not re-generated. Otherwise, this can cause up to a 1 percentage point difference in the results (on a rare occasion, like the one described in the PR).

The current grades generated by Open edX are inaccurate, though. It could be better to call this a fix instead of a feat.

if the end goal in the frist place is just to get it to be in as quickly as possible

Well, this PR (https://github.com/openedx/edx-platform/pull/27788) has been in product review since May 31, 2021...

ghassanmas commented 1 year ago

@ghassanmas, I ran this locally, and the result was 10.51%. I also tried this with 100000 iterations, and the result was 0.99%. For 10000000 iterations, the result was 0.09%. The percentage you provided means a change of 0.03477s. For a bigger scale (10M rows), the difference is even smaller (0.0249s). Are you sure this is relevant to high scale?

I need to give a another look, it also definilty would vary from machine to machine, intel vs ARM...etc, it was just one guess.

Yes, I already mentioned this in the PR:

Yes It was suffienct, I didn't meant to remention it, but rather there might be probably cases where an insturctor need to consider what those in case: where a student had a passing grade and then after this changes they get fail or vis-versa, I speculate there might be buecray dpeending on the program, org author..etc of the a course. Again I am just speculating

mphilbrick211 commented 1 year ago

Hi @Daniel-hershel! Just checking to see if there's any update on this? Thanks!

Daniel-hershel commented 1 year ago

@mphilbrick211 hi! So we definitely recognize the need to address grade rounding in some way, so appreciate this work. For this (or any rounding solution) to work it would need to be implemented holistically system wide, which would include places like:

So for this solution to be accepted from a Product perspective it would have to represent a holistic implementation. The issues with rounding is also on my radar as a potential place the Aurora team will make an investment, and I can keep you posted on those developments as well.

mphilbrick211 commented 1 year ago

Thanks, @Daniel-hershel!

mphilbrick211 commented 1 year ago

Hi @Daniel-hershel - following-up to see if there's any update here?

jmakowski1123 commented 1 year ago

@mphilbrick211 Let's move this PR to draft mode until we can find an owner to drive product specs for a platform-wide solution to rounding issues in grading.

Agrendalath commented 8 months ago

@jmakowski1123, @mphilbrick211, is there something we can do to help move this forward?

mphilbrick211 commented 8 months ago

@Agrendalath thanks for the ping! I'm checking internally with Jenna on your questions.

jmakowski1123 commented 8 months ago

Hi @Agrendalath . Since the implementation of this would need to take into consideration a broader platform perspective, this will need to go through the Product Review process, documented here. Per Daniel's comment (which I agree with), does your team have the bandwidth to take on defining a spec and implementing with the wider scope in mind?

Agrendalath commented 8 months ago

@jmakowski1123, according to this comment, the value calculated in the edx-platform was meant to be the single source of truth regarding these calculations. We can verify this for the whole learner experience and gradebook, but we don't use the credentials and data backends (I thought the Analytics were already deprecated anyway?). Would this approach work here?

jmakowski1123 commented 8 months ago

That sounds right. Axim can pick up the Credentials analysis.

hurtstotouchfire commented 7 months ago

@jmakowski1123 FYI the Credentials IDA should be a dumb consumer of Grades. If I had my way it would be a string and not even a number. We only need it to display the grade on the certificate and we are in no way an authority on grades. There are many Credentials bugs that have come from trying to be too smart about grades. The credentials app within edx-platform likely has more squishy boundaries, but again it is not a good thing. Just wanted to caution about any approaches that involve pushing grade precision knowledge further into the credentialing apps. I believe Ed and Glib and other credentials enthusiasts would agree with this but we can discuss if needed.

Agrendalath commented 5 months ago

@jmakowski1123, we have analyzed the impact of this change on the learner experience and opened https://github.com/openedx/frontend-app-learning/pull/1397 to match the calculated weighted grades of individual assignment types (on the Progress page) with the scores calculated by the LMS.

Before
After

The original PR (https://github.com/openedx/edx-platform/pull/27788) did not require any further changes. The only learner-facing change is that the grades will be displayed with increased precision on the legacy Progress page when a learner hovers over the subsections on the progress graph.

Before
After

Regarding the Gradebook, after this change (in the edx-platform), the subsection scores will have increased precision. For example, instead of 67%, it will be 66.67%. I'm not sure if this is intended, but this line indicates that these scores should indeed be rounded to two decimals. We can change it to hide the decimal points or keep it as-is to make the scores more verbose for the Instructors. What do you think?

Before
After
Agrendalath commented 5 months ago

@jmakowski1123, this is ready for the product review.

cc: @mphilbrick211, @itsjeyd

itsjeyd commented 4 months ago

Hi @jmakowski1123, do we know who will be leading the product review process as coordinator here?

itsjeyd commented 4 months ago

CC @ali-hugo ⬆️

ali-hugo commented 3 months ago

@itsjeyd I don't think this update would require more than one reviewer (which means we wouldn't really need a coordinator). However, I would like to check this in the next Core Product meeting on 30 July.

I'd be happy to be a product reviewer for this. @Agrendalath Is there a sandbox or similar where I can see the change in action?

itsjeyd commented 3 months ago

Sounds good, thanks @ali-hugo!

And just to understand better: I see you already marked this ticket as done. Since the July 30 meeting hasn't happened yet, what does that mean for the status of the product review process?

CC @Agrendalath

ali-hugo commented 3 months ago

I see you already marked this ticket as done.

@itsjeyd Sorry, I have no idea how I did that - it was a mistake! I have reopened the ticket.

Agrendalath commented 3 months ago

@ali-hugo,

Is there a sandbox or similar where I can see the change in action?

There used to be a sandbox for this PR, but it doesn't seem to be working anymore. @gabor-boros, do you know if we have a simple way to prepare one with https://github.com/openedx/edx-platform/pull/27788 and https://github.com/openedx/frontend-app-learning/pull/1397?

In any case, I described all impacted platform areas in this comment.

ali-hugo commented 3 months ago

@Agrendalath @gabor-boros

In any case, I described all impacted platform areas in https://github.com/openedx/platform-roadmap/issues/238#issuecomment-2132456349.

In that case, don't worry about the sandbox. I should have everything I need in that comment. Thanks!

ali-hugo commented 3 months ago

@jmakowski1123 I've reviewed the changes outlined in this comment and everything looks good to me from a product perspective. The changes have been implemented with a broader platform perspective in mind.

However, there is a question that I can't answer, namely:

Regarding the Gradebook, after this change (in the edx-platform), the subsection scores will have increased precision. For example, instead of 67%, it will be 66.67%. I'm not sure if this is intended, but this line indicates that these scores should indeed be rounded to two decimals. We can change it to hide the decimal points or keep it as-is to make the scores more verbose for the Instructors. What do you think?

Should the two decimal points be shown on the Gradebook, or should they be hidden?

P.s. I just noticed that this ticket's status has been set to "abandoned". Any idea why that would be?

ali-hugo commented 3 months ago

@itsjeyd A quick update on this:

@itsjeyd I don't think this update would require more than one reviewer (which means we wouldn't really need a coordinator). However, I would like to check this in the next Core Product meeting on 30 July.

Unfortunately, there was no time to ask about this in the Core Product Meeting yesterday as there were two presentations scheduled. That said, I'm pretty sure that my review, along with @jmakowski1123's answers to my questions here, will be sufficient.

I know this isn't really your "domain", but do you have any idea why this ticket's status may have been set to "abandoned"?

itsjeyd commented 3 months ago

@ali-hugo Sorry for the delayed reply. Scrolling up on this ticket, I found this:

sometimes-a-bot-is-to-blame-and-not-the-gardener

So it looks like there is some automation in place that updates statuses on both the Contributions board and the Open edX Roadmap when someone closes an issue. Which makes sense, but unfortunately the automation doesn't seem to make any additional updates when the issue gets reopened:

and-its-lazy-too

I went ahead and reverted the status manually on the Contributions board. Couldn't do the same for the roadmap as I don't seem to have permission to change ticket statuses there.

CC @jmakowski1123

--

Unfortunately, there was no time to ask about this in the Core Product Meeting yesterday as there were two presentations scheduled. That said, I'm pretty sure that my review, along with @jmakowski1123's answers to my questions https://github.com/openedx/platform-roadmap/issues/238#issuecomment-2257781320, will be sufficient.

Sounds good, thanks for the update!

jmakowski1123 commented 3 months ago

Thanks for the before and after data and for driving the check on this. What was the rationale for including the more precise grade for learners when they hover over the progress bar? I'm questioning whether that should be presented to learners, especially if the additional decimals are hidden from course teams in the gradebook.

Agrendalath commented 3 months ago

@jmakowski1123,

What was the rationale for including the more precise grade for learners when they hover over the progress bar?

As you can see in the PR, we didn't explicitly change anything on the Progress page. Both the Progress and Gradebook pages were designed to show more precise grades:

Regarding the Gradebook, after this change (in the edx-platform), the subsection scores will have increased precision. For example, instead of 67%, it will be 66.67%. I'm not sure if this is intended, but this line indicates that these scores should indeed be rounded to two decimals.

We only increased the precision of the values passed to these pages.

Agrendalath commented 2 months ago

@jmakowski1123, is there anything we should discuss before moving this forward?

cc: @ali-hugo, @itsjeyd

jmakowski1123 commented 1 month ago

Got it. I think this is ok to move forward, thanks