maths / moodle-qtype_stack

Stack question type for Moodle
GNU General Public License v3.0

Question test score rounding #1049

Closed timlowe closed 10 months ago

timlowe commented 10 months ago

As a result of giving part marks for entries in three 3x3 matrices (numerical methods question), I have a question that can generate scores such as 0.86702127659574. ( ;-o )

When the Question test saves this, it rounds it to 0.867 (understandably).

However, in the Question test the actual score is not rounded, so I have Score: 0.86702127659574, Expected Score: 0.867, which are not considered equal, and hence the test can never pass.

Tim

sangwinc commented 10 months ago

Looking at the code, we have a note saying PRT scores are rounded to 3dp. Perhaps we changed this in the PRTs at some point when we made the major changes in STACK 4, because PRTs now return floats.

In any case, I will fix this by consistently rounding both using the same function.

Sorry about this Tim, and thanks for reporting it!
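To illustrate the mismatch and the proposed fix, here is a minimal sketch in Python (not STACK's actual PHP code). The helper names `round_score` and `scores_match` are hypothetical, and the choice of 3 decimal places follows the note mentioned above.

```python
def round_score(score: float, places: int = 3) -> float:
    """Round a score; used for BOTH the actual and the expected value."""
    return round(score, places)


def scores_match(actual: float, expected: float, places: int = 3) -> bool:
    """Compare two scores after passing each through the same rounding."""
    return round_score(actual, places) == round_score(expected, places)


actual = 0.86702127659574   # raw score reported in the question test
expected = 0.867            # value saved with the test case

# Direct comparison fails because only one side was rounded.
print(actual == expected)

# Rounding both sides with the same function makes them agree.
print(scores_match(actual, expected))
```

The point of the sketch is only that both sides go through one rounding function, so the stored and computed values cannot drift apart through differing precision.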

aharjula commented 10 months ago

Well, the code of the new PRTs still should ensure three digits (and capping), also for variable scores... https://github.com/maths/moodle-qtype_stack/blob/006344ae63cfb5a512952fe2cf2709da7135fc69/stack/prt.class.php#L484-L485

Naturally, if we for whatever reason run the whole CAS with more digits then those floats do get outputted in full form and you get those extra digits. Maybe truncate again here: https://github.com/maths/moodle-qtype_stack/blob/006344ae63cfb5a512952fe2cf2709da7135fc69/stack/prt.evaluatable.class.php#L124-L125

Whatever scaling happens once that PRT result comes out is of course another matter. And what the exact float value in the question tests is again another matter, all comparing of floats is fun as always.

christianp commented 10 months ago

A while ago I changed Numbas to represent scores as fractions, found by continued fraction approximation to the float that comes out of the marking algorithm. This ensures that you don't accumulate rounding error when adding up several scores - we had a problem where a question with seven parts, each worth 1/7 of a mark, didn't add up to exactly 1 when you got them all correct.
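The idea can be sketched in Python with the standard library's `Fraction.limit_denominator`, which returns the closest fraction with a bounded denominator. This is not Numbas code; the function name `score_as_fraction` and the denominator bound of 1000 are assumptions made for the example.

```python
from fractions import Fraction


def score_as_fraction(score: float, max_denominator: int = 1000) -> Fraction:
    """Approximate a float score by a nearby fraction with a small denominator.

    The bound of 1000 is an arbitrary choice for this sketch.
    """
    return Fraction(score).limit_denominator(max_denominator)


# Seven parts, each worth 1/7 of a mark, computed as floats.
parts = [1 / 7] * 7

# Adding the floats accumulates rounding error.
float_total = sum(parts)
print(float_total)

# Converting each part to a fraction first keeps the total exact.
exact_total = sum(score_as_fraction(p) for p in parts)
print(exact_total == 1)
```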

aharjula commented 10 months ago

Well in this case it is simply a representation issue. In the new PRT system you can use arbitrary expressions for the score and for example declare a function that generates the score based on the distance the student's answer is from something. So simply having fractions is not a real option for that.

For the eternal "sum of floats does not match expected 1" issue, the current logic in PRTs is to round the sum of raw arbitrary precision parts at the third digit level and thus get reasonable results, which does seem to work relatively well.
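A rough Python sketch of that "sum first, then round once" logic follows. It is an illustration, not the PRT code itself: the name `prt_total` is hypothetical, and capping the result to the range 0 to 1 is an assumption based on the capping mentioned earlier in this thread.

```python
def prt_total(raw_parts, places: int = 3) -> float:
    """Sum unrounded part scores, round the total once, then cap it.

    Capping to [0, 1] is an assumption made for this sketch.
    """
    total = round(sum(raw_parts), places)
    return min(max(total, 0.0), 1.0)


# Seven parts worth 1/7 each: the rounded total comes out as 1.0.
print(prt_total([1 / 7] * 7))

# Parts that overshoot are capped at 1.0.
print(prt_total([0.5, 0.7]))
```

Rounding only the final sum, rather than each part, avoids compounding the rounding error of the individual parts.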

What we need here is either to change the CAS float representation accuracy back to normal at the point where the PRT result gets serialised (my theory is that the fp-format option has been changed earlier in the question and affects things here), or to do later processing to return the outputted value to the expected form.

timhunt commented 10 months ago

Moodle's approach is to store scores internally to 7 decimal places. I suggest STACK does the same. (Moodle gradebook and overall Quiz scores are stored to 5DP, so that seemed like enough extra precision.) Rounding to 3DP, if we are doing that, is a bug.

There is one other complication here. Historically, there were lots of fractional grades typed into STACK questions like 0.33 or 0.67, meaning 0.3333333 or 0.6666667. So, STACK converts the 'badly rounded' numbers to the full precision. I wonder if Tim's example is close enough to hit that? (If so, it is the first time we have hit that in about 10 years. It has been a good heuristic until now.)
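A heuristic of this kind might look roughly like the Python sketch below. This is not STACK's implementation: the lookup table, the tolerance, and the name `expand_rounded_grade` are all assumptions, and only the two values quoted above are included.

```python
import math

# Hypothetical lookup: typed-in grade -> intended value at 7 decimal places.
SNAP_TABLE = {0.33: 0.3333333, 0.67: 0.6666667}


def expand_rounded_grade(grade: float, tolerance: float = 1e-9) -> float:
    """Replace a 'badly rounded' third with its 7DP value.

    Any other grade is simply rounded to 7 decimal places.
    The tolerance is an arbitrary choice for this sketch.
    """
    for typed, full in SNAP_TABLE.items():
        if math.isclose(grade, typed, abs_tol=tolerance):
            return full
    return round(grade, 7)


print(expand_rounded_grade(0.33))
print(expand_rounded_grade(0.67))
print(expand_rounded_grade(0.86702127659574))
```

With the tight tolerance assumed here, a value such as 0.867 is left alone; whether the real heuristic would catch it depends on the thresholds STACK actually uses, which this sketch does not reproduce.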

sangwinc commented 10 months ago

Thanks everyone, this is all really helpful. Yes, continued fractions is the elegant (and correct) approach. I'm not proposing we change how to calculate or store the PRT results, but only perform uniform rounding within the testing mechanism itself. WIP commit to follow.