maths / moodle-qtype_stack

Stack question type for Moodle
GNU General Public License v3.0

Support for non-standard unicode symbols #860

Closed: georgekinnear closed this issue 1 year ago

georgekinnear commented 1 year ago

I've had a student report the following error message:

[image: screenshot of the error message]

The response they had typed (on their phone, which looked to be using a language other than English) was:

(x+4)^2-12

e.g. here the left bracket is U+FF08 FULLWIDTH LEFT PARENTHESIS https://r12a.github.io/uniview/?charlist=%EF%BC%88#title

Could this be quietly replaced with the standard symbol?

(Perhaps more generally there are other unicode equivalents for various symbols that could be handled in such a way - for reference, I think Numbas has an issue along similar lines https://github.com/numbas/Numbas/issues/690)

christianp commented 1 year ago

Ta, @georgekinnear!

aharjula commented 1 year ago

We have this bit of logic that complains about the wrong chars being used. What we need is another list of things that can be silently fixed, as it is pointless to complain if there is no option for using the correct ones on the keyboard of the mobile device, or if the device starts doing typesetting and switches the inputted chars.

In any case, if we start to replace, we must remember to do it only for syntax outside string values and leave the strings alone, as it is always a possibility that string values would need to include these chars. Also, I would only do this for student input and would leave the author-side code as is to avoid having to deal with any complications that might arise on that side if we start to change things silently.
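To illustrate the "outside string values" constraint, here is a minimal Python sketch (STACK itself is not written in Python). The function name `normalise_outside_strings` and the `LOOKALIKES` table are hypothetical, and the sketch assumes string values are delimited by double quotes with backslash escapes. U+FF09 is not mentioned in this thread and is included only as the assumed counterpart of U+FF08.

```python
# Hypothetical table: a few lookalike characters and their assumed ASCII meaning.
LOOKALIKES = {
    "\uFF08": "(",  # FULLWIDTH LEFT PARENTHESIS
    "\uFF09": ")",  # FULLWIDTH RIGHT PARENTHESIS (assumed counterpart)
    "\u2022": "*",  # BULLET
}


def normalise_outside_strings(expr: str) -> str:
    """Replace lookalike characters, leaving double-quoted strings untouched."""
    out = []
    in_string = False
    escaped = False
    for ch in expr:
        if in_string:
            # Inside a string value: copy verbatim and track escapes.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        else:
            # Outside strings: substitute if the character is in the table.
            out.append(LOOKALIKES.get(ch, ch))
    return "".join(out)


print(normalise_outside_strings("\uFF08x+4)^2-12"))
print(normalise_outside_strings('f("a\u2022b")\u2022 2'))
```

The second call shows the bullet inside the quoted string being kept while the one outside is replaced. Restricting this to student input would be a matter of where the function is called, not of the function itself.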

What would be interesting would be to have a general bit of logic that would identify Unicode chars in interesting places and would then check the names of those chars, e.g. if we happen to see a "SUPERSCRIPT LEFT PARENTHESIS" somewhere, we should probably try to identify it as a LEFT PARENTHESIS, simply by using the name. If we would do this through the names, we would not need to keep track of any new spacing or other combinations that might appear in the future. We could have a list of all the named special chars. We could also watch for ligatures, like "LEFT DOUBLE PARENTHESIS", and try to fix those. Someone should go through the Unicode char maps and figure out all those possible replacements and the names that could be used to identify things.
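A rough Python sketch of the name-based idea, using the standard `unicodedata` module. The `QUALIFIERS` set and the function name `base_char_by_name` are assumptions for illustration, not an agreed list. The sketch only strips a leading qualifier word and only accepts the result if it names an ASCII character, so it will miss many cases (ligatures, for instance).

```python
import unicodedata

# Assumed, non-exhaustive set of leading words that mark a variant of a base char.
QUALIFIERS = {"FULLWIDTH", "HALFWIDTH", "SMALL", "SUPERSCRIPT", "SUBSCRIPT"}


def base_char_by_name(ch: str) -> str:
    """Map e.g. SUPERSCRIPT LEFT PARENTHESIS to LEFT PARENTHESIS via its name.

    Returns ch unchanged when no ASCII base character can be found.
    """
    if ch.isascii():
        return ch
    try:
        name = unicodedata.name(ch)
    except ValueError:
        return ch  # unnamed character
    first, _, rest = name.partition(" ")
    if first not in QUALIFIERS or not rest:
        return ch
    try:
        candidate = unicodedata.lookup(rest)
    except KeyError:
        return ch  # the stripped name is not a character name
    # Only accept plain ASCII targets, to avoid surprising substitutions.
    return candidate if candidate.isascii() else ch


for ch in ["\uFF08", "\u207D", "\uFF0B", "\u2022"]:
    print(unicodedata.name(ch), "->", repr(base_char_by_name(ch)))
```

Characters such as U+2022 BULLET carry no qualifier in their name, so they pass through unchanged here and would still need an explicit table.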

Bjoern-Ge commented 1 year ago

This logic is an interesting point. It is elegant in that it addresses the problem for almost all related chars. Nevertheless, I assume there are typical chars we can focus on.

Maybe we can make one table of chars that are interpreted as, and converted into, another “standard” char. This table can then be applied to all student input. For this, we probably first need to collect the typical problems we’ve noticed while using Stack with students.

A second table could be provided as a teacher-defined Maxima array. This might be helpful for special situations. For example, a space could be interpreted as a plus in this array, so that 1 1/2 would be understood as a mixed fraction. Or 2w(x) could be understood as 2*sqrt(x).

As my students use iPads, the following char often comes up: • (U+2022), which should be * (U+002A).

Bjoern-Ge commented 1 year ago

A list of things that students can find on the iPad keyboard:
— (U+2014) should be - (U+002D)
– (U+2013) should be - (U+002D)
• (U+2022) should be * (U+002A)
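As a sketch of how such a table could be applied, the three replacements listed here fit directly into Python's `str.translate`. The names `IPAD_REPLACEMENTS` and `fix_ipad_chars` are made up for this example, and it deliberately ignores the question of string values raised earlier in the thread.

```python
# The three characters reported above, mapped to the ASCII operator
# the student most likely meant.
IPAD_REPLACEMENTS = str.maketrans({
    "\u2014": "-",  # EM DASH -> HYPHEN-MINUS
    "\u2013": "-",  # EN DASH -> HYPHEN-MINUS
    "\u2022": "*",  # BULLET  -> ASTERISK
})


def fix_ipad_chars(answer: str) -> str:
    """Apply the replacement table to a whole student answer."""
    return answer.translate(IPAD_REPLACEMENTS)


print(fix_ipad_chars("2\u2022x \u2013 3"))
```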

sangwinc commented 1 year ago

This page is relevant to this issue: https://jkorpela.fi/dashes.html

sangwinc commented 1 year ago

Numbas is also having the same issue, so I think it would make sense to share a library/mapping here https://github.com/numbas/Numbas/issues/690

christianp commented 1 year ago

I've started a repository at https://github.com/numbas/unicode-math-normalization, aiming not to be too Numbas-specific. There are some JSON files in the final_data directory that you can use. I'd appreciate your input on it!

You might also want to look at the new test cases in the Numbas JME unit tests.