Overgreedy unit-detection?

mkraska commented 2 years ago

Hello,

If units have been declared, they are written upright in math display. Yet this should not apply to variable names with indices. These can't be units, all the more so because the symbols are escaped by a prime ('). The issue persists in STACK version 4.4.

{@d_L=2*r1*LE@}, {@d_W=2*r2*LE@}, {@d_Z=2*r3*LE@}, <br>
\(\ell_1=\){@l1*LE@}, \(\ell_2=\){@l2*LE@},<br>
{@'F_a=stackunits(F_a,N)@}, {@'F_r=stackunits(F_r,N)@}, {@'mu=mu@}.</p>

quiz-T-B1-MB-TM1-TM1 12 03 Lagerzapfen-20220527-1750.zip

For now I handle this by writing the names outside the {@@} markup.

aharjula commented 2 years ago

The problem is that if you use units at all, then all the SI units get texput() rules applied to them, and they are laid out as units no matter what they are supposed to be. You might try overriding the rule for farads if you are adventurous enough. Of course, you can use this to tune the presentation in any way you wish.

It could be that texput is again more powerful due to some changes in the behaviour of STACK in later 4.3 releases. It might even apply to validation messages, or maybe not. Maybe we could make it apply?

So drop this into your question variables and see what happens. Of course, there is no need to do calligraphy:

texput(F, "{\\mathcal{F}}");

mkraska commented 2 years ago

I guess this is a bug in texput(). It should not apply to A_x if it is defined for A.

Here is the result of the experiment:

Question variables:

stack_unit_si_declare(true);
texput(A, "{\\mathcal{A}}");

Question text:

{@[A, A_x, Ax]@}

sangwinc commented 2 years ago

This is not a bug! It took me hours of very careful coding to make sure that STACK (but not core Maxima) takes an atom A_B, applies the TeX command to A and B separately, and concatenates the results using subscripts. A lot of people asked me to change the behaviour of TeX on expressions like V_alpha from {\it V\_alpha} to V_{\alpha}. This is certainly "a STACK thing" and not a bug in texput().

Test cases are here: https://github.com/maths/moodle-qtype_stack/blob/master/tests/fixtures/subscriptsfixtures.class.php

The units code also uses texput() to define the output of units. https://github.com/maths/moodle-qtype_stack/blob/master/stack/maxima/stackunits.mac#L72

Does Matti's suggestion of defining a new texput() work? If so, I'll document this. I know question authors will want to use letter names of units not relevant to a particular question as variable names.

Note your last example Ax is because this is a single atom (i.e. the name Ax).
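
As a rough illustration of the behaviour described above, here is a toy model in Python. It is not STACK's Maxima code: the function name tex_atom and the dictionary standing in for the texput() rules are assumptions made for this sketch, and the exact bracing STACK emits may differ.

```python
def tex_atom(name, rules):
    """Toy model: split an atom name at underscores, apply the
    per-part TeX rule to each piece, and join the pieces as subscripts."""
    parts = name.split("_")
    # Each part is looked up separately; parts without a rule are kept as-is.
    rendered = [rules.get(part, part) for part in parts]
    out = rendered[0]
    for sub in rendered[1:]:
        out = "{" + out + "}_{" + sub + "}"
    return out


# Stand-in for texput(A, "{\\mathcal{A}}") from the experiment above.
rules = {"A": r"{\mathcal{A}}"}

for name in ["A", "A_x", "Ax"]:
    print(name, "->", tex_atom(name, rules))
```

In this model the rule for A also changes A_x, because A_x is split into A and x before any lookup, while Ax is a single name with no rule of its own and is left alone.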

stebac commented 2 years ago

Actually it would be great if the display of letters which can be units or variables depended on how or where you use them. It is quite possible that within the same question there are two inputs with model answers m/V and 1.0*kg/m^3, where m is a variable name in one input and a unit in the other. The same situation arises for W (work and watt), s (distance and second) ... For the correct display of multiple student inputs the texput() suggestion does not work.

Could letters only be handled and displayed as units if they are explicitly declared as such? Either by using stackunits() in the question variables or by the relevant input type?

sangwinc commented 2 years ago

Ok, yes I agree we need finer control over the display. Thanks for raising this issue. I'm testing a fix.

mkraska commented 2 years ago

OK, until now I assumed that A, Ax and A_x were all Maxima atoms. As I understand it now, within STACK the underscore is treated as an operator, at least for display purposes. I must have missed that in the docs.

Then, there is no chance to get proper display of a variable with name F_x and a unit with name F in {@@} context within the same question.

Either I don't use texput, in which case both are displayed in roman because F is a unit (as in the topic starter post),
or both are displayed in italic if I use texput(F, "{\\it F}") as in the comment above.
I tried texput(F_x, "{\\it F_x}") but that didn't work.

The workaround is to put occurrences of F_x outside {@@}, e.g. \(F_a=\){@stackunits(F_a,N)@}

aharjula commented 2 years ago

I am guessing that the problem is that those "hours of very careful coding" that Chris did affect things during the output phase and override whatever texput-rules might apply to those things before that phase, i.e. they break the rules for expressions with underscores. If that logic could check if there is a special rule in play and avoid doing its thing in that situation then things could work better.

But that would still mean that the rule for a given identifier would simply be a function of that identifier. If we wanted to have fine-grained detection and tuning of identifiers and their roles we would need to be able to look at the question and inputs as a whole and decide the rules at a higher level (e.g. during question compilation). As texput works in the way it does this would still mean that the same identifier could not have multiple different presentation styles, well we could maybe override identifiers inside wrappers like stackunits, but out in the wild outside those wrappers, they would always have the same presentation rule.

I would propose that instead of trying to handle the rendering of underscores with stack_disp_sub_script that overrides things during output we would instead just use texput for everything, and if we want a particular identifier to have subscript handling we would generate the relevant texput-rule at the PHP side where we know the identifiers and their uses and can control their behaviour in a more complex way. Eventually, this could tie into various other features that deal with the presentation, e.g., the annotations that I have been mentioning every now and then.
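
A minimal sketch of what "generate the relevant texput-rule" could look like, written in Python as a stand-in for the PHP side. Everything here is hypothetical: the helper name texput_rule, the small GREEK set, and the chosen TeX layout are assumptions for illustration, not existing STACK code. It only builds the text of a Maxima command for one identifier and does not decide which identifiers are units.

```python
# Hypothetical list of subscript names that should be typeset as Greek letters.
GREEK = {"alpha", "beta", "mu", "omega"}


def texput_rule(identifier):
    """Return the text of a texput() command giving `identifier` explicit
    subscript handling, or None if the name has no underscore."""
    base, sep, sub = identifier.partition("_")
    if not sep:
        return None
    sub_tex = "\\" + sub if sub in GREEK else sub
    tex = "{" + base + "}_{" + sub_tex + "}"
    # Backslashes are doubled, as in the texput() examples in this thread.
    maxima_string = tex.replace("\\", "\\\\")
    return 'texput({}, "{}");'.format(identifier, maxima_string)


for name in ["F_x", "V_alpha", "N"]:
    print(name, "->", texput_rule(name))
```

The point of the design is that the rule is attached to the whole identifier, so whether F on its own is a unit no longer influences how F_x is shown. This only helps if the display code actually consults a rule defined for the complete name.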

sangwinc commented 2 years ago

The support for texput is relatively recent. Can you just please confirm which version of STACK you are using?

aharjula commented 2 years ago

Chris,

The problem is that no matter how one creates a texput-rule for something like F_x it won't matter as this here splits the identifier and handles the parts separately and the actual F_x rule cannot be matched after that split:

https://github.com/maths/moodle-qtype_stack/blob/ec4580a6b84ec681c04cedbd9b763e926f1ead05/stack/maxima/stackmaxima.mac#L706-L708

Basically, that forces a particular underscore behaviour, and there is no way of avoiding or overriding it.

Texput has always been there and it has always handled the typical identifier presentation issues, the whole unit system relies on it after all. But it only applies to identifiers that match, and if something breaks those up it cannot work.

sangwinc commented 2 years ago

I've discussed this issue in the "Atoms and subscripts" section here: https://github.com/maths/moodle-qtype_stack/blob/iss803/doc/en/CAS/Maxima.md

(Sorry the TeX doesn't display well on github....)

aharjula commented 2 years ago

Indeed. The problem is that people knowing how texput works will be surprised by that, to them it is not natural to assume that:

texput(A, "{\\mathcal A}");
texput(B, "\\diamond");

would in any way affect the display of A_B, but that is what we have, and it even makes sense. It becomes problematic when you want different styling for a given subscript attached to a particular symbol, because you now must use trickery: defining a rule for a specific combination like A_B just does not work. One needs to do something else to get that, and that something else might not be easy, especially for input validation (in input validation you may not easily replace the identifiers and override the rules by overriding the identifiers).

Overall it would be much simpler if there would not be any parsing or splitting happening and the display logic would only act on the values as they are and the rules defined for them in advance.

sangwinc commented 2 years ago

I think the current code does make sense Matti, and the examples A_B (should be a subscript but is not in Maxima) and V_alpha (Typeset Greek alpha) are the compelling cases supporting the current splitting. The new check (viz https://github.com/maths/moodle-qtype_stack/commit/fd5be249a318905f0cdee5e35cca49a6a00e0b28#diff-95c8c40908d5a130bfb8269090527521d431eff68db90703c49ee2658256f371R735 ) was really needed, and could rightly be considered a bug. (Thanks for reporting it.)

It would be "much simpler if there would not be any parsing or splitting happening and the display logic would only act on the values as they are and the rules defined for them in advance" until you want to change the default display, e.g. with units! For example you might really want \mathcal{F} with subscripts attached.

sangwinc commented 2 years ago

Can anyone think of a compelling case for having an (advanced) option which turns off the current splitting over _ in the display code? We can add such an option easily enough, right here:

https://github.com/maths/moodle-qtype_stack/commit/fd5be249a318905f0cdee5e35cca49a6a00e0b28#diff-95c8c40908d5a130bfb8269090527521d431eff68db90703c49ee2658256f371R735

if stringp(get_texword(ex)) or dont_split_over_underscores then return(ex)

Why would we add such an option?
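
To make the effect of that check concrete, here is a toy model in Python rather than the actual Maxima code. The dictionary stands in for texput() rules, tex_atom is a name invented for this sketch, and the dont_split flag mirrors the proposed (not existing) option.

```python
def tex_atom(name, rules, dont_split=False):
    """Toy model of the guarded display logic: a rule for the whole atom
    wins, and the hypothetical flag switches splitting off entirely."""
    if name in rules or dont_split:
        # Raw names are returned unescaped here; real TeX output would
        # need the underscore handled somehow.
        return rules.get(name, name)
    base, sep, sub = name.partition("_")
    if not sep:
        return name
    return "{" + tex_atom(base, rules) + "}_{" + tex_atom(sub, rules) + "}"


# F is a unit (upright); F_x has its own rule, as tried earlier in the thread.
rules = {"F": r"\mathrm{F}", "F_x": r"{\it F_x}"}

for name in ["F", "F_x", "F_a"]:
    print(name, "->", tex_atom(name, rules))
print("F_a (no split) ->", tex_atom("F_a", rules, dont_split=True))
```

In this model F_x now keeps its own rule, but F_a, which has no rule, still picks up the upright unit F for its base unless splitting is turned off altogether.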

mkraska commented 2 years ago

I am not sure what the side effects of such an option would be. Would it still keep A_x and F_omega alive?

In an ideal world, I'd like to have the system respecting that anything with a subscript can't be a unit and must not be modified by setting stack_unit_si_declare.

This should not require an advanced option.

Also, respecting existing texputs for suppressing the subscript split would not help with the validation of arbitrary user input. My topic starter question was for question text formatting but the issue is also with validation.

I just wonder if multicharacter base atoms and subscripts should be written in roman by default? But that would be a different issue.

texput(F_res, "F_{\\mathrm{res}}");

sangwinc commented 2 years ago

Thanks for explaining all of this. To display units in Roman we have used Maxima's texput which is global in effect. What you are asking for would require a substantial re-engineering of the code. I won't have time to undertake this in the summer of 2022 because the changes needed for support of Moodle 4.0 are more substantial. Support for Moodle 4 affects everyone, and so has to take priority. I'm going to merge these changes back into the "dev" branch as I think they represent a substantial improvement, and putting them in the next release is helpful. But I'll keep this issue open.

mkraska commented 6 months ago

I again stumbled over the issue. My memory of it had faded away :(

This happens when something used as subscript in a name is a variable with a value.

sangwinc commented 6 months ago

Yes, this was by design. It should be possible to only use question variables with names which the students don't type in. I'm sorry, but I don't think we are going to backtrack on this issue.

mkraska commented 6 months ago

I don't see how I can safely avoid the problem. No problem if that doesn't get fixed, I am just reporting the issue.

The students can input names like Fbd. So even restricting myself to multicharacter question variables would not be safe.
In any case, the error message is misleading. I can't expect the student to know that their input is considered as an operation with the operands F and h. So the student will complain that the message is in no reasonable way related to their input.

maths / moodle-qtype_stack

Overgreedy unit-detection? #803