Development idea(s) related to common code for inputs.

aharjula commented 1 year ago

We have heard of the need to have repeated code in PRTs to evaluate the same details about the same inputs, and it is a real issue even though the cases where it really matters are somewhat extreme. Basically, issues #763 and #853 are related to this.

Traditionally STACKs inputs have not had much in the way of CAS-evaluated parameters beyond the "model answer"/"teachers answer" field, but in the future, this may change when we eventually get that input2 system that I have mentioned so many times in the past years. But how much will things change, and what sorts of things should change? This issue aims to collect some of the possible ideas and related limitations in one place for discussion.

First of all, here are some fundamental limitations:

It is difficult for any logic related to an input to reference any other inputs for two reasons.
- Firstly, because inputs are processed in the order of their definition in the question model, whatever processing would be a part of some inputs logic could only see things related to the inputs defined before it.
- Secondly, an input might not have a value, or it might be an invalid value, so even if it were present above in the list, it might not be usable in logic that would need it.
There are essentially two phases where we could have additional CAS-evaluated logic.
- The first is CAS-based validation of the input, which happens after the surface-level security and syntax validation has been done on the PHP side. Currently, this focuses on type checking and other predefined logic that explores the shape and type of the answer. In this phase, we are primarily limited by the order of those inputs and the results of the validity checks of the previous inputs and may not reference them freely. Thus, building validity in relation to other inputs is hard on the CAS side.
- The second would be the phase where all the validated values would be pushed into the next CAS session where the PRTs are being executed. Currently, we have nothing that would act at this phase. In this phase, we could see all the raw values of all the inputs and could execute logic that would build something from them, but even with such logic, we would need to deal with the possibility of invalid or missing inputs.

Given that, I still suggest the following additions to the input model. Not the current but the future model:

Validation logic this would be a keyval field that would contain a singular block() statement. This would be evaluated if all the other validity tests of the input pass and could then trigger a validation failure with a custom validation error message. This block would be evaluated in a context where the question variables are present so it could be parametric. The context would also contain this input's and only input's value and its string-version in variables with hard-coded common names, e.g. input_value and input_string. The reason why we would not want to use the inputs name as the name of those variables is that we want to be able to copy-paste or include the logic between inputs, and tuning the names could be a source of a large number of errors and annoyance. The block would evaluate to the boolean value true if the received values are valid according to its tests, and if the values would not be valid, it would evaluate to an castext() expression that would be used as the error message displayed to the student, which would allow injection of the input values to the error message.

Postprocessing logic (we'll call these post-processing variables) this would also be a keyval field of the input. However, in this case, it could contain arbitrary logic that would be evaluated after the valid input values have been injected into the context and before the PRTs are being evaluated. As such, it could reference any other inputs regardless of their order and could construct anything one wants as long as one ensures that the referenced inputs are present by checking whether they have been bound or through other means. Naturally, this bit of logic will only get evaluated for those inputs that are valid themselves, so as long as it only references the input that the logic has been attached to, all will simply work. Note that the ability to reference other inputs means that any variables defined here cannot be used to identify which inputs a PRT requires. Note that if we only execute these for valid inputs, then the users will always need to either bind the PRT to the input or check for unbound variables. We might also include a check for these to restrict the variables in use so that the same names cannot be used in multiple inputs postprocessing, but this might stop certain common conditions and counters from being built.

Both of these would mean that we would get new forbidden keys in our questions, and they would need to be tracked, but that is something we can do.

timhunt commented 1 year ago

We already have feedback variables. Surely that covers your 'Postprocessing logic' at least?

aharjula commented 1 year ago

Tim, while feedback variables do indeed do that, the problem mentioned in the referenced issues arises from the situation of having many PRTs referencing the same inputs and doing the same post-processing in every PRT. If you build a follow-through grading style situation with multiple inputs and every PRT has to separately figure out if the first input fulfils the same conditions, that can lead to quite a lot of repetition. And I trust that you like many programmers do not like repetition and might want to change the model so that repetition could be avoided.

Of course, we could say that it is wrong to have that many PRTs or inputs and all repetition that might follow from such a style of authoring is entirely the author's problem, and no need features are needed as the current ones work. But I do have to wonder if it might make more sense to consider the role of feedback variables as something directly connected to that particular PRT as opposed to these new "post-processing variables" being something directly tied to that particular input. Having said that, I might consider these "post-processing variables" as a way of documenting the input, both in the form of logic (which often is the best form of documentation) that does common interpretation of the input by constructing separate variables (describing style, correctness or other details) from it, and as a field that could receive actual /* comments */.

timhunt commented 1 year ago

Sorry, I was forgetting feedback variables were per-PRT. Yes, an extra overall set of feedabck variables that are evaluated before the per-PRT ones seems like a good addidion.

aharjula commented 1 year ago

For that validation logic, one example could be testing the length of a list (this came up in #856). Basically, if you ask for a list to be inputted and use the same-type validation rule, you will get a list, but if you would like that list to be of a certain length, there is no means for ensuring that during validation. With custom validation logic, you could combine the existing same-type rule to ensure that you get a list for your custom validation and then react to that list's length in any way you wish.

Now someone could ask for a standard test for the length but do we really need to have such, and how many standard options would such a test have? After all, this is a rare need, and when one needs this, one could also want to customise the validation messages. So having to write something like this would probably be simpler than working with "standard"-options and messages:

block([l],
  l: length(input_value),
  /* Should we also test for uniqueness here? */
  if l < 3 then 
    castext("Your list of numbers has too few numbers, only {#l#}. Try to find at least three that fit the conditions.") 
  else 
    true
);

aharjula commented 1 year ago

So if the validator portion is now coming in the form of an extra-option that declares a name of a single argument function that gets called with simp:false and the input value as its argument, we might need some tools for those that use common validators and would like to combine multiple of those. Basically, it would be nice to be able to compose validators like this:

/* Obviously this is not a sane example. */
my_list_length_a(ex) := if length(ex) > 3 then "Your input is too long." else true;
my_list_length_b(ex) := if length(ex) < 3 then "Your input is too short." else true;
my_integer_list(ex):= if all_listp(integerp, ex) then true else "Only include integers in your list.";

combo_validator(ex):= stack_validator_combinator(ex, [my_list_length_a,my_list_length_b,my_integer_list]);

So we need something like this:

/**
 * A convenience function for combining validators, to be used with the input validator system.
 * Executes all functions received and produces a combined output.
 *
 * @param[expression] the input value to be validated.
 * @param[list of identifers] the names of the functions to be combined.
 * @return[string or CASText] the result of those validations.
 */
stack_validator_combinator(ex, validators):=block([%_tmp, %_val],
 %_tmp:[],
 for %_val in validators do (
  %_tmp: append(%_tmp,[%_val(ex)])
 ),
 /* Remove all valid results.*/
 %_tmp: delete("", delete(true, %_tmp)),
 if %_tmp = [] then return(""),
 /* Then concatenate CASText2 segments. Add spaces between multiple failures. */
 /* `rest` currently requires that simp, for negative arguments. */
 %_tmp: ev(lreduce(castext_concat, rest(join(%_tmp, makelist(" ",length(%_tmp))),-1)),simp)
);

Obviously, the name of that function should be something saner.

sangwinc commented 1 year ago

That's a really good idea, especially as it might lead to better re-use and testing of validators.

aharjula commented 1 year ago

Also, Chris, when I said that core validators using [[lang]]-blocks might be a problem due to the sets of defined languages being mismatched. I completely forgot that for whatever is in our core, we should not use the [[lang]]-block at all. We would use localised string mapping to lang/en/qtype_stack.php (sorry, this said lang.php, which was a mistake.) using the [[commonstring]]-block and thus leave all the localisation for the surrounding systems language-packs with all the fallthrough logic those provide.

sangwinc commented 1 year ago

Ok Matti, I'm struggling to follow this - can you give me an example?!

aharjula commented 1 year ago

So for example, we have this here:

https://github.com/maths/moodle-qtype_stack/blob/390aa65db79c7a73995d14063ccc59b70f638e99/lang/en/qtype_stack.php#L790

We could have a validator like this (this has not been tested even for syntax):

spurious_val(ex):=block([%_tmp,simp],
 simp:false,
 %_tmp: setify(lisofvars(ex)),
 simp:true,
 %_tmp: setdifference(%_tmp, {x,y,z}),
 if cardinality(%tmp) = 0 then "",
 castext("[[commonstring key='ValidateVarsSpurious' m0='stack_disp_comma_separate_math_castext(listify(%_tmp))'/]]")
);

There stack_disp_comma_separate_math_castext is the one from #917, and basically, we are just plugging in a comma-separated list of {@...@} injections of variable names into whatever language string has that key ValidateVarsSpurious.

We really do not want to write a massive list of localisation cases in our own code. Instead, we leave those for language strings to handle so that users can simply modify the values through the language packs without having to touch the code.

sangwinc commented 1 year ago

Argh! Yes, thanks Matti that's super helpful.

sangwinc commented 1 year ago

Matti, I'm stuck with two issues!

Error trapping. We'd like to return some validity text to students. E.g. if the 2nd test throws an error we'd still like students to get the first message "your answer should be list", not just the general validator errror. I think the WIP commit above does this now.
I can't get the castext language string to work in this WIP commit.

Can you advise?

aharjula commented 1 year ago

Well, you cannot write castext() into our current Maxima side code like that. This is one of the reasons I want to "compile" our Maxima side code. If you want to go for this at this phase, you must use the cli/castextcompiler.php to turn those castext(...) portions into actual code and place that code into those functions. In the future, one could actually write those functions in the same style as the question author can, as for question side code we apply compiler that turns those "macro" functions to code.

For example where you have this:
 castext("[[commonstring key='studentValidation_invalidAnswer' /]]")

You would extract that string and push it into the compiler:
  php cli/castextcompiler.php --string="[[commonstring key='studentValidation_invalidAnswer' /]]"

And it would return the matching expression:

[
  "%root", 
  [
    "%cs", 
    "studentValidation_invalidAnswer"
  ]
]

Which you can then condense down to:
["%root",["%cs","studentValidation_invalidAnswer"]]
And that would replace the whole castext() call as that is what it is supposed to return.

If injections are in play then the expression will be more complex.

Because no one is going to do by hand we really want that new parser thing to start processing our code so that we can keep that readable castext-syntax for our documented code and swap in the slightly more listy version for the executed code.

Also if you are going to mention stack_disp_comma_separate_math_castext in the documentation maybe you should include that in our codebase, you should be able to find it from issue #917. The problem is that it also has that castext() call that needs to be translated to be functional.

sangwinc commented 1 year ago

Thanks Matti, that's super helpful. I'm making some progress here.

I'd forgotten stack_disp_comma_separate_math_castext was proposed in issue #917.

sangwinc commented 1 year ago

Is it possible for users to define validators, using the language strings which have arguments? If so, can you provide an example? The static strings are fine now (see latest commit), but I can't work out how to add arguments either in question variables (with castext()) or in the core code...

aharjula commented 1 year ago

Only using language strings present in STACKs language pack strings...

So this from above works by injecting with {@...@} injection (without wrapping like \(...\)) to a named placeholder m0.

spurious_val(ex):=block([%_tmp,simp],
 simp:false,
 %_tmp: listofvars(ex),
 simp:true,
 %_tmp: setdifference(setify(%_tmp), {x,y,z}),
 if cardinality(%_tmp) = 0 then "",
 castext("[[commonstring key='ValidateVarsSpurious' m0='listify(%_tmp)'/]]")
);

One could also reference that m0like this raw_m0='whatever', and then it would do {#...#} value injections. Note that the language string must use named placeholders, won't work for a singular unnamed one. Also, raw_ is not the only prefix known to this block you can also affect simplification with certain prefixes.

sangwinc commented 1 year ago

Ok, I've worked out what I was doing wrong! I was assuming that in m0='X' the X should be a string but in fact this needs to be a Maxima object. No wonder I kept getting errors..... Thank you!

maths / moodle-qtype_stack

Development idea(s) related to common code for inputs. #870