Closed krostas1983 closed 4 years ago
Thanks for suggesting this. I had another request to manage the cache recently, and as STACK gets larger we need to consider this kind of thing. For now, as a temporary measure, you need to clear the cache by hand.
It isn't possible, or at least would not be easy, to "Turn off caching for questions with more than X variations." The cache layer sits very close to the CAS connection, and at that point we have lost all information about where the request came from. E.g. we might not know whether this is for question variables, for creating CAS text, for a student validation attempt or for an answer test. We could provide this information, but it won't be easy to do.
"Automatically clear unused cached items after Y days" is a good idea.
"Pre-generate and cache all variations of this question." This should already happen if you "run all question tests", so there is no need to change the code to do this.
Thank you for your reply. I can see how the cache being close to the CAS connection prevents such an implementation. If that's the way it is, so be it; even more so thank you for your thorough explanation.
I'm glad that one of my suggestions seems to be well received. Looking forward to seeing it implemented (eventually).
As far as pre-generation goes, I'm having trouble pre-populating the cache with the method you mentioned. Assuming you're referring to the script /stack/bulktest.php, we encounter the problem of "0 passes and 0 failures". I'm assuming this is because no tests have been configured for these questions. No cache population is taking place.
When doing this on a per-question basis via /stack/questiontestrun.php, only one variant is ever deployed, because we quickly get an error message saying "Too many repeated existing question notes were generated."
As far as I can tell, both are problems under the teacher's control, while I am stuck with an administrative problem (performance). Furthermore, I don't know whether it is intended behaviour that repeating question notes prevent pre-emptive cache population.
Yes, "0 passes and 0 failures" means that no tests were run, because none are defined.
Please at least check the correct answer (Teacher's answer) is being marked as correct by STACK. That takes all of 30 seconds to add as a test case, and in my experience it traps a lot of bad random versions....
"Too many repeated existing question notes were generated."
That sounds odd. Are you sure there are valid question notes in those questions? Basically, the question note must output the values of the random parameters, otherwise it does not work. That message should otherwise only appear when you have a large number of potential variants and you have already collected most of them.
In any case, the whole caching thing only really works with deployed variants combined with tests, as regeneration using the bulk tester only works if you have tests. If one has no deployed variants, then the number of different seeds to cache is so large that no one will get a cached variant when initiating a quiz, only when continuing one.
There are very few things that could be done for the general case:
In the context I support STACK use we do not typically use deployed seeds so this is an issue that matters to us too. Although it is not an issue that is causing us trouble currently.
I would like to add two points for consideration in this discussion that could be helpful, which I will present in two replies for ease of response.
I have been authoring statistics questions that generate versions from random variables involving, at times, between 50 and 100 randomly generated floating-point numbers, which leads to an incredibly large number of variants (example of such a question here). I understand that writing question notes carefully would mean that variants are not generated for each data set, reducing the number of floating-point values considered. However, once we start using random floating-point values for the mean or other defining parameters, the number of variants will still be very large, even with a carefully written question note. This could potentially have an effect on the caching of variants, but I don't know enough about how the system works to evaluate it fully.
The second point is that we are using question notes and deployment of variants as a checking tool for some questions (with between 20-100 variants) and it can be rather time consuming to generate all possible variants in these cases. In fact, I often find myself getting the "Too many repeated existing question notes were generated." message repeatedly until enough (normally not all) variants are generated. In this situation it would be highly beneficial to have a way to generate all possible variants, as suggested in Krostas' point 3.
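To get a feel for why that message shows up so often with 20-100 variants, here is a rough sketch of the expected number of random generation attempts needed to see every variant at least once. It assumes, purely for illustration, that each attempt lands uniformly at random on one of the distinct question notes (the classic coupon collector setting); real seeds need not map to notes uniformly, and the function names are made up for this example.

```python
from fractions import Fraction


def expected_draws_to_collect_all(n_variants: int) -> float:
    """Expected number of uniform random draws until all n_variants were seen.

    Coupon collector result: n * (1 + 1/2 + ... + 1/n).
    """
    harmonic = sum(Fraction(1, k) for k in range(1, n_variants + 1))
    return float(n_variants * harmonic)


def expected_draws_remaining(n_variants: int, n_found: int) -> float:
    """Expected further draws needed when n_found variants are already known.

    With i variants known, a draw is new with probability (n - i) / n,
    so it takes n / (n - i) draws on average to find the next one.
    """
    return float(
        sum(Fraction(n_variants, n_variants - i) for i in range(n_found, n_variants))
    )


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, round(expected_draws_to_collect_all(n), 1))
```

Under that assumption, the last missing variant out of n takes n attempts on average, so long runs of repeated notes near the end are expected rather than a sign that something is broken.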
For those considering the generation of all possible variants and suffering from "Too many repeated existing question notes were generated." you need only consider the probabilities in play.
Basically, how likely is it that we pick a new distinct variant when picking N_sample items at random if we have already found N_found distinct variants? One can formulate an estimate for the case where N_sample picks return only known variants, without gaining any new ones, and keep on picking until the estimate reaches a high enough probability for one's liking.
If my probability knowledge is not entirely wrong then something like this should work:
$$P_{\text{no more to find}} = 1 - \left(\frac{N_{\text{found}}}{N_{\text{found}}+1}\right)^{N_{\text{sample}}}$$
That is necessary, as it is practically impossible to form an inverse function that generates seed numbers from known expected parameter combinations, and, as I previously said, it might even be difficult to know how many variants there truly are.
Anyway, using such an estimate, it could be possible to search the seeds in sequence until the estimate says that it is unlikely that any more will be found. Such logic might be useful compared to the current logic, which just tries picking as many as it can as long as there are enough new ones in the set, but stops trying if none are found.
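A minimal sketch of that stopping rule, in Python rather than STACK's own code: it walks seeds in sequence, counts consecutive picks that produce an already known question note, and stops once the estimate above reaches a chosen confidence. The callback `note_for_seed`, the 0.95 default and the `max_seeds` cap are assumptions made for illustration and are not part of STACK.

```python
import math
from typing import Callable, Dict


def confidence_no_more(n_found: int, n_sample: int) -> float:
    """Estimate from the thread: 1 - (n_found / (n_found + 1)) ** n_sample."""
    return 1.0 - (n_found / (n_found + 1)) ** n_sample


def draws_needed(n_found: int, target: float) -> int:
    """Smallest number of fruitless picks that pushes the estimate to target."""
    if n_found < 1 or not 0.0 < target < 1.0:
        raise ValueError("need n_found >= 1 and 0 < target < 1")
    ratio = n_found / (n_found + 1)
    return math.ceil(math.log(1.0 - target) / math.log(ratio))


def collect_variants(
    note_for_seed: Callable[[int], str],
    target: float = 0.95,
    max_seeds: int = 100_000,
) -> Dict[str, int]:
    """Search seeds 1, 2, ... and return {question note: first seed giving it}.

    Stops when the estimate says further new notes are unlikely, or when
    max_seeds is reached.
    """
    found: Dict[str, int] = {}
    misses = 0  # consecutive picks that gave an already known note
    for seed in range(1, max_seeds + 1):
        note = note_for_seed(seed)
        if note in found:
            misses += 1
        else:
            found[note] = seed
            misses = 0
        if confidence_no_more(len(found), misses) >= target:
            break
    return found


if __name__ == "__main__":
    # Hypothetical question with two random parameters and six distinct notes.
    variants = collect_variants(lambda seed: f"a={seed % 3}, b={seed % 2}")
    print(len(variants), "variants found")
    print(draws_needed(20, 0.95), "fruitless picks needed after 20 variants")
```

The estimate only guards against there being exactly one more equally likely variant, so it is a heuristic for when to stop, not a guarantee that the set is complete.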
Thanks everyone for your valuable input. I've now taken a few questions and provided notes for them (simply the random variables, comma separated) - this allowed me to manually implement variants in the test overview of a question.
I'm going to make sure that we brief our teachers to provide sensible question notes and test cases.
Having a number of random variables in a STACK question easily leads to a potential number of variations that basically guarantee each new test attempt will have nothing to retrieve from the cache. This leads to the following problems:
I propose the following two changes to the STACK admin UI: 1) Turn off caching for questions with more than X variations. 2) Automatically clear unused cached items after Y days.
In addition (or alternatively) to this, when administering a STACK question, it should be possible to: 3) Pre-generate and cache all variations of this question.
The last point in particular would greatly improve loading times and thus circumvent a problem caused by Moodle core: timed tests start their countdown when activated, regardless of the loading time needed. Of course, pre-generating all variants would put a heavy load on the CAS, maybe requiring some sort of scheduling or prioritizing. I can also only reasonably imagine this running smoothly on Maxima Pools...
So maybe another one or two admin options in the plugin UI: 3.1) Only allow pre-generation of results during <weekday and time schedule> or <time schedule>. 3.2) Only pre-generate results while fewer than X Maxima requests are pending.
I hope this was not too extensive. I think STACK performance would greatly benefit from implementing this and am eager to discuss any problems you might see with this issue.