numbas / numbas-lti-provider

An LTI tool provider to run Numbas exams
https://docs.numbas.org.uk/lti/en/latest/
Other
11 stars 12 forks source link

New attempts always showing as "Not Attempted" #225

Open oscarsiles opened 2 years ago

oscarsiles commented 2 years ago

On a fully updated instance (on the V3_stable branch), student attempt "completion status" is remaining as "not attempted" even if they submit some answers. I think it may be related to some huey tasks getting "stuck" or failing silently (nothing in the logs that I could find) as trying to generate reports also get stuck "In Progress".

Running on Ubuntu 20.04 LTS using the normal instructions, and restarting the processes with systemctl will allow some of the attempts to become "Complete"/"Incomplete" and any pending reports also tend to complete after this. However, after some time everything becomes "stuck" again. Using Postgres 12 and Redis is on version 5.0.7 in case that helps.

christianp commented 2 years ago

It sounds like the huey process is getting stuck somehow. If this is the suspend data diffing again, I'll scream!

Is the numbas_lti_huey process using a lot of CPU or memory? Can you see any long-running database queries?

oscarsiles commented 2 years ago

So there aren't any processes using lots of CPU/RAM, and in terms of long-running database queries there's 8 essentially identical ones to the following (all "idle"):

pid | user | query_start | query_time | query | state | wait_event_type | wait_event | 80956 | postgres | 2022-03-24 08:46:33.518422+00 | 5 days 01:36:10.396329 | SELECT "numbas_lti_lticonsumer"."id", "numbas_lti_lticonsumer"."url", "numbas_lti_lticonsumer"."key", "numbas_lti_lticonsumer"."secret", "numbas_lti_lticonsumer"."deleted", "numbas_lti_lticonsumer"."identifier_field" FROM "numbas_lti_lticonsumer" WHERE "numbas_lti_lticonsumer"."id" = 1 LIMIT 21 | idle | Client | ClientRead

No blocking queries come up either when checking for those. Looking at huey logs it just shows periodic tasks being queued up:

[2022-03-29 11:22:33,354] INFO:huey.consumer.Scheduler:Scheduler:Enqueueing periodic task numbas_lti.tasks.diff_suspend_data: 80e2aae2-f8c0-4298-9b3e-288cd33528c2. 2022-03-29 11:22:33,354 INFO /opt/numbas_lti_python/lib/python3.8/site-packages/huey/consumer.py: Enqueueing periodic task numbas_lti.tasks.diff_suspend_data: 80e2aae2-f8c0-4298-9b3e-288cd33528c2. [2022-03-29 11:22:33,354] INFO:huey.consumer.Scheduler:Scheduler:Enqueueing periodic task numbas_lti.tasks.delete_old_reports: 91d3745c-6675-4e09-83bc-faba4ccb053d. 2022-03-29 11:22:33,354 INFO /opt/numbas_lti_python/lib/python3.8/site-packages/huey/consumer.py: Enqueueing periodic task numbas_lti.tasks.delete_old_reports: 91d3745c-6675-4e09-83bc-faba4ccb053d. [2022-03-29 11:23:33,353] INFO:huey.consumer.Scheduler:Scheduler:Enqueueing periodic task numbas_lti.tasks.diff_suspend_data: cd565f7d-2e1b-4a47-87a9-0c3f96305e7c. 2022-03-29 11:23:33,353 INFO /opt/numbas_lti_python/lib/python3.8/site-packages/huey/consumer.py: Enqueueing periodic task numbas_lti.tasks.diff_suspend_data: cd565f7d-2e1b-4a47-87a9-0c3f96305e7c.

oscarsiles commented 2 years ago

I've had a look at the huey logs more carefully, and it is enqueueing a lot of diff tasks again... After restarting all the processes tasks such as "resource_attempts_csv_report" seem to get through OK, but at some point everything gets "stuck". I know it took about a month to get through the historic diffs last time, so I'm not sure why it may be going through them all again (unless it's just new ones)?

oscarsiles commented 2 years ago

Not sure if it's worth closing this issue or not, but after updating the server and rebooting, it appears to be behaving itself again, with tasks such as generating reports not getting stuck anymore...

Will keep an eye out on it for a bit longer and reopen if needed but it may just have been something else interfering.

oscarsiles commented 2 years ago

It has unfortunately started happening again, and I believe it is linked to student activity in completing exams.