Closed ndunand closed 4 months ago
Hi and thanks for your bug report!
This sounds weird ... The quiz archive worker clearly authenticates successfully at the Moodle web service API. However, there seems to be some sort of problem with rendering the transfered attempt HTML. Maybe the quiz archive worker is redirected somewhere by some sort of additional JavaScript on your site?
Thanks!
Hi @ngandrass
Thanks for the suggestion, I didn't think of it.
Opening the HTML file I had a hunch and figured the extra JS etc might be due to my custom theme. And indeed, changing the default theme to Boost solves the issue.
Now, how could I possible use my custom theme for everyone, and Boost only for the quiz_archiver_serviceaccount ?
Is needed I can DM you the HTML output.
Hi @ndunand
having the full HTML can help, but more or less only the snippet that causes the actual redirection is interesting for me. You can DM it to me if this is fine for you.
To get a better understanding of the impact of this: Is the theme you are using a custom one that is tailored to you or is it publicly available?
Meanwhile, I was thinking about how to handle such situations properly. I guess we have three main options:
(1.) can be a bad idea since it could theoretically alter the way questions and answers are displayed. This could result in archived PDFs that do not resemble the actual student response anymore.
(2.) should work without any problem. However, it would be hard to maintain and would require administrators to have a deep understanding of how the HTML/JS on their sites work and how they need to alter it in order to fix such problems. Possible but not preferred IMHO.
(3.) would be my go-to option once I get a code snippet from you, I can test this with. I'd try to introduce a new option to the worker service to explicitly disable the redirection following behavior. This should probably already be enough, since the actual attempt data seems to be there inside the HTML files, if I understood you correctly.
TL;DR: Please provide me with at least the code snippet that causes the redirection so I can start to implement a workaround for this case :)
So, testing this further, here is the outcome:
Would there be a way to:
&theme=boost
to its requests.if (navigator.userAgent.includes("HeadlessChrome"))
and it works!Hi @ngandrass
Sorry I was writing my last post and only saw your message after posting.
I DM'd you the file by mail.
Thanks a lot for sending me the HTML files.
To me it looks like it's not particular an issue with your theme. It seems that the theme contains ajax requests to endpoints that require authentication. In your case that is an access to user preferences. This triggers the servicerequireslogin
error inside the Moodle core ajax logic, which ultimately results in a redirect to /login/index.php
. See: https://github.com/moodle/moodle/blob/5aef789ac61369ad6641ddd8710477a771176da2/lib/amd/src/ajax.js#L92
I'll try to create a local test scenario to reproduce this problem and create a setting for the worker service to optionally ignore redirections requested by javascript.
Apart from that I think it would be a good idea to set the user agent of the quiz archiver to a more explicit value. Something like MoodleQuizArchiver/1.3.8 HeadlessChrome/116.0.1938.81)
. This would allow site administrators to apply exactly such fixes as described above.
I see; I probably don't fully understand how the headless browser works, as my understanding was that it runs as the quiz_archiver_serviceaccount β i.e. legged in ... so why is it being redirected to /login ?
In any case, your approach sounds good to me: ignoring JS redirect requests should solve the issue.
As for the user agent question, indeed having something more explicit would be better.
I was able to reproduce the behavior locally and found a potential solution. Sadly, I don't have the time to release a proper test version today but I'll do so tomorrow :smiley:
I see; I probably don't fully understand how the headless browser works, as my understanding was that it runs as the quiz_archiver_serviceaccount β i.e. legged in ... so why is it being redirected to /login ?
The quiz archive worker uses the service account to authenticate solely with the Moodle web service API. The latter then returns a HTML tree of the attempt. During rendering of the attempt, no Moodle session exists though. This is done for security reasons.
This way we have fine grained control over which information the service account / worker is able to access while keeping user rights as restrictive as possible. We can manage access on a per archive request basis and instantly revoke access, after the job completed. Moreover, each web service token is only valid for accessing data that is associated with the respective archive job. If we use the service account to establish a fully fledged Moodle session and access quiz data without the web service API instead, we would end up with a situation where we would need to assign global data access rights to the service account in order to be able to access all quiz data. We would not be able to revoke access when no archive job is running nor limit the scope for data access effectively.
In other words, as describe in the README:
A unique webservice access token is generated for every archive job. Each token has a limited validity and is invalidated either after job completion or after a specified timeout. This process requires a dedicated webservice user to be created. A single job webservice token can only be used for the specific quiz that is associated with the job to restrict queryable data to the required minimum.
Great for already having a plan for a solution in the works π
Thanks for taking the time to explain. Indeed I hadn't understood this fully when reading the docs. This is a much more fine-grained and security-conscious approach that what I had in mind. This will be invaluable when making the case for using this feature.
I pushed a development release that contains the fix. You can find it on the develop
branch of the service worker repository [1]. I also built a new Docker image for this version, tagged as ngandrass/moodle-quiz-archive-worker:1.3.10-dev-202407041337
[2].
You should be able to simply pull the new image listed above and replace the existing archive worker with it. If this fixes your problem I can prepare an upstream release :)
You know that it's working, if your reports do not display the login page anymore (D'oh!). However, the quiz archive worker now should also throw a warning message if a redirect to the login page is detected:
[2024-07-04 13:30:19,612] | WARNING | archiveworker.quiz_archive_job::<c72c4ddc-39f8-11ef-ae07-0242c0a88005> | Prevented belated redirection to: https://yourmoodle.tld/login/index.php
Thanks!
[1]: https://github.com/ngandrass/moodle-quiz-archive-worker/tree/develop [2]: https://hub.docker.com/layers/ngandrass/moodle-quiz-archive-worker/1.3.10-dev-202407041337/images/sha256-98872db0978d6ab144a3972d4b2f42678cbd2ad2f16d0749d06852d26ecfd81c?context=explore
I just tried the new image version. It does display the Prevented belated redirection
line, but the resulting PDF is empty
attempt-2034-teststudent1_2024-07-04-14-44-21.pdf
I tried to include the HTML files in the report, and these look fine, as they did previously.
Setting the worker to debug mode shows this:
[2024-07-04 12:48:21,270] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Skipping HTML DOM saving of quiz attempt 2034
[2024-07-04 12:48:21,742] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Playwright DOM content loaded
[2024-07-04 12:48:22,177] | WARNING | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Prevented belated redirection to: https://mymoodle/login/index.php
[2024-07-04 12:48:22,183] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Playwright request failed: https://mymoodle/login/index.php
[2024-07-04 12:48:22,208] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Playwright DOM content loaded
[2024-07-04 12:48:22,216] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Injecting JS to wait for page rendering ...
[2024-07-04 12:48:22,223] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Waiting for ready signal: x-quiz-archiver-page-ready-for-export
[2024-07-04 12:48:23,225] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Playwright console message: x-quiz-archiver-mathjax-not-found
[2024-07-04 12:48:23,225] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Playwright console message: x-quiz-archiver-page-ready-for-export
[2024-07-04 12:48:23,227] | DEBUG | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Received signal: x-quiz-archiver-page-ready-for-export
[2024-07-04 12:48:23,241] | INFO | archiveworker.quiz_archive_job::<b13a561c-3a03-11ef-844a-0242ac160002> | Generated "attempt-2034-teststudent1_2024-07-04-14-48-21"
Ok this is odd. I tested it by injecting exactly the HTML you sent me via mail and it worked. However, now I'm getting the same blank page problem. Maybe some sort of race condition with Playwright ...
Sorry for the back and forth. I'll look into it and try to nail it down further.
Just to make sure: Exporting attempts does work when boost is selected as the Moodle theme, correct?
Oh and by the way: Thanks a lot for your quick responses :+1:
No worries, we're mostly using Boost anyway on our production exam platforms, like I stated yesterday in my private message to you. And yes, exporting when Boost is selected works as expected.
To make matters even more complicated (sorry), I'm testing this on both Moodle 4.2 and 4.4. As stated in the initial report above, all I discussed here is related to Moodle 4.2, but in Moodle 4.4 (with my theme only β Boost works fine!) I have a complete fail with the following logs:
(let me know whether I should open a separate issue for this)
[2024-07-04 12:55:06,115] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Skipping HTML DOM saving of quiz attempt 2034
[2024-07-04 12:55:06,823] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Playwright DOM content loaded
[2024-07-04 12:55:06,828] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Injecting JS to wait for page rendering ...
[2024-07-04 12:55:06,838] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Waiting for ready signal: x-quiz-archiver-page-ready-for-export
[2024-07-04 12:55:07,150] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Playwright console message: Starting Moodle session timeout warning.
[2024-07-04 12:55:07,177] | WARNING | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Prevented belated redirection to: https://mymoodle_v44/login/index.php
[2024-07-04 12:55:07,181] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Playwright request failed: https://mymoodle_v44/login/index.php
[2024-07-04 12:55:07,215] | DEBUG | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Playwright DOM content loaded
[2024-07-04 12:55:21,847] | ERROR | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Ready signal not received after 15 seconds. Aborting ...
[2024-07-04 12:55:21,905] | ERROR | archiveworker.quiz_archive_job::<a2238332-3a04-11ef-b9ab-0242ac160002> | Job failed with error: RuntimeError:
The problem is that I can't see the actual RuntimeError thrown, so I can't investigate further.
Thanks for clearing things up a little. Good to know that at least with boost everything works as expected. However, I'd still like to get this working independently of the theme.
I have a complete fail with the following logs
To me it seems that the redirect behavior is a little different between both Moodle versions. However, any form of redirect will interfere with the "ready signal" detection mechanism since this works by injecting additional javascript into the page context that checks if everything is rendered as expected. If a navigation / redirection happens after this is injected, the ready signal is never sent, resulting in the observed timeout.
If you disable waiting for the ready signal, you should get an export (most likely still blank though ;D). You can skip the "readyness check" by setting QUIZ_ARCHIVER_WAIT_FOR_READY_SIGNAL
to an empty value: QUIZ_ARCHIVER_WAIT_FOR_READY_SIGNAL=
The problem is that I can't see the actual RuntimeError thrown, so I can't investigate further.
This is actually just an empty RuntimeError
that is raised after the ready signal not received timeout in order to abort the job at a higher level. See: https://github.com/ngandrass/moodle-quiz-archive-worker/blob/5002f5801736f34284fc0d87aae04b50117756cf/archiveworker/quiz_archive_job.py#L274
Yeah ok, so this definitely is a race condition with Playwright/Chrome.
I try to listen for the redirection request and abort it. However, depending on the speed at which the interception handler is called, the browser might already have destructed the current HTML DOM, therefore resulting in a blank page. Catching the request early and snapshotting the page in that exact moment works, but opens up another huge can of worms because the redirect can already happen way before everything is rendered completely.
Therefore, I need to look into preventing the redirect at javascript level. Sadly, due to security reasons one can neither prevent altering of window.location
during the beforeunload
event nor override window.location
to simply ignore any changes (see: https://developer.mozilla.org/en-US/docs/Web/API/Window/location).
The only way left I see is to parse every javascript snippet that is loaded during rendering and replace the redirection code directly ... That could work ...
hm... the plot thickens
Thanks for explaining again.
I qiuckly tested and in my Moodle 4.4 + custom theme, using QUIZ_ARCHIVER_WAIT_FOR_READY_SIGNAL=
(null value) yields to a working export.
Ok, I released another development version of the archive worker that directly patches the respective redirection snippets right inside the loaded javascript resources. This should prevent any such redirects before they happen, effectively eliminating the page destruction race condition.
Please be aware that this is currently still in highly experimental state. Interestingly this causes some tests to fail, but that's not critical at this point. I just want to confirm that this also solves the redirect issue on your side too.
Commit: https://github.com/ngandrass/moodle-quiz-archive-worker/commit/7f9c0c88a1ba57f59c92bc197f14338bb4d4f4f7
Docker image: ngandrass/moodle-quiz-archive-worker:1.3.10-dev-202407041337
https://hub.docker.com/layers/ngandrass/moodle-quiz-archive-worker/1.3.10-dev-202407041728/images/sha256-279c9b44b906f2784b8a65b0124e424cfab7034ed69369745ead32d9bf7376ae?context=explore
Ok, so I replaced 1337
by 1728
in the Docker image you mentioned, thus obtaining: image: ngandrass/moodle-quiz-archive-worker:1.3.10-dev-202407041728
in my docker-compose.yml
.
And I'm happy to report that it works, with my custom theme, both on Moodle 4.2 and on Moodle 4.4!
Also, I removed the option QUIZ_ARCHIVER_WAIT_FOR_READY_SIGNAL=
(null value) before both tests.
Let me know if you need any more details.
Great news! Thanks for testing this :tada:
Since this is a quite intrusive feature now, I'd like to make it opt-in instead of opt-out. If more and more universities face such problems we can still make it the default behavior with a single change of the config default value.
Moreover I'd like to investigate the now failing unit tests a little more. On first glance it seems that the javascript interceptor simply blocks due to the missing Moodle counterpart within the unit tests.
Since this is non-urgent I finalize the above in the next week since I got some other tasks for this week on my desk. I'll comment here once the release is out :)
I polished the code for the fix and pushed it to the develop branch. I'll now look at #24 and will do a combined release for both the Moodle plugin as well as the worker service today.
Everything is released now. You can simply upgrade your development version of the archive worker to ngandrass/moodle-quiz-archive-worker:1.4.0
or ngandrass/moodle-quiz-archive-worker:latest
.
Thanks again a lot for reporting the issue and testing the potential fixes :+1:
Thanks a lot. I just checked and can confirm that everything now works fine on my end too.
Quiz archiver moodle plugin version
1.2.8
Quiz archiver worker version
image: ngandrass/moodle-quiz-archive-worker:latest
Worker service deployment method
Docker compose
Moodle Version
4.2.8 (Build: 20240610)
PHP Version
PHP 8.1.29
Database
MariaDB
Operating system
Debian 12
What happened?
Expected behavior: PDF files in quiz-archive-...-tar.gz/attempts/attempt-..../attempt-...pdf should contain a PDF rendering of the attempt review
Actual behavior: the PDF files contain only a PDF rendering of the Moodle platform login page
Additional information:
Steps to reproduce
Relevant Moodle / PHP log output (if applicable)
Relevant quiz archive worker service log output (if applicable)