mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0
47.09k stars 9.81k forks source link

Avoid the integration tests creating PDF files in the downloads folder #18274

Open timvandermeij opened 1 week ago

timvandermeij commented 1 week ago

If the integration tests are run locally (this likely also happens on the bot but we don't really notice it there) some PDF files are stored in the downloads folder as a side-effect of e.g. the autoprint tests. This was already a bit suboptimal because tests should not leave files on disk as a side-effect in general, but it looks like after #18271 the number of PDF files increased quite a bit. I think this is because we now actually clean up the viewer correctly, which causes all documents with forms in them to be auto-saved on closing of the viewer.

We should check if there is a way to avoid this. Maybe we can tell the viewer to discard the form changes via an existing API or include an extra parameter save=false in the close method or so, or at the very least maybe configure the browser (via Puppeteer configuration) to store such files in /tmp so they are not in the way and are usually cleaned automatically. It might have to be a combination of both because I'm not sure if we can prevent the autoprint tests from doing this in the first place (on the other hand, maybe https://github.com/mozilla/pdf.js/pull/17962/files#diff-c0973b2c3b147db1eccd21c6b1af307f33a7e5d2658b4a0738a2386470b9c0caR1792 takes care of that?).

Another reason why this is more of a concern now that more files are being created is that it might fill up the disk on the bots more quickly over time than before, which would require manual intervention to fix.

Snuffleupagus commented 1 week ago

What if we simply added && !TESTING to the condition in https://github.com/mozilla/pdf.js/blob/4c041586fb4c314a479f464096f5227003202935/web/app.js#L922

timvandermeij commented 1 week ago

It looks like that might just work, and it'd be a very simple change. That'd at least address the concern of the newly added files since the mentioned commit, and we can deal with the autoprint files in another change.

timvandermeij commented 1 week ago

The majority of the issue is fixed by #18282. For this issue only the print tests remain. From a quick search it looks like both Chrome and Firefox have a way to configure the default download directory, so if we can't actually disable the printing action itself (which would be preferable) then probably the best we can do is point that to /tmp on Linux (and whatever the Windows equivalent is) so at least the downloads directory is not filled anymore and the files are more likely to be cleaned up.

timvandermeij commented 5 days ago

I found out that Firefox actually does it correctly and doesn't store anything in the downloads directory; we configured this here: https://github.com/mozilla/pdf.js/blob/master/test/test.mjs#L930-L938

The problem is limited to Chrome for which, even though it does have a preference for this called download.default_directory, we can't actually set preferences through Puppeteer. We can only give CLI arguments in the options.args field, but the download directory doesn't seem to be exposed though that mechanism. There is sadly no equivalent of the options.extraPrefsFirefox field we use for Firefox.

The only option I have been able to find in the Puppeteer issue tracker is https://github.com/puppeteer/puppeteer/issues/923#issuecomment-333358583, for which we'd have to prepare a custom preferences file in the profile directory. This feels hacky, I can't really find good documentation on e.g. the format of said file and the default file contains all kinds of entries by default in there so it's not clear to me what happens if we don't provide those.

We could try this approach, but it seems much better if Puppeteer had an option similar to options.extraPrefsFirefox to pass preferences to Chrome in puppeteer.launch() calls so we don't need to have knowledge of implementation details of Chrome's profile directory. I'll file a request for that in the Puppeteer repository.