Open TomElliottFlexion opened 3 months ago
Our version of pdfjs-dist is now being flagged for a high severity vulnerability: "PDF.js vulnerable to arbitrary JavaScript execution upon opening a malicious PDF" (https://github.com/advisories/GHSA-wgrm-67xf-hhpq).
Perhaps we should reconsider the urgency of this card?
Regarding the security vulnerability flagged for our version of pdfjs-dist (https://github.com/advisories/GHSA-wgrm-67xf-hhpq): we handled it here: https://app.zenhub.com/workspaces/flexionef-cms-5bbe4bed4b5806bc2bec65d3/issues/gh/flexion/ef-cms/10407
Some helpful notes for whoever picks this up in the future...
Handling dynamic imports correctly:
Setting tsconfig to use modules via NodeNext seems to allow the latest version of pdfjs-dist to be supported; however, it is unknown what side effects may occur.
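As a rough illustration of the dynamic import approach mentioned above, here is a minimal sketch of a helper that tries several module specifiers in order and returns the first one that loads. The helper name `importFirstAvailable` and the `Importer` type are hypothetical and not part of DAWSON or pdfjs-dist. The specifiers in the usage note are assumptions about how the package is laid out in 3.x and should be checked against the installed version.

```typescript
// Hypothetical helper (not from the DAWSON codebase): tries each module
// specifier with a dynamic import() and returns the first one that loads.
type Importer = (specifier: string) => Promise<unknown>;

async function importFirstAvailable(
  specifiers: string[],
  // The importer is injectable so the helper can be exercised without the
  // real package installed. The default uses a native dynamic import.
  importer: Importer = (s) => import(s),
): Promise<{ specifier: string; loaded: unknown }> {
  const failures: string[] = [];
  for (const specifier of specifiers) {
    try {
      const loaded = await importer(specifier);
      return { specifier, loaded };
    } catch (err) {
      // Record why this specifier failed and move on to the next one.
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${specifier}: ${reason}`);
    }
  }
  throw new Error(`No specifier could be imported:\n${failures.join('\n')}`);
}
```

A call might look like `importFirstAvailable(['pdfjs-dist/legacy/build/pdf.js', 'pdfjs-dist'])`, assuming those entry points exist in the installed version. Note that when TypeScript compiles to CommonJS, `import()` is typically rewritten to `require()`, which may be one reason the NodeNext setting changes the behavior.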
As an engineer, so that I can keep DAWSON secure and up-to-date, I need to update pdfjs-dist to version ^3.x.x.
Pre-Conditions
See Notes section below for what's been tried already.
Acceptance Criteria
Pain Avoided/Frustration Saved
Breadth/Pervasiveness of Problem
Complexity of Problem (Low, Medium, High) and Why it's Complex
Notes
Last week (Nov 30th), Rosie, Tim, and Rachel spent a few days trying to upgrade pdfjs-dist. Below are the lessons learned from that.
Latest version of pdfjs-dist is 3.1.81; DAWSON is currently running on 2.16.105.
The error we see when upgrading without making any other code changes ONLY appears on a deployed environment, not locally.
The error we see when upgrading without making any other code changes occurs when uploading a court-issued PDF, because the only place we use this package is to OCR uploaded court-issued documents.
There is some conflicting documentation about how to use this package. On one hand, Mozilla indicates that the legacy build of pdfjs-dist is the one to use in Node environments. On the other hand, there are special instructions for using the package with webpack.
When we tried importing the package using the webpack instructions, one of the errors we observed was: `Error scraping PDF with PDF.JS v3.1.81 structuredClone is not defined`. This COULD potentially be resolved by upgrading to Node 17+, where structuredClone is supported. Note there is no guarantee this would fix the pdfjs-dist errors; it would only resolve the structuredClone error.
The error we see when upgrading without making any other code changes is: `Error scraping PDF with PDF.JS vundefined Cannot read properties of undefined (reading 'prototype')`
The error we see when upgrading and changing our import to follow the recommended webpack usage is:
`Error scraping PDF with PDF.JS v3.1.81 Setting up fake worker failed: "Cannot find module 'canvas'`
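For the `structuredClone is not defined` error noted above, a small check like the following could confirm whether a given runtime is expected to provide the global before attempting the upgrade. This is a sketch under stated assumptions: the function names are hypothetical, and the threshold of Node 17 is taken from the note above.

```typescript
// Hypothetical helpers (names are illustrative, not from the DAWSON codebase).

// Node major version at which structuredClone is available as a global,
// per the "Node 17+" note in this card.
const STRUCTURED_CLONE_MIN_NODE_MAJOR = 17;

// Parses the major version out of a Node version string such as "v16.20.2".
function nodeMajorVersion(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node version string: ${version}`);
  }
  return Number(match[1]);
}

// Version-based expectation: true when the given Node version should
// ship structuredClone as a global.
function structuredCloneExpected(version: string): boolean {
  return nodeMajorVersion(version) >= STRUCTURED_CLONE_MIN_NODE_MAJOR;
}

// Runtime feature check: inspects the current global object directly,
// which is usually more reliable than comparing version numbers.
function hasStructuredClone(): boolean {
  const globals = globalThis as unknown as Record<string, unknown>;
  return typeof globals['structuredClone'] === 'function';
}
```

On a deployed environment, logging `hasStructuredClone()` alongside `process.version` would show whether the runtime matches what `structuredCloneExpected` predicts. As the note above says, passing this check only addresses the structuredClone error, not the other pdfjs-dist errors.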