Closed qinyuhua closed 5 years ago
You should always specify the workerSrc explicitly, i.e. by setting pdfjsLib.GlobalWorkerOptions.workerSrc before calling pdfjsLib.getDocument, since the fallback is only a best-effort solution which is not guaranteed to work correctly in every situation.
You should try this:
const pdfjs = await import('pdfjs-dist/build/pdf');
const pdfjsWorker = await import('pdfjs-dist/build/pdf.worker.entry');
pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker;
...
I have had difficulty using the idea you gave in React.
The problem is that it doesn't work when the component is mounted.
When I use the official script everything is fine, but it doesn't work with pdfjs-dist:
const pdfjsLib = window['pdfjs-dist/build/pdf'];
pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.js';
Any idea for that? I would prefer not to use the script.
Honestly, I would use a React abstraction over PDFs, since pdf.js doesn't officially support React :(
Please help, I still get this error when I use the library with svelte-kit
What I did was append the script at the very beginning, when my component is mounted. I guess in your case you need to use the script alongside the other ones you're using.
https://mozilla.github.io/pdf.js/build/pdf.js
When you use this, GlobalWorkerOptions becomes available.
I hope it helps, although I used it in React.
Thanks for the information, so I cannot just import it like a regular package?
No, unfortunately. I tried it that way and it didn't work.
Okay, I think I know how to fix it now.
I have been stuck with this problem using version 2.4.456 of pdfjs-dist. After checking the library's webpack file at its root, I did this to fix it in a React component:
import PdfjsWorker from "pdfjs-dist/build/pdf.worker.js";
import PDFJS, { getDocument } from "pdfjs-dist";
PDFJS.workerSrc = "pdfjs-dist/build/pdf.worker.js";
PDFJS.GlobalWorkerOptions.workerSrc = "pdfjs-dist/build/pdf.worker.js";
PDFJS.GlobalWorkerOptions.workerPort = new PdfjsWorker();
Has anyone faced this error?
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
It happens when using:
import { pdfjs } from 'react-pdf'
import pdfjsWorker from 'pdfjs-dist/build/pdf.worker.entry'
pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker
Both react-scripts build and react-scripts start crash with the error mentioned above. Most importantly, it only occurs when I enable source maps.
I fixed it as follows:
1) npm run eject
2) open {project name folder}/config/webpack.config.js
3) Add the exclusion /node_modules\/pdfjs-dist/ here:
module: {
  strictExportPresence: true,
  rules: [
    // Handle node_modules packages that contain sourcemaps
    shouldUseSourceMap && {
      enforce: 'pre',
      exclude: [/@babel(?:\/|\\{1,2})runtime/, /node_modules\/pdfjs-dist/],
      test: /\.(js|mjs|jsx|ts|tsx|css)$/,
      loader: require.resolve('source-map-loader'),
    },
    ...
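One caveat with the exclusion pattern above: /node_modules\/pdfjs-dist/ only matches forward slashes, so it may not match on Windows, where resource paths can contain backslashes (the @babel pattern on the same line already allows for that). Here is a minimal sketch of a separator-agnostic pattern. The name pdfjsDistPattern and the sample paths are made up for illustration.

```javascript
// Hypothetical pattern: matches pdfjs-dist under node_modules with either
// "/" (POSIX) or "\" (Windows) as the path separator.
const pdfjsDistPattern = /node_modules[\\/]pdfjs-dist/;

// Sample paths, invented for this example.
const posixPath = '/app/node_modules/pdfjs-dist/build/pdf.js';
const windowsPath = 'C:\\app\\node_modules\\pdfjs-dist\\build\\pdf.js';
const otherPath = '/app/node_modules/react/index.js';

console.log(pdfjsDistPattern.test(posixPath));   // true
console.log(pdfjsDistPattern.test(windowsPath)); // true
console.log(pdfjsDistPattern.test(otherPath));   // false
```

If that behaves as expected in your setup, the pattern could replace /node_modules\/pdfjs-dist/ in the exclude array.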
@kvengl npm run eject is dangerous, in my opinion. It removes the react-scripts wrapper, and you are on your own if something goes wrong, e.g. dependency conflicts. react-scripts is maintained by a big community and you can trust new versions. I've always run into issues when dealing directly with webpack.
I fixed it with a simple solution; you can check https://github.com/mozilla/pdf.js/issues/8305
People coming for ANGULAR fix:
In imports:
import * as PDFJS from 'pdfjs-dist';
// @ts-ignore
import pdfjsWorker from 'pdfjs-dist/build/pdf.worker.entry'
PDFJS.GlobalWorkerOptions.workerSrc = pdfjsWorker;
and where the PDF is loaded:
private async loadPDF() {
  const pdf = await PDFJS.getDocument(this.document.file).promise;
  const page = await pdf.getPage(1);
  const viewport = page.getViewport({ scale: 1 });
  this.canvas.nativeElement.width = viewport.width;
  this.canvas.nativeElement.height = viewport.height;
  this.context = this.canvas.nativeElement.getContext('2d')!;
  const renderContext = {
    canvasContext: this.context,
    viewport: viewport
  };
  await page.render(renderContext).promise;
  this.isLoading = false;
}
This will resolve your issue right away, but you may now see a Warning: Setting up fake worker. message in the console.
To resolve this, update your PDF.js imports to just:
import * as PDFJS from 'pdfjs-dist';
PDFJS.GlobalWorkerOptions.workerSrc = `//cdnjs.cloudflare.com/ajax/libs/pdf.js/${PDFJS.version}/pdf.worker.js`;
Angular: v15.1.0 pdfjs-dist: v3.3.122
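If you go the CDN route, it may help to build the URL in one place so the worker version always tracks the library version. This is only a sketch: buildCdnWorkerSrc is a hypothetical helper, the URL layout is simply the cdnjs pattern from the snippet above with an explicit https: scheme, and it assumes the version you pass is actually published on the CDN.

```javascript
// Hypothetical helper: builds the cdnjs worker URL for a given pdf.js version.
// The path layout is copied from the snippet above; it is not verified here.
function buildCdnWorkerSrc(version) {
  if (typeof version !== 'string' || version.length === 0) {
    throw new TypeError('version must be a non-empty string');
  }
  return `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${version}/pdf.worker.js`;
}

// Example with the version mentioned above.
console.log(buildCdnWorkerSrc('3.3.122'));
// → https://cdnjs.cloudflare.com/ajax/libs/pdf.js/3.3.122/pdf.worker.js
```

In the Angular code this would be used as PDFJS.GlobalWorkerOptions.workerSrc = buildCdnWorkerSrc(PDFJS.version).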
For me, what seemed to work was this:
import "pdfjs-dist/build/pdf.worker.entry";
It appears that this code will attach pdfjsWorker to the window object, which I presume is what gets fallen back on when you run getDocument. No shade, but I'm very curious about the design decision or technical constraint behind needing to do this.
@z0d14c was right. Take a look at the file pdfjs-dist/build/pdf.worker.entry.js:
(typeof window !== "undefined"
  ? window
  : {}
).pdfjsWorker = require("./pdf.worker.js");
The worker is already assigned to window.pdfjsWorker. You only need an import statement; there is no need for any assignment.
import 'pdfjs-dist/build/pdf.worker.entry';
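For anyone wondering why the side-effect import is enough, and why the "Setting up fake worker" warning can still show up afterwards, here is how I picture it. This is a simplified model based on my own reading, not the real pdf.js source, and every name in it (chooseWorkerStrategy and the return values) is hypothetical. The assumption is that a handler found on the global object is preferred and runs on the main thread, and that a real web worker is only created from a string workerSrc when no such handler exists.

```javascript
// Simplified model of the lookup order. NOT the actual pdf.js implementation.
function chooseWorkerStrategy(globalObj, workerSrc) {
  // Assumption: a handler attached by the entry file takes priority and is
  // executed on the main thread (the "fake worker" path).
  if (globalObj.pdfjsWorker && globalObj.pdfjsWorker.WorkerMessageHandler) {
    return 'main-thread';
  }
  // Otherwise a string URL is needed to start a real web worker.
  if (typeof workerSrc === 'string' && workerSrc.length > 0) {
    return 'web-worker';
  }
  throw new Error('No worker source available');
}

// Stand-in for window after importing the entry file.
const globalWithEntry = { pdfjsWorker: { WorkerMessageHandler: {} } };
console.log(chooseWorkerStrategy(globalWithEntry, undefined)); // main-thread
console.log(chooseWorkerStrategy({}, '/assets/pdf.worker.js')); // web-worker
```

If this model is right, importing the entry file trades the error for main-thread parsing, which would explain why the warning persists for some people.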
Everything went back to normal by doing it this way for me. Thanks to luistak who helped me a lot
import * as pdfjsLib from "pdfjs-dist";
// Import the worker correctly to avoid the message "Warning: Setting up fake worker"
import pdfjsWorker from "pdfjs-dist/build/pdf.worker.entry";
pdfjsLib.GlobalWorkerOptions.workerSrc = pdfjsWorker;
The "Warning: Setting up fake worker" message came back, but I'll leave the idea in case it helps.
// This should work now; use this:
import { pdfjs } from 'react-pdf';
pdfjs.GlobalWorkerOptions.workerSrc = `//unpkg.com/pdfjs-dist@${pdfjs.version}/build/pdf.worker.min.js`;
Amazing, it worked for me on a ReactJS & TypeScript project!
It's specified in the document. https://github.com/wojtekmaj/react-pdf#configure-pdfjs-worker
I came across this issue while trying to use pdfJS in a Svelte project and I had to do something slightly different:
import * as pdfjsLib from 'pdfjs-dist/build/pdf';
// this import is needed to configure a default worker for pdfjs
import * as pdfjsWorker from "pdfjs-dist/build/pdf.worker.mjs";
pdfjsLib.GlobalWorkerOptions.workerSrc = pdfjsWorker;
This got past the initial error but still produces the setting up fake worker warning; however, the actual functionality seems to be working fine.
Thanks a lot! This is the only solution that worked for me on React.
For me, what seemed to work was this:
import "pdfjs-dist/build/pdf.worker.entry";
This got my Angular app working.
When I upgraded to Angular v17 and pdf.js v4, I had to change it to:
import "pdfjs-dist/build/pdf.worker.mjs";
Has anyone figured out this error with pdfjs-dist and Angular 17?
./node_modules/pdfjs-dist/build/pdf.mjs - Error: Module parse failed: The top-level-await experiment is not enabled (set experiments.topLevelAwait: true to enabled it)
File was processed with these loaders:
The solution from the react-pdf docs works and gets rid of the setting up fake worker warning.
import * as pdfjs from 'pdfjs-dist'
pdfjs.GlobalWorkerOptions.workerSrc = new URL(
'pdfjs-dist/build/pdf.worker.min.mjs',
import.meta.url
).toString()
However, I could see this encountering build issues unless you include the minified file in your build pipeline.
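To make the build concern concrete, here is what plain URL resolution does with that specifier when nothing rewrites it. This is a sketch only: moduleUrl is a made-up stand-in for import.meta.url, and resolveWorkerUrl is a hypothetical helper. Bundlers that recognise the new URL(..., import.meta.url) pattern may emit the worker file and rewrite the path for you, but that depends on your tooling.

```javascript
// Hypothetical URL of the bundled module; in real code this is import.meta.url.
const moduleUrl = 'https://example.com/assets/index-abc123.js';

// Resolves a specifier against a base URL exactly as the URL constructor does.
function resolveWorkerUrl(specifier, baseUrl) {
  return new URL(specifier, baseUrl).toString();
}

// Without bundler support the specifier is treated as a relative path,
// so the worker file would have to exist at this location on the server.
console.log(resolveWorkerUrl('pdfjs-dist/build/pdf.worker.min.mjs', moduleUrl));
// → https://example.com/assets/pdfjs-dist/build/pdf.worker.min.mjs
```

If the resolved URL is not something your server actually serves, the worker fetch fails at runtime even though the build succeeded.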
Found a solution with Angular which also resolves the setting up fake worker warning:
- Add the following to the assets area in angular.json:
{
  "glob": "pdf.worker.min.mjs",
  "input": "./node_modules/pdfjs-dist/build",
  "output": "./assets"
}
Then somewhere globally in your code (I do it in app.module.ts), add the following lines:
import * as pdfjsDist from 'pdfjs-dist';
pdfjsDist.GlobalWorkerOptions.workerSrc = 'assets/pdf.worker.min.mjs';
Now the worker runs smoothly and no warning appears without using a CDN 🎉
This works on yarn, thank you!
Thank you @z0d14c, this worked for me. It does attach pdfjsWorker to the window. I'm also very interested in why this works.
import "pdfjs-dist/build/pdf.worker.mjs";

// Function to extract text from a PDF file
export async function extractTextFromPDF(file: File): Promise<string[]> {
  // dynamic import to avoid ssr issues
  const { getDocument } = await import("pdfjs-dist");
  // Read the file as an array buffer
  const arrayBuffer = await file.arrayBuffer();
  // Load the PDF document
  const pdfDocument = await getDocument({ data: arrayBuffer }).promise;
  const pagesText: string[] = [];
  // Loop through each page
  for (let i = 0; i < pdfDocument.numPages; i++) {
    const page = await pdfDocument.getPage(i + 1); // Pages are 1-based in pdfjs
    // Get the text content of the page
    const textContent = await page.getTextContent();
    // Extract and join the text items into a single string
    const pageText = textContent.items
      .map((item) => (item as any).str)
      .join(" ");
    pagesText.push(pageText);
  }
  return pagesText;
}
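A small note on the (item as any).str part: if an item without a str field ever ends up in the list (for example marked-content entries, which I believe only appear when they are explicitly requested), the joined text would contain "undefined". Here is a defensive sketch of that step. joinTextItems is a hypothetical helper and the sample items are invented.

```javascript
// Hypothetical helper: joins only the items that actually carry a string.
function joinTextItems(items) {
  return items
    .filter((item) => item && typeof item.str === 'string')
    .map((item) => item.str)
    .join(' ');
}

// Invented sample: two text items and one entry without a str field.
const sampleItems = [
  { str: 'Hello' },
  { type: 'beginMarkedContent' },
  { str: 'world' },
];

console.log(joinTextItems(sampleItems)); // Hello world
```

Inside the loop above this would replace the map/join chain with joinTextItems(textContent.items).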
Does anyone know how to fix this with Vue 3? Please, I've been stuck on this for a week! It's so strange: it works for me on localhost, but when I deploy it to Amplify and use it through the domain URL, I get this error back:
index-DGRtMnYU.js:202 Error loading PDF: Error: Setting up fake worker failed: "Failed to fetch dynamically imported module: https://test/node_modules/pdfjs-dist/build/pdf.worker.mjs".
at index-DGRtMnYU.js:192:171740
@alvaroman23 It's telling you what the problem is. It's trying to import a module from https://test/ which doesn't exist.
Have a look at this thread: https://github.com/vitejs/vite/issues/11804
In one case it turned out to be an ad blocker that was preventing an import. However, in your case it looks like it's assuming that the path to the node module is a URL.
Also, look further up in this thread, because I posted a solution that I was using with Vue 3 + Vite.
Same error here and searching for fix.
import "pdfjs-dist/build/pdf.worker.mjs";
So thank you, I applied this import in my ReactJS project and it worked :D
At the time of writing, I think the right solution for integrating pdfjs-dist with Vite is this one:
"vite": "^5.3.4",
"pdfjs-dist": "^4.4.168",
import workerSrc from 'pdfjs-dist/build/pdf.worker?worker&url'
import * as pdfjs from 'pdfjs-dist'
pdfjs.GlobalWorkerOptions.workerSrc = workerSrc
pdfjs.getDocument('/bitcoin.pdf').promise.then((pdf) => {
console.log(pdf);
})
https://vitejs.dev/guide/assets.html#importing-script-as-a-worker
Webpack should be able to also do something like this https://webpack.js.org/guides/web-workers/
For Vite + SvelteKit:
import * as pdfjs from 'pdfjs-dist/build/pdf'
import 'pdfjs-dist/build/pdf.worker.entry'
if (typeof window !== 'undefined' && 'Worker' in window) {
  pdfjs.GlobalWorkerOptions.workerSrc = new URL(
    'pdfjs-dist/build/pdf.worker.min.js',
    import.meta.url
  ).toString()
}
For plain JS, I had to use the following code:
import * as pdfjsworker from "pdfjs-dist/build/pdf.worker.mjs"
import * as pdfjs from "pdfjs-dist/build/pdf.min.mjs"
pdfjs.workerSrc = pdfjsworker;
let loadingTasks = pdfjs.getDocument(fileUrl);
let fileCheckPromise = await loadingTasks.promise;
This gave me an Invalid 'workerSrc' type error:
pdfjs.GlobalWorkerOptions.workerSrc = workerSrc
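That error suggests workerSrc has to be a plain string, so it may be worth checking what the import actually returned before assigning it. A namespace import (import * as x) gives you an object, not a string. Below is a minimal sketch of such a check. toWorkerSrcString is a hypothetical helper, and the assumption that a URL import exposes its string as the default export depends on your bundler.

```javascript
// Hypothetical helper: normalises an imported value to a string URL.
function toWorkerSrcString(candidate) {
  if (typeof candidate === 'string') {
    return candidate;
  }
  // Assumption: some bundlers expose the URL as the module's default export.
  if (candidate && typeof candidate.default === 'string') {
    return candidate.default;
  }
  throw new TypeError('workerSrc must be a string URL, got ' + typeof candidate);
}

console.log(toWorkerSrcString('/assets/pdf.worker.mjs'));
console.log(toWorkerSrcString({ default: '/assets/pdf.worker.mjs' }));
```

Passing a module object with no string default (for example the worker module itself) throws here, which is easier to diagnose than the library error.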
if (!fallbackWorkerSrc && typeof document !== 'undefined') {
  var pdfjsFilePath = document.currentScript && document.currentScript.src;
  if (pdfjsFilePath) {
    fallbackWorkerSrc = pdfjsFilePath.replace(/(\.(?:min\.)?js)(\?.*)?$/i, '.worker$1$2');
  }
}
Sometimes document.currentScript === null, so pdfjsFilePath === null and fallbackWorkerSrc is never set. With no explicit workerSrc either, getWorkerSrc then throws:
function getWorkerSrc() {
  if (_worker_options.GlobalWorkerOptions.workerSrc) {
    return _worker_options.GlobalWorkerOptions.workerSrc;
  }
  if (typeof fallbackWorkerSrc !== 'undefined') {
    return fallbackWorkerSrc;
  }
  throw new Error('No "GlobalWorkerOptions.workerSrc" specified.');
}
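To see why this fallback is fragile, the path rewrite can be tried in isolation. This is a standalone sketch of the same replace call with the dots in the regex escaped; deriveFallbackWorkerSrc is a hypothetical name and the URLs are invented.

```javascript
// Hypothetical wrapper around the replace call from the fallback code.
function deriveFallbackWorkerSrc(scriptSrc) {
  if (!scriptSrc) {
    return undefined; // e.g. document.currentScript was null
  }
  return scriptSrc.replace(/(\.(?:min\.)?js)(\?.*)?$/i, '.worker$1$2');
}

// Script tag pointing straight at pdf.js: the guess is sensible.
console.log(deriveFallbackWorkerSrc('https://example.com/build/pdf.js'));
// → https://example.com/build/pdf.worker.js

// Bundled app: the guess points at a file that most likely does not exist.
console.log(deriveFallbackWorkerSrc('https://example.com/static/js/main.abc.js'));
// → https://example.com/static/js/main.abc.worker.js

// No current script: nothing to derive, so no fallback at all.
console.log(deriveFallbackWorkerSrc(null));
// → undefined
```

The second and third cases are the ones bundler users tend to hit, which is why setting workerSrc explicitly is the safer route.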