Status: Closed (closed by geekroscom 1 year ago)
import { PDFLoader } from "langchain/document_loaders/fs/pdf";
const loader = new PDFLoader("D:\\202305\\Chat_Desktop\\example_data\\Analysis-and-Comparison-between-Optimism-and-StarkNet.pdf", {
pdfjs: () => import("pdfjs-dist/legacy/build/pdf.worker.js"),
});
docs = await loader.load();
Error reported after using pdf.worker.js:
Uncaught (in promise) TypeError: getDocument is not a function
at PDFLoader.parse (D:\202305\Chat_Desktop\release\dist\preload\index.cjs:2924:27)
at async onImport (dialog.vue:159:36)
parse @ D:\202305\Chat_Desktop\release\dist\preload\index.cjs:2924
Promise.catch (async)
callWithAsyncErrorHandling @ runtime-core.esm-bundler.js:184
emit @ runtime-core.esm-bundler.js:730
(anonymous) @ runtime-core.esm-bundler.js:7465
handleClick @ button.vue:116
callWithErrorHandling @ runtime-core.esm-bundler.js:173
callWithAsyncErrorHandling @ runtime-core.esm-bundler.js:182
invoker @ runtime-dom.esm-bundler.js:345
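The trace suggests that PDFLoader.parse calls getDocument on whatever the pdfjs callback resolves to, and that the worker build (pdf.worker.js) does not provide that function. As a minimal sketch under that assumption, a guard like the one below would surface the mistake earlier and with a clearer message. The names PdfjsLike, assertPdfjsModule, fakeWorkerBuild and fakeMainBuild are hypothetical and not part of LangChain or pdf.js; the two fake objects only stand in for the real builds so the check can be tried without pdfjs-dist installed.

```typescript
// Minimal shape the loader appears to need from the pdf.js module.
// Assumption inferred from the error message, not from documented types.
interface PdfjsLike {
  getDocument: (...args: unknown[]) => unknown;
}

// Hypothetical helper: fail early with a descriptive message when the module
// passed to the loader is not the pdf.js main build.
function assertPdfjsModule(mod: unknown): PdfjsLike {
  const candidate = mod as Partial<PdfjsLike> | null | undefined;
  if (!candidate || typeof candidate.getDocument !== "function") {
    throw new TypeError(
      "Expected the pdf.js main build (exports getDocument); " +
        "got a module without getDocument, possibly the worker build."
    );
  }
  return candidate as PdfjsLike;
}

// Stand-ins for the two builds (not the real modules).
const fakeWorkerBuild = { WorkerMessageHandler: {} };
const fakeMainBuild = { getDocument: () => ({ promise: Promise.resolve({}) }) };
```

With the real library, the callback could wrap its import in this check, for example by returning assertPdfjsModule(await import("pdfjs-dist/legacy/build/pdf.js")) instead of importing the worker file.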
I have resolved it for now, but I'm not sure whether this is the correct approach for a release build.
yarn add pdf-parse && yarn add pdfjs-dist
import * as PDFLib from "pdfjs-dist/legacy/build/pdf.js";
const loader = new props.app.api.LangChain.PDFLoader(props.app.page.database.dialog.form.file_value, {
pdfjs: ()=>{
PDFLib.GlobalWorkerOptions.workerSrc = "https://cdn.*****.com/****/pdf.worker.min.js";
return PDFLib;
},
});
docs = await loader.load();
The pdf.worker.min.js file is taken from the pdfjs-dist/legacy/build/ directory and uploaded to your own server; PDFLib.GlobalWorkerOptions.workerSrc is then set to the URL of that file.
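The workaround above can be factored into a small reusable function. This is only a sketch: makePdfjsFactory, PdfjsWithWorkerOptions and fakePdfLib are hypothetical names, the fake object stands in for the real PDFLib import, and the URL is a placeholder rather than the redacted CDN address. The factory returns a promise, on the assumption that the loader awaits the result of its pdfjs option.

```typescript
// Minimal structural type for the part of pdf.js that the workaround touches.
interface PdfjsWithWorkerOptions {
  GlobalWorkerOptions: { workerSrc: string };
}

// Hypothetical helper: builds a callback for the loader's `pdfjs` option that
// sets the worker script location and then hands back the main library.
function makePdfjsFactory<T extends PdfjsWithWorkerOptions>(
  lib: T,
  workerSrc: string
): () => Promise<T> {
  return async () => {
    lib.GlobalWorkerOptions.workerSrc = workerSrc;
    return lib;
  };
}

// Stand-in for `import * as PDFLib from "pdfjs-dist/legacy/build/pdf.js"`.
const fakePdfLib = {
  GlobalWorkerOptions: { workerSrc: "" },
  getDocument: () => ({ promise: Promise.resolve({}) }),
};

// Placeholder URL; replace it with wherever pdf.worker.min.js is hosted.
const pdfjsFactory = makePdfjsFactory(
  fakePdfLib,
  "https://example.invalid/pdf.worker.min.js"
);
```

In the real application, the loader would then be constructed with { pdfjs: makePdfjsFactory(PDFLib, workerUrl) }, where workerUrl is the address of the uploaded worker file.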
LangChain version: 0.0.75. Development environment: Vue 3 + Vite + TypeScript + Electron.
My usage process is as follows:
Error: