harshankur / officeParser

A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, and ods.
MIT License

There is an issue when I install "officeparser" with the document loader pptx, terminating all my tRPC API. #25

Open hozaifa4you opened 9 months ago

hozaifa4you commented 9 months ago

There seems to be an issue ⚠️ with loading the langchain document and the officeparser package.

Everything is running smoothly with my tRPC APIs, except for one issue I encountered while attempting to load a PowerPoint file using the langchain document loader. To fix the problem, I installed officeparser, but this caused all my tRPC APIs to stop working. Removing 'officeparser' resolved the issue and my tRPC APIs worked without any problems. However, when officeparser was installed, I received a 404 error message from tRPC.

There is also an error in my terminal:

TypeError: s is not a function
at C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\compiled\next-server\app-page.runtime.dev.js:13:5619
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async tE (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\compiled\next-server\app-page.runtime.dev.js:13:4860)
at async tV (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\compiled\next-server\app-page.runtime.dev.js:13:32468)
at async doRender (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\base-server.js:1294:26)
at async cacheEntry.responseCache.get.incrementalCache.incrementalCache (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\base-server.js:1446:28)
at async DevServer.renderToResponseWithComponentsImpl (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\base-server.js:1361:28)
at async DevServer.renderErrorToResponseImpl (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\base-server.js:1854:24)
at async DevServer.pipeImpl (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\base-server.js:826:25)
at async DevServer.handleCatchallRenderRequest (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\next-server.js:623:13)
at async DevServer.handleRequestImpl (C:\Users\Binary Gadget\Desktop\officeai\node_modules\next\dist\server\base-server.js:728:17)

My package.json file information:


{
"name": "officeai",
"version": "0.1.0",
"description": "This is office utils SaaS application with AI",
"private": true,
"scripts": {
"dev": "next dev",
"build": "next build",
"start": "next start",
"lint": "next lint",
"postinstall": "prisma generate"
},
"dependencies": {
"@hookform/resolvers": "^3.3.2",
"@mantine/hooks": "^7.0.1",
"@pinecone-database/pinecone": "^1.0.1",
"@prisma/client": "^5.3.1",
"@radix-ui/react-avatar": "^1.0.4",
"@radix-ui/react-dialog": "^1.0.5",
"@radix-ui/react-dropdown-menu": "^2.0.5",
"@radix-ui/react-hover-card": "^1.0.7",
"@radix-ui/react-label": "^2.0.2",
"@radix-ui/react-progress": "^1.0.3",
"@radix-ui/react-scroll-area": "^1.0.5",
"@radix-ui/react-separator": "^1.0.3",
"@radix-ui/react-slot": "^1.0.2",
"@radix-ui/react-toast": "^1.1.4",
"@radix-ui/react-tooltip": "^1.0.7",
"@sendgrid/mail": "^8.1.0",
"@tailwindcss/typography": "^0.5.10",
"@tanstack/react-query": "^4.35.3",
"@trpc/client": "^10.38.4",
"@trpc/next": "^10.38.4",
"@trpc/react-query": "^10.38.4",
"@trpc/server": "^10.38.4",
"ai": "^2.2.13",
"autoprefixer": "10.4.16",
"bcryptjs": "^2.4.3",
"class-variance-authority": "^0.7.0",
"clsx": "^2.0.0",
"date-fns": "^2.30.0",
"eslint-config-next": "13.5.2",
"jsonwebtoken": "^9.0.2",
"langchain": "^0.0.213",
"lodash": "^4.17.21",
"lucide-react": "^0.299.0",
"next": "^13.5.6",
"next-auth": "^4.24.5",
"openai": "^4.10.0",
"pdf-parse": "^1.1.1",
"pdfjs-dist": "^4.0.269",
"pdfreader": "^3.0.2",
"postcss": "^8.4.32",
"prisma": "^5.3.1",
"puppeteer": "^21.6.1",
"react": "18.2.0",
"react-dom": "18.2.0",
"react-dropzone": "^14.2.3",
"react-hook-form": "^7.49.2",
"react-loading-skeleton": "^3.3.1",
"react-markdown": "^8.0.7",
"react-pdf": "^7.3.3",
"react-resize-detector": "^9.1.0",
"react-textarea-autosize": "^8.5.3",
"simplebar-react": "^3.2.4",
"stripe": "^13.7.0",
"tailwind-merge": "^1.14.0",
"tailwindcss": "3.3.3",
"tailwindcss-animate": "^1.0.7",
"typescript": "5.2.2",
"zod": "^3.22.4",
"zustand": "^4.4.7"
},
"devDependencies": {
"@types/bcryptjs": "^2.4.6",
"@types/jsonwebtoken": "^9.0.5",
"@types/lodash": "^4.14.202",
"@types/node": "20.6.3",
"@types/react": "18.2.22",
"@types/react-dom": "18.2.7",
"@typescript-eslint/eslint-plugin": "^6.14.0",
"@typescript-eslint/parser": "^6.14.0",
"eslint": "^8.56.0",
"eslint-config-prettier": "^9.1.0",
"eslint-plugin-react": "^7.33.2",
"prettier": "3.1.1"
}
}


### Please give me enough information to solve the error, or tell me how to solve it. 🙏🙏🙏🥺🥺

Keaanm commented 7 months ago

Did you add officeparser to the server components external packages in your Next.js config file? `const nextConfig = { experimental: { serverComponentsExternalPackages: ["officeparser"] } };`

hozaifa4you commented 7 months ago

> Did you add officeparser to the server components external packages in your Next.js config file? `const nextConfig = { experimental: { serverComponentsExternalPackages: ["officeparser"] } };`

Is it recommended??

Keaanm commented 7 months ago

Yes, the config is needed for the package to work in Next.js.

jiaweing commented 7 months ago

This solved my issue on Next.js 14 running Langchain PPTX loader.

My error was different though: Error: ENOENT: no such file or directory, open './test/data/05-versions-space.pdf'

Even though my error was related to pdf-parse, this fixed it!

> Did you add officeparser to the server components external packages in your Next.js config file? `const nextConfig = { experimental: { serverComponentsExternalPackages: ["officeparser"] } };`