Closed AndrewThien closed 4 months ago
Someone is attempting to deploy a commit to a Personal Account owned by @Elliott-Chong on Vercel.
@Elliott-Chong first needs to authorize it.
Hi @AndrewThien, I followed the changes in your PR. Connecting LangChain directly to S3 seems to work fine, but I keep getting the error below.
I am also using the endpoint for the hosted Unstructured API: https://api.unstructured.io/general/v0/general
Error: Failed to load file C:\Users\Onukw\AppData\Local\Temp\s3fileloader-w4heR3\uploads\1708562531104my-testing-pdf using unstructured loader. at S3Loader.load
Hi @izuchukwu-eric,
Thanks for the comment.
In this code:
const loader = new S3Loader({
  bucket: process.env.NEXT_PUBLIC_S3_BUCKET_NAME!,
  key: file_key,
  s3Config: {
    region: "eu-west-2",
    credentials: {
      accessKeyId: process.env.NEXT_PUBLIC_S3_ACCESS_KEY_ID!,
      secretAccessKey: process.env.NEXT_PUBLIC_S3_SECRET_ACCESS_KEY!,
    },
  },
  unstructuredAPIURL: process.env.UNSTRUCTURED_API_URL!,
  unstructuredAPIKey: process.env.UNSTRUCTURED_API_KEY!,
});
Did you set a value for unstructuredAPIKey? If so, have you updated it recently? I believe Unstructured recently changed how free API key access works. Please let me know if you still encounter the problem. Thanks!
No, I didn't set the API key. How do I get one, please?
Okay, I see the problem now. Please go to https://unstructured.io/api-key-free, register for an API key, and set it as the unstructuredAPIKey value in the code (via the UNSTRUCTURED_API_KEY environment variable). Hopefully that resolves the error. Let me know if it helps!
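Since the error above came from an unset key, a small check before constructing the loader may make this kind of problem easier to spot. The following is only a sketch: the variable names are the ones read in the S3Loader snippet in this thread, while `findMissingEnv` and `assertEnv` are hypothetical helpers that are not part of this PR or of LangChain.

```typescript
// Variable names taken from the S3Loader snippet earlier in this thread.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_S3_BUCKET_NAME",
  "NEXT_PUBLIC_S3_ACCESS_KEY_ID",
  "NEXT_PUBLIC_S3_SECRET_ACCESS_KEY",
  "UNSTRUCTURED_API_URL",
  "UNSTRUCTURED_API_KEY",
] as const;

type EnvName = (typeof REQUIRED_ENV_VARS)[number];

// Hypothetical helper: returns the required names that are unset or blank.
function findMissingEnv(env: Record<string, string | undefined>): EnvName[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}

// Hypothetical helper: throws one readable error listing everything missing,
// instead of letting the loader fail later with a less specific message.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = findMissingEnv(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv(process.env)` just before `new S3Loader({ ... })` would then report a missing UNSTRUCTURED_API_KEY by name.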
@Elliott-Chong What are the necessary environment variables for the project?
In the original logic, before segmenting and vectorising the PDF file, the program downloads the PDF, stores it in a temp directory, and then reads it with LangChain. This can cause the error ENOENT: no such file or directory, especially when the project is deployed to Vercel, because the code there runs in Serverless Functions. To solve this, I suggest letting LangChain connect directly to the S3 bucket and read the file from there, using S3Loader and Unstructured (more here). This avoids downloading files to the temp directory (which is hard or impossible to configure on Vercel) and so prevents the error. Hope this helps.