Open radvanyimome opened 4 weeks ago
My hunch is that this issue might be related to to commit https://github.com/langchain-ai/langchain/pull/20663 against metadata about the document gets lost during the downloading process to temp storage. I'm not entirely sure of the root cause, but it's a tricky problem that might need more eyes on it. Thanks to @MacanPN for pointing this out! Any insights or further checks we could perform to better understand this would be greatly appreciated.
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
Description
System Info
Currently I am running the code on the Unstructured docker container (downloads.unstructured.io/unstructured-io/unstructured:latest) but other Linux platforms like Ubuntu 20.04 and python:3.11-slim were also fruitless. Packages like O365 and PyMuPDF were also installed.
/usr/src/app $ python -m langchain_core.sys_info
System Information
Package Information
Packages not installed (Not Necessarily a Problem)
The following packages were not found: