Cinnamon / kotaemon

An open-source RAG-based tool for chatting with your documents.
https://cinnamon.github.io/kotaemon/
Apache License 2.0

[BUG] - GraphRAG dependency Cargo issue during docker build #172

Open bkbas26 opened 2 weeks ago

bkbas26 commented 2 weeks ago

Description

I am trying to deploy the Kotaemon application on Kubernetes. Building the Docker image from the Dockerfile fails with the error below.


=> ERROR [dev 4/5] RUN pip install graphrag future unstructured[all-docs] 268.4s

[dev 4/5] RUN pip install graphrag future unstructured[all-docs]: 0.990 Collecting graphrag 1.051 Downloading graphrag-0.3.2-py3-none-any.whl (382 kB) 1.280 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 382.3/382.3 kB 1.7 MB/s eta 0:00:00 1.323 Collecting future 1.346 Downloading future-1.0.0-py3-none-any.whl (491 kB) 1.593 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 491.3/491.3 kB 2.0 MB/s eta 0:00:00 1.667 Collecting unstructured[all-docs] 1.688 Downloading unstructured-0.15.9-py3-none-any.whl (2.1 MB) 2.703 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 2.1 MB/s eta 0:00:00 2.769 Collecting aiofiles<25.0.0,>=24.1.0 2.802 Downloading aiofiles-24.1.0-py3-none-any.whl (15 kB) 2.810 Requirement already satisfied: python-dotenv<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/site-packages (from graphrag) (1.0.1) 2.871 Collecting azure-search-documents<12.0.0,>=11.4.0 2.893 Downloading azure_search_documents-11.5.1-py3-none-any.whl (297 kB) 3.074 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 297.7/297.7 kB 1.6 MB/s eta 0:00:00 3.124 Collecting devtools<0.13.0,>=0.12.2 3.159 Downloading devtools-0.12.2-py3-none-any.whl (19 kB) 3.171 Requirement already satisfied: pyyaml<7.0.0,>=6.0.2 in /usr/local/lib/python3.10/site-packages (from graphrag) (6.0.2) 3.172 Requirement already satisfied: rich<14.0.0,>=13.6.0 in /usr/local/lib/python3.10/site-packages (from graphrag) (13.8.0) 3.229 Collecting azure-identity<2.0.0,>=1.17.1 3.250 Downloading azure_identity-1.17.1-py3-none-any.whl (173 kB) 3.337 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 173.2/173.2 kB 2.0 MB/s eta 0:00:00 3.380 Collecting aiolimiter<2.0.0,>=1.1.0 3.408 Downloading aiolimiter-1.1.0-py3-none-any.whl (7.2 kB) 3.467 Collecting graspologic<4.0.0,>=3.4.1 3.488 Downloading graspologic-3.4.1-py3-none-any.whl (5.2 MB) 6.776 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.2/5.2 MB 1.6 MB/s eta 0:00:00 6.825 Collecting environs<12.0.0,>=11.0.0 6.861 Downloading environs-11.0.0-py3-none-any.whl (12 kB) 6.925 Collecting 
azure-storage-blob<13.0.0,>=12.22.0 6.950 Downloading azure_storage_blob-12.22.0-py3-none-any.whl (404 kB) 7.203 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 404.9/404.9 kB 1.8 MB/s eta 0:00:00 7.207 Requirement already satisfied: tiktoken<0.8.0,>=0.7.0 in /usr/local/lib/python3.10/site-packages (from graphrag) (0.7.0) 7.521 Collecting fastparquet<2025.0.0,>=2024.2.0 7.551 Downloading fastparquet-2024.5.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.7 MB) 9.121 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 1.1 MB/s eta 0:00:00 9.183 Collecting lancedb<0.12.0,>=0.11.0 9.213 Downloading lancedb-0.11.0-cp38-abi3-manylinux_2_24_aarch64.whl (22.5 MB) 25.04 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.5/22.5 MB 1.3 MB/s eta 0:00:00 25.06 Requirement already satisfied: pydantic<3,>=2 in /usr/local/lib/python3.10/site-packages (from graphrag) (2.8.2) 25.32 Collecting swifter<2.0.0,>=1.4.0 25.34 Downloading swifter-1.4.0.tar.gz (1.2 MB) 25.99 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 1.8 MB/s eta 0:00:00 26.03 Preparing metadata (setup.py): started 26.53 Preparing metadata (setup.py): finished with status 'done' 26.53 Requirement already satisfied: uvloop<0.21.0,>=0.20.0 in /usr/local/lib/python3.10/site-packages (from graphrag) (0.20.0) 26.57 Collecting datashaper<0.0.50,>=0.0.49 26.59 Downloading datashaper-0.0.49-py3-none-any.whl (71 kB) 26.69 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.0/72.0 kB 956.5 kB/s eta 0:00:00 26.69 Requirement already satisfied: numpy<2.0.0,>=1.25.2 in /usr/local/lib/python3.10/site-packages (from graphrag) (1.26.4) 26.70 Requirement already satisfied: networkx<4,>=3 in /usr/local/lib/python3.10/site-packages (from graphrag) (3.3) 26.70 Requirement already satisfied: openai<2.0.0,>=1.37.1 in /usr/local/lib/python3.10/site-packages (from graphrag) (1.43.0) 26.70 Requirement already satisfied: nltk==3.9.1 in /usr/local/lib/python3.10/site-packages (from graphrag) (3.9.1) 26.70 Requirement already satisfied: 
typing-extensions<5.0.0,>=4.12.2 in /usr/local/lib/python3.10/site-packages (from graphrag) (4.12.2) 26.88 Collecting scipy==1.12.0 26.91 Downloading scipy-1.12.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (34.8 MB) 57.56 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 34.8/34.8 MB 1.0 MB/s eta 0:00:00 57.63 Collecting tenacity<10.0.0,>=9.0.0 57.65 Downloading tenacity-9.0.0-py3-none-any.whl (28 kB) 57.72 Collecting json-repair<0.27.0,>=0.26.0 57.76 Downloading json_repair-0.26.0-py3-none-any.whl (12 kB) 58.14 Collecting pyaml-env<2.0.0,>=1.2.1 58.16 Downloading pyaml_env-1.2.1-py3-none-any.whl (9.0 kB) 58.26 Collecting textual<0.77.0,>=0.76.0 58.29 Downloading textual-0.76.0-py3-none-any.whl (567 kB) 58.67 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 567.2/567.2 kB 1.5 MB/s eta 0:00:00 59.16 Collecting numba==0.60.0 59.18 Downloading numba-0.60.0-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (3.4 MB) 62.21 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 1.1 MB/s eta 0:00:00 62.23 Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/site-packages (from nltk==3.9.1->graphrag) (2024.7.24) 62.23 Requirement already satisfied: tqdm in /usr/local/lib/python3.10/site-packages (from nltk==3.9.1->graphrag) (4.66.5) 62.23 Requirement already satisfied: joblib in /usr/local/lib/python3.10/site-packages (from nltk==3.9.1->graphrag) (1.4.2) 62.23 Requirement already satisfied: click in /usr/local/lib/python3.10/site-packages (from nltk==3.9.1->graphrag) (8.1.7) 62.34 Collecting llvmlite<0.44,>=0.43.0dev0 62.36 Downloading llvmlite-0.43.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (42.9 MB) 87.90 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.9/42.9 MB 1.8 MB/s eta 0:00:00 88.05 Requirement already satisfied: tabulate in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (0.9.0) 88.05 Requirement already satisfied: lxml in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) 
(5.3.0) 88.07 Collecting python-oxmsg 88.09 Downloading python_oxmsg-0.0.1-py3-none-any.whl (31 kB) 88.18 Collecting filetype 88.20 Downloading filetype-1.2.0-py2.py3-none-any.whl (19 kB) 88.20 Requirement already satisfied: dataclasses-json in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (0.6.7) 88.24 Collecting python-magic 88.25 Downloading python_magic-0.4.27-py2.py3-none-any.whl (13 kB) 88.30 Collecting unstructured-client 88.32 Downloading unstructured_client-0.25.6-py3-none-any.whl (45 kB) 88.33 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.1/45.1 kB 3.0 MB/s eta 0:00:00 88.36 Collecting langdetect 88.38 Downloading langdetect-1.0.9.tar.gz (981 kB) 88.90 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 981.5/981.5 kB 1.9 MB/s eta 0:00:00 88.95 Preparing metadata (setup.py): started 89.47 Preparing metadata (setup.py): finished with status 'done' 89.47 Requirement already satisfied: requests in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (2.32.3) 89.47 Requirement already satisfied: psutil in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (6.0.0) 89.47 Requirement already satisfied: backoff in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (2.2.1) 89.47 Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (4.12.3) 89.47 Requirement already satisfied: wrapt in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (1.16.0) 90.43 Collecting rapidfuzz 90.45 Downloading rapidfuzz-3.9.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB) 91.47 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 1.6 MB/s eta 0:00:00 91.47 Requirement already satisfied: chardet in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (5.2.0) 91.51 Collecting python-iso639 91.54 Downloading python_iso639-2024.4.27-py3-none-any.whl (274 kB) 91.70 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 
274.7/274.7 kB 1.6 MB/s eta 0:00:00 91.75 Collecting emoji 91.77 Downloading emoji-2.12.1-py3-none-any.whl (431 kB) 92.07 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 431.4/431.4 kB 1.4 MB/s eta 0:00:00 92.14 Collecting effdet 92.16 Downloading effdet-0.4.1-py3-none-any.whl (112 kB) 92.24 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 112.5/112.5 kB 1.6 MB/s eta 0:00:00 92.29 Collecting google-cloud-vision 92.32 Downloading google_cloud_vision-3.7.4-py2.py3-none-any.whl (467 kB) 92.55 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 467.5/467.5 kB 2.0 MB/s eta 0:00:00 92.56 Requirement already satisfied: openpyxl in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (3.1.5) 92.58 Collecting pdfminer.six 92.61 Downloading pdfminer.six-20240706-py3-none-any.whl (5.6 MB) 95.48 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.6/5.6 MB 2.0 MB/s eta 0:00:00 95.50 Collecting python-pptx>=1.0.1 95.52 Downloading python_pptx-1.0.2-py3-none-any.whl (472 kB) 95.70 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 472.8/472.8 kB 2.7 MB/s eta 0:00:00 95.82 Collecting pi-heif 96.07 Downloading pi_heif-0.18.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (932 kB) 96.48 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 932.3/932.3 kB 2.3 MB/s eta 0:00:00 96.53 Collecting pdf2image 96.55 Downloading pdf2image-1.17.0-py3-none-any.whl (11 kB) 96.60 Collecting unstructured.pytesseract>=0.3.12 96.62 Downloading unstructured.pytesseract-0.3.13-py3-none-any.whl (14 kB) 96.67 Collecting pypandoc 96.68 Downloading pypandoc-1.13-py3-none-any.whl (21 kB) 96.75 Collecting unstructured-inference==0.7.36 96.77 Downloading unstructured_inference-0.7.36-py3-none-any.whl (56 kB) 96.79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.4/56.4 kB 3.0 MB/s eta 0:00:00 96.79 Requirement already satisfied: pypdf in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (4.2.0) 96.79 Requirement already satisfied: python-docx>=1.1.2 in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) 
(1.1.2) 96.79 Requirement already satisfied: markdown in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (3.7) 96.88 Collecting onnx 96.89 Downloading onnx-1.16.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (15.8 MB) 106.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.8/15.8 MB 1.6 MB/s eta 0:00:00 106.4 Requirement already satisfied: pandas in /usr/local/lib/python3.10/site-packages (from unstructured[all-docs]) (2.2.2) 106.4 Collecting xlrd 106.4 Downloading xlrd-2.0.1-py2.py3-none-any.whl (96 kB) 106.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.5/96.5 kB 1.6 MB/s eta 0:00:00 107.3 Collecting pikepdf 107.3 Downloading pikepdf-9.2.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.3 MB) 109.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 1.3 MB/s eta 0:00:00 109.2 Collecting layoutparser 109.2 Downloading layoutparser-0.3.4-py3-none-any.whl (19.2 MB) 118.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.2/19.2 MB 2.3 MB/s eta 0:00:00 118.3 Requirement already satisfied: onnxruntime>=1.17.0 in /usr/local/lib/python3.10/site-packages (from unstructured-inference==0.7.36->unstructured[all-docs]) (1.19.0) 118.3 Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/site-packages (from unstructured-inference==0.7.36->unstructured[all-docs]) (0.24.6) 118.3 Collecting timm 118.4 Downloading timm-1.0.9-py3-none-any.whl (2.3 MB) 119.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 2.0 MB/s eta 0:00:00 119.5 Requirement already satisfied: python-multipart in /usr/local/lib/python3.10/site-packages (from unstructured-inference==0.7.36->unstructured[all-docs]) (0.0.9) 119.6 Collecting torch 119.6 Downloading torch-2.4.0-cp310-cp310-manylinux2014_aarch64.whl (89.8 MB) 170.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.8/89.8 MB 1.3 MB/s eta 0:00:00 170.6 Collecting opencv-python!=4.7.0.68 170.6 Downloading opencv_python-4.10.0.84-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (41.7 MB) 
201.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.7/41.7 MB 990.3 kB/s eta 0:00:00 201.6 Requirement already satisfied: matplotlib in /usr/local/lib/python3.10/site-packages (from unstructured-inference==0.7.36->unstructured[all-docs]) (3.9.2) 201.7 Collecting transformers>=4.25.1 201.7 Downloading transformers-4.44.2-py3-none-any.whl (9.5 MB) 209.5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.5/9.5 MB 1.2 MB/s eta 0:00:00 209.5 Requirement already satisfied: azure-core>=1.23.0 in /usr/local/lib/python3.10/site-packages (from azure-identity<2.0.0,>=1.17.1->graphrag) (1.30.2) 209.9 Collecting cryptography>=2.5 209.9 Downloading cryptography-43.0.0-cp39-abi3-manylinux_2_28_aarch64.whl (3.8 MB) 216.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 611.7 kB/s eta 0:00:00 216.2 Collecting msal-extensions>=0.3.0 216.3 Downloading msal_extensions-1.2.0-py3-none-any.whl (19 kB) 216.5 Collecting msal>=1.24.0 216.5 Downloading msal-1.30.0-py3-none-any.whl (111 kB) 216.7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 111.8/111.8 kB 556.9 kB/s eta 0:00:00 216.7 Requirement already satisfied: isodate>=0.6.0 in /usr/local/lib/python3.10/site-packages (from azure-search-documents<12.0.0,>=11.4.0->graphrag) (0.6.1) 216.8 Collecting azure-common>=1.1 216.9 Downloading azure_common-1.1.28-py2.py3-none-any.whl (14 kB) 217.0 Collecting jsonschema<5.0.0,>=4.21.1 217.0 Downloading jsonschema-4.23.0-py3-none-any.whl (88 kB) 217.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 88.5/88.5 kB 790.5 kB/s eta 0:00:00 217.1 Requirement already satisfied: diskcache<6.0.0,>=5.6.3 in /usr/local/lib/python3.10/site-packages (from datashaper<0.0.50,>=0.0.49->graphrag) (5.6.3) 217.3 Collecting pyarrow<16.0.0,>=15.0.0 217.3 Downloading pyarrow-15.0.2-cp310-cp310-manylinux_2_28_aarch64.whl (35.7 MB) 242.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 35.7/35.7 MB 2.2 MB/s eta 0:00:00 242.1 Requirement already satisfied: pygments>=2.15.0 in /usr/local/lib/python3.10/site-packages (from devtools<0.13.0,>=0.12.2->graphrag) 
(2.18.0) 242.1 Requirement already satisfied: asttokens<3.0.0,>=2.0.0 in /usr/local/lib/python3.10/site-packages (from devtools<0.13.0,>=0.12.2->graphrag) (2.4.1) 242.1 Requirement already satisfied: executing>=1.1.1 in /usr/local/lib/python3.10/site-packages (from devtools<0.13.0,>=0.12.2->graphrag) (2.1.0) 242.1 Requirement already satisfied: marshmallow>=3.13.0 in /usr/local/lib/python3.10/site-packages (from environs<12.0.0,>=11.0.0->graphrag) (3.22.0) 242.1 Requirement already satisfied: packaging in /usr/local/lib/python3.10/site-packages (from fastparquet<2025.0.0,>=2024.2.0->graphrag) (23.2) 242.1 Requirement already satisfied: fsspec in /usr/local/lib/python3.10/site-packages (from fastparquet<2025.0.0,>=2024.2.0->graphrag) (2024.6.1) 242.3 Collecting cramjam>=2.3 242.3 Downloading cramjam-2.8.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.7 MB) 243.3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 1.8 MB/s eta 0:00:00 243.6 Collecting gensim<5.0.0,>=4.3.2 243.7 Downloading gensim-4.3.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (26.4 MB) 256.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.4/26.4 MB 2.9 MB/s eta 0:00:00 257.0 Collecting umap-learn<0.6.0,>=0.5.6 257.0 Downloading umap_learn-0.5.6-py3-none-any.whl (85 kB) 257.0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.7/85.7 kB 3.5 MB/s eta 0:00:00 257.1 Collecting seaborn<0.14.0,>=0.13.2 257.1 Downloading seaborn-0.13.2-py3-none-any.whl (294 kB) 257.2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 294.9/294.9 kB 4.4 MB/s eta 0:00:00 257.3 Collecting scikit-learn<2.0.0,>=1.4.2 257.4 Downloading scikit_learn-1.5.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (12.5 MB) 261.6 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.5/12.5 MB 2.9 MB/s eta 0:00:00 262.0 Collecting graspologic-native<2.0.0,>=1.2.1 262.0 Downloading graspologic_native-1.2.1.tar.gz (2.5 MB) 262.8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 3.0 MB/s eta 0:00:00 262.9 Installing build 
dependencies: started
267.6 Installing build dependencies: finished with status 'done'
267.6 Getting requirements to build wheel: started
267.7 Getting requirements to build wheel: finished with status 'done'
267.7 Preparing metadata (pyproject.toml): started
267.8 Preparing metadata (pyproject.toml): finished with status 'error'
267.8 error: subprocess-exited-with-error
267.8
267.8 × Preparing metadata (pyproject.toml) did not run successfully.
267.8 │ exit code: 1
267.8 ╰─> [6 lines of output]
267.8 Checking for Rust toolchain....
267.8
267.8 Cargo, the Rust package manager, is not installed or is not on PATH.
267.8 This package requires Rust and Cargo to compile extensions. Install it through
267.8 the system's package manager or via https://rustup.rs/
267.8
267.8 [end of output]
267.8
267.8 note: This error originates from a subprocess, and is likely not a problem with pip.
267.8 error: metadata-generation-failed
267.8
267.8 × Encountered error while generating package metadata.
267.8 ╰─> See above for output.
267.8
267.8 note: This is an issue with the package mentioned above, not pip.
267.8 hint: See above for details.
267.8
267.8 [notice] A new release of pip is available: 23.0.1 -> 24.2
267.8 [notice] To update, run: pip install --upgrade pip

Dockerfile:33
31 | RUN --mount=type=ssh pip install -e "libs/kotaemon[all]"
32 | RUN --mount=type=ssh pip install -e "libs/ktem"
33 | >>> RUN pip install graphrag future unstructured[all-docs]
34 | RUN pip install "pdfservices-sdk@git+https://github.com/niallcm/pdfservices-python-sdk.git@bump-and-unfreeze-requirements"
35 |

ERROR: failed to solve: process "/bin/sh -c pip install graphrag future unstructured[all-docs]" did not complete successfully: exit code: 1

Note: I am running docker build on macOS Sonoma 14.6.1 (M2 Pro).

Reproduction steps

docker build -t acrregistry.azureacr.io/kotaemon:latest .

Screenshots

No response

Logs

No response

Browsers

No response

OS

MacOS

Additional information

No response

bkbas26 commented 1 week ago

Can someone help me with this?

cin-niko commented 1 week ago

@bkbas26, can I see your Dockerfile? (The Dockerfile in the main branch doesn't install unstructured[all-docs].) In the meantime, you can try this solution first: https://github.com/Cinnamon/kotaemon/pull/219/files

bkbas26 commented 1 week ago

@cin-niko Thanks for your reply. Below is the Dockerfile that I used for the build.

FROM python:3.10-slim as base_image

RUN apt update -qqy \
  && apt install -y \
  ssh git \
  gcc g++ \
  poppler-utils \
  libpoppler-dev \
  tesseract-ocr \
  tesseract-ocr-jpn \
  libsm6 \
  libxext6 \
  ffmpeg \
  libmagic-dev \
  && \
  apt-get clean && \
  apt-get autoremove

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONIOENCODING=UTF-8

WORKDIR /app

FROM base_image as dev

COPY . /app
RUN --mount=type=ssh pip install -e "libs/kotaemon[all]"
RUN --mount=type=ssh pip install -e "libs/ktem"
RUN pip install graphrag future unstructured[all-docs]
RUN pip install "pdfservices-sdk@git+https://github.com/niallcm/pdfservices-python-sdk.git@bump-and-unfreeze-requirements"

EXPOSE 7860

ENTRYPOINT ["gradio", "app.py"]

Yes, I included unstructured[all-docs] based on this PR (https://github.com/Cinnamon/kotaemon/pull/165/files). I tried your suggestion of adding

# Install Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

and I still get the same error. Below is the updated Dockerfile that I tried to build.

# syntax=docker/dockerfile:1.0.0-experimental
FROM python:3.10-slim as base_image

# for additional file parsers

# tesseract-ocr \
# tesseract-ocr-jpn \
# libsm6 \
# libxext6 \
# ffmpeg \

RUN apt-get update -qqy && \
    apt-get install -y --no-install-recommends \
      ssh \
      git \
      gcc \
      g++ \
      poppler-utils \
      libpoppler-dev \
    && apt-get clean \
    && apt-get autoremove \
    && rm -rf /var/lib/apt/lists/*

# Install Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONIOENCODING=UTF-8

WORKDIR /app

FROM base_image as dev

COPY . /app
RUN --mount=type=ssh pip install --no-cache-dir -e "libs/kotaemon[all]" \
    && pip install --no-cache-dir -e "libs/ktem" \
    && pip install --no-cache-dir graphrag future \
    && pip install --no-cache-dir "pdfservices-sdk@git+https://github.com/niallcm/pdfservices-python-sdk.git@bump-and-unfreeze-requirements"

ENTRYPOINT ["gradio", "app.py"]

ducminhle commented 1 week ago

@bkbas26, please check this: https://github.com/Cinnamon/kotaemon/pull/219#discussion_r1744235014

cin-niko commented 1 week ago

@bkbas26, I checked it. Please add cargo to the apt-get install line in your Dockerfile and try building the Docker image again.

# syntax=docker/dockerfile:1.0.0-experimental
FROM python:3.10-slim as base_image

# for additional file parsers

# tesseract-ocr \
# tesseract-ocr-jpn \
# libsm6 \
# libxext6 \
# ffmpeg \

RUN apt-get update -qqy && \
    apt-get install -y --no-install-recommends \
      ssh \
      git \
      gcc \
      g++ \
      poppler-utils \
      libpoppler-dev \
      cargo \
    && apt-get clean \
    && apt-get autoremove \
    && rm -rf /var/lib/apt/lists/*

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONIOENCODING=UTF-8

WORKDIR /app

FROM base_image as dev

COPY . /app
RUN --mount=type=ssh pip install --no-cache-dir -e "libs/kotaemon[all]" \
    && pip install --no-cache-dir -e "libs/ktem" \
    && pip install --no-cache-dir graphrag future unstructured[all-docs] \
    && pip install --no-cache-dir "pdfservices-sdk@git+https://github.com/niallcm/pdfservices-python-sdk.git@bump-and-unfreeze-requirements"

ENTRYPOINT ["gradio", "app.py"]
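
Whichever way Rust gets installed, it can help to fail fast when `cargo` is not visible on PATH, rather than finding out minutes later inside the pip step. One possible explanation for the rustup attempt earlier in this thread not helping, which the thread itself does not confirm, is that `curl` is missing from the slim base image. In `curl … | sh` the step's exit status comes from `sh`, so a missing `curl` would not stop the build. Below is a minimal sketch of such a check. The `require_tools` helper name is hypothetical and is not part of the kotaemon repository.

```shell
#!/bin/sh
# Hypothetical helper (not from the kotaemon repo): returns non-zero if any
# of the named tools cannot be found on PATH, and reports each missing one.
require_tools() {
    missing=0
    for tool in "$@"; do
        # `command -v` is the POSIX way to ask whether a command resolves.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In a Dockerfile, the same idea could be as small as a `RUN command -v cargo` line placed just before the pip install, so that the build stops at that step with a clear cause if the toolchain is absent.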