Closed drunkwcodes closed 2 months ago
Hey @drunkwcodes! I'm here to help you with any bugs, questions, or contributions you have. Let's tackle this issue together!
To resolve the "ModuleNotFoundError: No module named 'pandas.compat'" error when using SimpleDirectoryReader with MarkdownReader, you can configure SimpleDirectoryReader to exclude readers that depend on pandas. Here's how you can do it:
from llama_index.core.readers.file.base import SimpleDirectoryReader
from llama_index.readers.file.markdown import MarkdownReader

# Define a custom file_extractor that excludes Pandas readers
custom_file_extractor = {
    # file_extractor expects reader instances, not classes
    ".md": MarkdownReader(),
    # Add other readers as needed, excluding those that require pandas
}

# Initialize SimpleDirectoryReader with the custom file_extractor
reader = SimpleDirectoryReader(
    input_files=["/home/drunkwcodes/projects/myfoam/kb/package_manager/pdm.md"],
    file_extractor=custom_file_extractor,
)

# Use the reader to read files
documents = reader.load_data()
This configuration ensures that SimpleDirectoryReader uses MarkdownReader for .md files without requiring pandas [1].
Not working.
ImportError Traceback (most recent call last)
File ~/projects/bb7/.venv/lib/python3.12/site-packages/llama_index/core/readers/file/base.py:52, in _try_loading_included_file_formats()
     51 try:
---> 52     from llama_index.readers.file import (
     53         DocxReader,
     54         EpubReader,
     55         HWPReader,
     56         ImageReader,
     57         IPYNBReader,
     58         MarkdownReader,
     59         MboxReader,
...
     89         ".xlsx": PandasExcelReader,
     90     }
     91     return default_file_reader_cls
ImportError: `llama-index-readers-file` package not found
I re-installed the whole venv, and that resolved the issue.
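For anyone hitting the same errors, a quick way to tell whether the venv itself is broken is to try importing the two modules named in the tracebacks above, outside of any RAG code. This is only a minimal sketch: `check_imports` is a hypothetical helper, not part of llama-index, and it assumes the failures show up as ordinary `ImportError`s.

```python
import importlib


def check_imports(module_names):
    """Try importing each module; map name -> None on success, or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Module names taken from the two errors reported in this thread.
    report = check_imports(["pandas.compat", "llama_index.readers.file"])
    for name, error in report.items():
        print(f"{name}: {'OK' if error is None else error}")
```

If either import fails when run with the venv's interpreter, the installation is probably incomplete, and recreating the venv (as I did) is likely the simplest fix.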
Bug Description
I was trying to do a simple markdown RAG:
But it still fails with:
ModuleNotFoundError: No module named 'pandas.compat'
The C extensions are not built when installing from PyPI.
Please purge the pandas dependency for the majority of people who want plain text reading.
Version
0.11.14
Steps to Reproduce
pdm add these:
"llama-index>=0.11.14",
"llama-index-embeddings-huggingface>=0.3.1",
"llama-index-readers-pdf-marker>=0.2.0",
"llama-parse>=0.5.6",
"nest-asyncio>=1.6.0",
"marker-pdf>=0.2.17",
"llama-index-readers-file>=0.2.2",
"pandas[excel]>=2.2.3",
and run this script:
Relevant Logs/Tracebacks
No response