run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
36.55k stars, 5.23k forks

[Question]: Does SimpleDirectoryReader intentionally not support Path for the input_dir argument? #16030

Closed: andrewwkimm closed this issue 1 month ago

andrewwkimm commented 1 month ago

Question

I've noticed that input_dir only takes str, not Path, as an argument and wanted to confirm whether this is intentional. I expected that if I pass in a pathlib object, load_data() would work, e.g.:

from pathlib import Path
from llama_index.core import SimpleDirectoryReader

dir_path = Path("workspaces/some_dir")
documents = SimpleDirectoryReader(dir_path).load_data()

Happy to take this on myself—let me know if this is a worthwhile task and I can get right to it!

dosubot[bot] commented 1 month ago

Hey @andrewwkimm! I'm here to help you with any bugs, questions, or contributions you have in mind. Just give me a shout, and I'll be right with you.

Yes, SimpleDirectoryReader intentionally does not support Path for the input_dir argument. The input_dir argument is explicitly defined as a string (input_dir: Optional[str] = None), and the code checks if the provided input_dir is a valid directory using self.fs.isdir(input_dir). This indicates that the class expects input_dir to be a string representing the path to the directory [1][2].

If you are interested in modifying SimpleDirectoryReader to accept Path objects, it could be a worthwhile task. However, keep in mind the potential implications such as compatibility issues, type handling, and the need to update existing tests [1][2][3][4].
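As a rough illustration of what such a change might involve, here is a minimal sketch of normalizing a str-or-Path argument to a plain string before any directory check runs. The helper name normalize_input_dir and the PathLike alias are hypothetical and not part of llama_index; this only shows the general technique, not how the library does or should implement it.

```python
import os
from pathlib import Path
from typing import Optional, Union

# Hypothetical alias and helper, NOT part of llama_index. They only
# illustrate accepting either a str or a pathlib.Path for a directory.
PathLike = Union[str, Path]


def normalize_input_dir(input_dir: Optional[PathLike]) -> Optional[str]:
    """Return input_dir as a plain str, leaving None untouched."""
    if input_dir is None:
        return None
    # os.fspath returns a str unchanged and converts a Path to its
    # string form, so later string-based checks keep working.
    return os.fspath(input_dir)


dir_path = Path("workspaces/some_dir")
as_text = normalize_input_dir(dir_path)
print(type(as_text).__name__)  # prints "str"
```

Assuming the reader only accepts strings as described above, the same conversion can be done at the call site in the meantime, for example by passing str(dir_path) instead of dir_path.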
