run-llama / llama-hub

A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
https://llamahub.ai/
MIT License
3.45k stars, 732 forks

Maddataanalyst/add docstring walker #694

Closed: maddataanalyst closed this 11 months ago

maddataanalyst commented 11 months ago

Description

Added a new loader, DocstringWalker, that builds documents from the docstrings in Python modules using the ast library. The main motivation is to let users build code analysis tools (e.g., for generating documentation or an automatic code buddy) without spending tokens on reading the code itself. If the library or module in question is properly documented with docstrings, those should be informative enough.
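To illustrate the general idea (this is not the DocstringWalker code from this PR), here is a minimal sketch of pulling docstrings out of Python source with the standard ast module. The function name collect_docstrings, the SAMPLE source, and the text layout of each entry are assumptions made up for this example.

```python
from __future__ import annotations

import ast


def collect_docstrings(source: str, module_name: str = "<module>") -> list[str]:
    """Return one text entry per documented module, class, or function.

    Hypothetical helper for illustration; not the loader's real API.
    """
    tree = ast.parse(source)
    entries = []

    # The module-level docstring, if any, comes first.
    module_doc = ast.get_docstring(tree)
    if module_doc:
        entries.append(f"Module: {module_name}\n{module_doc}")

    # Visit every node and keep the classes and functions that have a docstring.
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:
                kind = "Class" if isinstance(node, ast.ClassDef) else "Function"
                entries.append(f"{kind}: {node.name}\n{doc}")
    return entries


# Made-up sample source: two documented definitions and one undocumented function.
SAMPLE = '''"""Utilities for greeting."""

class Greeter:
    """Builds greetings."""

    def greet(self, name):
        """Return a greeting for name."""
        return f"Hello, {name}"

def undocumented():
    pass
'''

if __name__ == "__main__":
    for entry in collect_docstrings(SAMPLE, "greeting"):
        print(entry, end="\n\n")
```

Only the docstrings are collected, so undocumented code such as undocumented() contributes nothing to the output; this is the token-saving trade-off described above.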

Fixes # (issue)

Type of Change

Please delete options that are not relevant.

How Has This Been Tested?

New tests were added for DocstringWalker using the standard PyTest setup for Llama-hub. They cover several scenarios, including reading malformed modules, failing fast, and reading multiple documents. The tests run as part of the standard poetry run pytest tests command.
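As a rough sketch of what "malformed modules" and "failing fast" could mean in practice, the snippet below either re-raises the SyntaxError from ast.parse or skips the offending source. The function parse_sources and its fail_on_error flag are hypothetical names for this example, not the loader's confirmed interface.

```python
import ast
from typing import Dict, List


def parse_sources(sources: Dict[str, str], fail_on_error: bool = True) -> List[str]:
    """Return the names of sources that parse cleanly.

    Hypothetical helper: with fail_on_error=True the first malformed source
    raises SyntaxError (fail fast); otherwise malformed sources are skipped.
    """
    parsed = []
    for name, code in sources.items():
        try:
            ast.parse(code)
        except SyntaxError:
            if fail_on_error:
                raise
            continue  # skip the malformed source and keep going
        parsed.append(name)
    return parsed


# Made-up inputs: one valid module and one with a syntax error.
MIXED = {"ok.py": "x = 1\n", "bad.py": "def broken(:\n"}

if __name__ == "__main__":
    print(parse_sources(MIXED, fail_on_error=False))
```

A test for the fail-fast path would assert that SyntaxError is raised, while a test for the lenient path would assert that only the well-formed source survives.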

Suggested Checklist:

jerryjliu commented 11 months ago

this is cool! reviewing now

maddataanalyst commented 11 months ago

> i love this
>
> to make this pop, any chance you're willing to contribute an .ipynb notebook? can add it to the same directory as docstring_walker, and can just use the code example you provided in the README.md. This would allow users to easily try it out themselves on their own data.
>
> (you can also add a "Open in Colab" tag in the notebook that will directly open the notebook up in a Colab environment for one-click testing e.g. https://github.com/run-llama/llama_index/blob/main/docs/examples/llm/ai21.ipynb)

Hello! Thank you so much for the feedback. I had been wrapping my head around this idea for a while and finally got some time to implement it properly. As for the notebook: sure thing, I will add a proper one in a day or two. I have some quick'n'dirty notebooks for testing this; I will clean them up a little and add one to this PR towards the end of the week.

maddataanalyst commented 11 months ago

> to make this pop, any chance you're willing to contribute an .ipynb notebook? can add it to the same directory as docstring_walker, and can just use the code example you provided in the README.md. This would allow users to easily try it out themselves on their own data.

Done, notebook has been added. Thanks a lot for the idea!

jerryjliu commented 11 months ago

the one comment is i'd make the URL in the colab notebook the following

https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/docstring_walker/docstringwalker_example.ipynb

maddataanalyst commented 11 months ago

> the one comment is i'd make the URL in the colab notebook the following
>
> https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/docstring_walker/docstringwalker_example.ipynb

Done, I have added the link that you provided. Previously, Colab automatically added a link to my repo and my current notebook; sorry, I didn't notice that.

maddataanalyst commented 11 months ago

Apart from that - I have fixed some minor errors reported after running black --check .

Should be fine now :)

jerryjliu commented 11 months ago

merged!