possee-org / genai-numpy

MIT License

An Example of Few Shot Prompting to generate docstrings with GPT4 #16

Closed bmwoodruff closed 3 months ago

bmwoodruff commented 4 months ago

To keep costs down while experimenting with prompt engineering, I'll design and test prompts with GPT-4 before running anything on Nebari.

NumPy already has many functions with docstrings, so we can use few-shot prompting to show the model what an appropriate NumPy docstring looks like (the examples go into the prompt; no model training is involved). As a first check, we can generate docstrings for functions that already have them and compare the generated output with what's currently in the codebase. This is an experiment with the ideas provided at https://github.com/possee-org/possee-resources/issues/1. If the results hold up, we could then use the same approach to produce docstrings for functions that do not currently have them.

First, we need a tool that extracts a function from a file and separates the function's code from its docstring. Then we run this tool over several functions to build a few-shot prompt, which we can copy/paste into GPT-4 (or send programmatically to Llama3 through the API in Nebari). I've created a Jupyter notebook to do this.
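For anyone who wants to try the idea without the notebook, here is a minimal sketch of the two steps described above. It is not the notebook's actual code: the names `split_function` and `build_few_shot_prompt`, the prompt wording, and the toy `SAMPLE_SOURCE` are all hypothetical, and it assumes Python 3.9+ for `ast.unparse`.

```python
import ast

# Toy stand-in for a real NumPy source file (hypothetical example data).
SAMPLE_SOURCE = '''
def add(a, b):
    """Add two numbers.

    Returns
    -------
    total : number
    """
    return a + b


def sub(a, b):
    """Subtract b from a."""
    return a - b
'''


def split_function(source, name):
    """Return (code_without_docstring, docstring) for the function `name`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == name:
            docstring = ast.get_docstring(node) or ""
            if docstring:
                # The docstring is the first statement of the body; drop it,
                # leaving `pass` if nothing else remains.
                node.body = node.body[1:] or [ast.Pass()]
            # ast.unparse regenerates source text (comments are not preserved).
            return ast.unparse(node), docstring
    raise ValueError(f"function {name!r} not found")


def build_few_shot_prompt(examples, target_code):
    """Assemble a prompt from (code, docstring) pairs plus the target function."""
    parts = ["Write a NumPy-style docstring for the last function.", ""]
    for code, docstring in examples:
        parts += ["Function:", code, "", "Docstring:", docstring, ""]
    # The prompt ends right after "Docstring:" with no trailing newline.
    parts += ["Function:", target_code, "", "Docstring:"]
    return "\n".join(parts)


# Use `add` as the worked example and ask for a docstring for `sub`.
example = split_function(SAMPLE_SOURCE, "add")
target_code, existing_doc = split_function(SAMPLE_SOURCE, "sub")
prompt = build_few_shot_prompt([example], target_code)
print(prompt)
```

Keeping `existing_doc` around makes it easy to compare the model's answer with the docstring already in the codebase. One caveat of this approach: `ast.unparse` normalizes formatting and drops comments, so the code shown to the model is not byte-for-byte what is in the file.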

The generated docstrings were decent, included examples, and had the proper formatting. Things look promising.

bmwoodruff commented 4 months ago

Here are some screenshots of the results. The third option had an extra newline at the bottom of the prompt, which might be why the function code was included in the output as well.

Screenshot 2024-05-10 125914 Screenshot 2024-05-10 125933 Screenshot 2024-05-10 125943 Screenshot 2024-05-10 125953