bmwoodruff opened 1 month ago
If we have something generate examples, I was thinking it might be nice to have it generate LOTS of examples (not just 1 or 2). Then we could ask it to rank the examples, providing reasons for the ranking, and eventually have it ask us to select the ones to include. If we start with 10 or so examples and reduce to the best 3, then perhaps those get included in a PR, which allows the maintainers to reduce it to 1 or 2. This will require more feedback from the maintainers.
First, we need to produce something, so I think focusing on adding a single example to a single method, all done algorithmically, would be a great step.
I'll take a look at 1 and 5 to see what I can do there.
I'll work on 2
@andrewtggreene I'm thinking of a different way to generate examples. I don't think we have to generate both the example and the output. Instead, we generate a bunch of examples. Then these examples are fed into numpy directly, saving the output. Finally, we combine the two.
The point would then be to generate a variety of examples, and have AI help create examples that utilize different inputs. By not having it try to generate the solution (which would be pointless for some of the newer functions), we save time.
In addition, as I've played with this, I'm thinking that we need to provide not just the function definition, but also the fully qualified function name. There is no way the AI will know it has to access numpy.linalg.svdvals if I just provide it the function definition. I think it's fair to provide the full name.
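The "run the generated examples against numpy and save the output" step could be sketched as follows. This is a minimal, hypothetical helper (the name `run_example` is a placeholder, not anything we've agreed on): the AI only proposes the input lines, and the real output comes from executing them.

```python
import contextlib
import io


def run_example(code: str) -> str:
    """Execute a generated example and capture what it prints.

    The captured output can then be combined with the input lines
    to form a complete docstring example.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # fresh namespace for each example
    return buffer.getvalue()


# An AI-generated example (inputs only, no expected output):
example = "import numpy as np\nx = np.array([1, 2, 3])\nprint(np.sum(x))"
print(run_example(example))  # prints 6
```

Because the output is produced by the installed numpy rather than guessed by the model, this also works for newer functions the model has never seen.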
@bmwoodruff I agree. After reviewing the article on unit testing, I was thinking of running an initial prompt using few-shot prompting, then running the examples the AI creates and crafting a new prompt based on those results. But I think just running the initial examples and adding the result to the prompt would be more efficient.
I also ran into some issues with the AI needing to know where the files are located. I was able to craft a notebook that could run some of the prompts Llama3 was producing, but most had errors when run, either from not knowing how to access the function or from reusing the same variable throughout the examples. I think it would be helpful to add a catch for code that doesn't run properly and ask the AI to evaluate fixes.
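The catch-and-repair idea above could be sketched like this. `ask_model` is a placeholder for whatever LLM call we end up using (Llama3, GPT-4, ...); nothing here is settled.

```python
import traceback


def run_with_repair(code: str, ask_model, max_attempts: int = 3) -> str:
    """Try to exec generated example code; on failure, feed the
    traceback back to the model and ask for a corrected version.

    `ask_model(prompt) -> str` is a hypothetical LLM wrapper.
    """
    for _ in range(max_attempts):
        try:
            exec(code, {})  # fresh namespace each attempt
            return code     # ran cleanly: keep this version
        except Exception:
            tb = traceback.format_exc()
            code = ask_model(
                "This example failed with the error below. "
                "Return a corrected version.\n\n"
                f"Code:\n{code}\n\nError:\n{tb}"
            )
    raise RuntimeError("example could not be repaired")


# Tiny demonstration with a stub "model" that always returns a fix:
fixed = run_with_repair("raise ValueError('boom')",
                        lambda prompt: "x = 1 + 1")
print(fixed)
```

Capping the attempts keeps a stubbornly broken example from looping forever; anything that exhausts its retries can simply be dropped from the candidate pool.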
I was able to get a fairly decent script that ran very well on the 70B model and produces some very workable responses. I tried to wrap everything in a function so we can create an object that produces the examples for us. I'll be adding the script to this repo tomorrow. I think it's a pretty decent place to start.
@bmwoodruff I think using a modular design for this class structure would be more systematic. Based on my research, I came up with a structure we can use to guide us as we engineer the script. I also have a sample script generated by ChatGPT-4 that I'm evaluating to get ideas on how it executes the entire process.
Here's a step-by-step approach:

Class design:
- `Extractor`: extracts and identifies functions needing examples or docstrings.
- `Injector`: injects the content into the identified places in the code.
- `PRCreator`: creates a pull request with the changes.
- `Manager`: manages the overall process.
Here is the script (note: it needs PyGithub installed for the `Github` import, and it assumes multi-line docstrings):

```python
import re
import os
import inspect

from github import Github, GithubException  # PyGithub


class Extractor:
    def __init__(self, module):
        self.module = module

    def find_missing_examples(self):
        missing_examples = []
        for name, obj in inspect.getmembers(self.module):
            if inspect.isfunction(obj) or inspect.isclass(obj):
                docstring = inspect.getdoc(obj)
                if docstring:
                    example_present = re.search(r'Examples?\n-+\n', docstring)
                    if not example_present:
                        missing_examples.append((name, obj))
        return missing_examples


class Injector:
    def __init__(self, code_dir):
        self.code_dir = code_dir

    def inject_example(self, obj, example_text):
        source_file = inspect.getfile(obj)
        with open(source_file, 'r') as file:
            code = file.readlines()
        obj_name = obj.__name__
        start_line = 0
        for i, line in enumerate(code):
            if re.search(rf'def {re.escape(obj_name)}\(', line) or \
                    re.search(rf'class {re.escape(obj_name)}\(', line):
                start_line = i
                break
        indent = ' ' * (len(code[start_line]) - len(code[start_line].lstrip()))
        example_section = f'\n{indent}Examples\n{indent}--------\n{example_text}\n'
        # Find the opening and closing triple quotes of the docstring,
        # then insert the example section just before the closing quotes.
        docstring_start = start_line
        while docstring_start < len(code) and '"""' not in code[docstring_start]:
            docstring_start += 1
        docstring_end = docstring_start + 1
        while docstring_end < len(code) and '"""' not in code[docstring_end]:
            docstring_end += 1
        code.insert(docstring_end, example_section)
        with open(source_file, 'w') as file:
            file.writelines(code)

    def add_examples(self, missing_examples):
        example_text = """
        Example usage:
        >>> import numpy as np
        >>> x = np.array([1, 2, 3])
        >>> np.sum(x)
        6
        """
        for name, obj in missing_examples:
            self.inject_example(obj, example_text)


class PRCreator:
    def __init__(self, repo_name, branch_name, commit_message,
                 pr_title, pr_body, token):
        self.repo_name = repo_name
        self.branch_name = branch_name
        self.commit_message = commit_message
        self.pr_title = pr_title
        self.pr_body = pr_body
        self.token = token

    def create_pr(self):
        g = Github(self.token)
        try:
            repo = g.get_repo(self.repo_name)
            repo.create_git_ref(ref=f"refs/heads/{self.branch_name}",
                                sha=repo.get_branch("main").commit.sha)
            contents = repo.get_contents("")
            for content_file in contents:
                repo.create_file(content_file.path, self.commit_message,
                                 content_file.decoded_content,
                                 branch=self.branch_name)
            pr = repo.create_pull(title=self.pr_title, body=self.pr_body,
                                  head=self.branch_name, base="main")
            print(f"Pull request created: {pr.html_url}")
        except GithubException as e:
            print(f"Failed to create PR: {e}")


class Manager:
    def __init__(self, module, code_dir, repo_name, branch_name,
                 commit_message, pr_title, pr_body, token):
        self.extractor = Extractor(module)
        self.injector = Injector(code_dir)
        self.pr_creator = PRCreator(repo_name, branch_name, commit_message,
                                    pr_title, pr_body, token)

    def run(self):
        missing_examples = self.extractor.find_missing_examples()
        if missing_examples:
            self.injector.add_examples(missing_examples)
            self.pr_creator.create_pr()
        else:
            print("No missing examples found.")


if __name__ == "__main__":
    import numpy as np

    MODULE = np
    CODE_DIR = "/path/to/numpy/code"  # Adjust to your local path
    REPO_NAME = "your-username/your-repo"
    BRANCH_NAME = "add-examples"
    COMMIT_MESSAGE = "Add examples to missing docstrings"
    PR_TITLE = "Add examples to missing docstrings"
    PR_BODY = "This PR adds examples to the missing docstrings in the code."
    TOKEN = "your-github-token"

    manager = Manager(MODULE, CODE_DIR, REPO_NAME, BRANCH_NAME,
                      COMMIT_MESSAGE, PR_TITLE, PR_BODY, TOKEN)
    manager.run()
```
@bmwoodruff I took some time to find a way to mask the token variable used to authenticate with the GitHub API, but so far I have not figured out how to hide it. To use the PRCreator class, you need to provide a valid GitHub token when creating an instance. So if we decide to drop the token variable, we won't have a working class that can automate the pull requests.
@otieno-juma we usually store the token in environment variables and read it with the os.environ mapping. E.g.:
```python
# my_script.py
import os

TOKEN = os.environ['GH_TOKEN']
```
Then:
```shell
# In the terminal running your script
export GH_TOKEN=<your token>
python my_script.py
```
You only have to call this `export` once; all subsequent commands in that shell will already see the variable. To have this working in a CI/CD pipeline it's necessary to create a secret. Creating and managing secrets is very easy with GitHub Actions.
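A slightly more defensive variant of the environment lookup could fail with an actionable message when the token is missing. This is just a sketch; the helper name and the `GH_TOKEN` variable are the conventions used in this thread, not anything the script requires:

```python
import os


def get_token(var: str = "GH_TOKEN") -> str:
    """Read the GitHub token from the environment, failing loudly
    if it hasn't been exported (or set as a CI secret)."""
    token = os.environ.get(var)
    if not token:
        raise RuntimeError(
            f"{var} is not set; export it in your shell or add it "
            "as a repository secret in GitHub Actions."
        )
    return token
```

`PRCreator` could then be constructed with `PRCreator(..., token=get_token())`, keeping the secret out of the source entirely.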
@luxedo is it possible to arrange a brief sync with you this coming week to discuss this?
I'd love help brainstorming and organizing a class structure for creating PRs that inject either docstrings or examples into existing code. Here are some of the things that need to happen. I'll focus on example creation in the ideas below. A similar modular design would exist for docstrings.