Closed bmwoodruff closed 1 month ago
I'm thinking a report may not be needed. Just inject the code into the proper place in the codebase while working on a branch, and let VS Code's side-by-side Source Control view be the report.
I'm working on building an example extractor to take out the new examples, and only the new examples, from the generated files. I figured I'd report a bit on the intermediate progress.
Issues encountered are tracked in examples/swap_issues.txt (which will be visible when the PR is uploaded).
The convention is `import numpy as np` (not `import numpy as numpy`). There are multiple spots in the codebase where this was not followed. `emath` maps to `lib.scimath` in a lot of files, changing the original docstring. I'm guessing this is because the two functions share the same docstring, and hence no changes should be made.
The matching sometimes struggles with `.` and `_`. It also does not like all caps when coupled with `.` and/or `_`.
That's enough of a report for now. I wanted to keep track of what I'm doing. I think I have enough written to automate direct example injection into the codebase for the 590 functions (followed by human review). We can do this one module at a time. I want to polish up the scripts that do the post-processing first (the functions are currently in examples/extract_new_exmaples.py). Once I get things polished up, I'll add docstrings, get rid of my silly debugging hacks, and then hopefully we can use them to inject a thousand or more examples into the codebase.
I got excited and wanted to share. I'm going to use the fuzzywuzzy
package to do text comparisons.
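To illustrate the kind of similarity scoring involved, here is a sketch using the stdlib `difflib` as a stand-in for `fuzzywuzzy` (this is not the actual script; `fuzz.ratio` uses a related but not identical algorithm):

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-100 similarity score, loosely comparable to fuzz.ratio."""
    return round(SequenceMatcher(None, a, b).ratio() * 100)

# Identical strings score 100; unrelated strings score near 0.
print(similarity("np.fft.fft(a)", "np.fft.fft(a)"))
print(similarity("abc", "xyz"))
```

A score threshold (like the 70% discussed below) then decides whether a generated example is "the same" as existing docstring text.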
Well, I made a lot of progress. fuzzywuzzy did not quite match some things (I saw one match at 66%), so lots of repeated text got placed at the end of the examples section. I'm considering lowering the threshold from 70 to 65, but I need a metric in place to measure output-length changes before I do this. We have code now that does the following:
(one of them being the preferred `rng = np.random.default_rng()` construction). I think those are the key bits for an automated workflow. I'll work on polishing it up tomorrow.
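For context, the `default_rng` construction replaces the legacy global-seed API in new docstring examples. A minimal before/after sketch (the seed value here is illustrative only, for reproducibility):

```python
import numpy as np

# Legacy pattern that generated examples should avoid:
#   np.random.seed(0)
#   np.random.rand(3)

# Preferred pattern for new docstring examples:
rng = np.random.default_rng(0)
sample = rng.random(3)
print(sample.shape)
```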
I'm going to work on "Algorithmically locate the proper spot in the numpy codebase to insert examples, and then insert examples" next.
Thoughts:
Use `docstring = eval(module + '.' + func + '.__doc__')`, same as in prompt_generator.py. Replace `prompt_examples` with `cleaned_output_examples`. If this fails (the `prompt_examples` text is not found), then that hopefully means the docstring was updated via a pull request since generating the examples. This should impact very few functions. Test this first on a bunch of docstrings and generate a success report.

Use `search_and_replace_phrase` in example_post_processing.py to test insertion of examples into an entire module. Use trackinglists/log files to automate this for a module. The testing function which does the insertion needs to generate a list that includes success/fail and which file was changed.

I think sending in 1000+ examples to be reviewed at once will not be wanted, but I think tackling an entire module (with the exception of `ma` and `np`) could be desirable. Then if at some point the devs want to remove the AI-generated examples (maybe legal issues will hit us all), it can be done easily.
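As an aside, the `eval`-based docstring lookup can also be done with `importlib`, which avoids evaluating a constructed string. A sketch (not the code in prompt_generator.py; `module` is assumed to be an importable name and `func` an attribute on it):

```python
import importlib

def get_docstring(module, func):
    """Look up module.func.__doc__ without eval."""
    mod = importlib.import_module(module)
    return getattr(mod, func).__doc__

# Example lookup against the stdlib:
print(get_docstring("math", "sqrt") is not None)
```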
A consistent commit message would be nice. Here is a proposed option adapted from a discussion with @otieno-juma:
DOC: AI-Gen examples for ...
Examples created by Llama3-70B. Reviewed and modified as part of POSSEE.
Co-authored-by: Ben Woodruff <bmwoodruff@gmail.com>
[skip actions] [skip azp] [skip cirrus]
Not sure if adding my name to all of them is needed. My thought is that this would provide a `git blame` trail that includes me in addition to the interns, if someone wants more information.
PRs #98, #100, and #101 are all related to this task. #101 took forever, as I could not figure out why escape characters were being removed. I'll record what I learned here, as all solutions from AI were garbage (and hopefully it will find this content when trained in the future).
You can see the problem with disappearing escape sequences with the following minimal example.
```python
import re

content = 'String with \n new lines and \\\\ some \\ backslashes.\n We need a few more \\\\ to help \\ see the problem.'
old_phrase = 'String with \n new lines and \\\\ some \\ backslashes.'
new_phrase = 'String with \n new lines and \\\\ some \\ backslashes.'
pattern = re.compile(re.escape(old_phrase), re.MULTILINE)
new_content = pattern.sub(new_phrase, content)
new_content
```
The output string is below, and you can clearly see how the replaced content has now lost half of its escape characters.
```
'String with \n new lines and \\ some \\ backslashes.\n We need a few more \\\\ to help \\ see the problem.'
```
To fix this, just replace all `\\` with `\\\\` in `new_phrase` before using `pattern.sub`. The updated code is:
```python
import re

content = 'String with \n new lines and \\\\ some \\ backslashes.\n We need a few more \\\\ to help \\ see the problem.'
old_phrase = 'String with \n new lines and \\\\ some \\ backslashes.'
new_phrase = 'String with \n new lines and \\\\ some \\ backslashes.'
pattern = re.compile(re.escape(old_phrase), re.MULTILINE)
new_content = pattern.sub(new_phrase.replace('\\', '\\\\'), content)
new_content
```
The output is now the correct string:
```
'String with \n new lines and \\\\ some \\ backslashes.\n We need a few more \\\\ to help \\ see the problem.'
```
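An alternative that sidesteps the escaping entirely (a sketch, not what the scripts use): pass a callable to `pattern.sub`. The return value of a replacement function is inserted verbatim, with no escape processing of the replacement text:

```python
import re

content = 'String with \n new lines and \\\\ some \\ backslashes.\n We need a few more \\\\ to help \\ see the problem.'
old_phrase = 'String with \n new lines and \\\\ some \\ backslashes.'
new_phrase = 'String with \n new lines and \\\\ some \\ backslashes.'

pattern = re.compile(re.escape(old_phrase), re.MULTILINE)
# A callable replacement is inserted literally, so all backslashes survive.
new_content = pattern.sub(lambda m: new_phrase, content)
# old_phrase and new_phrase are identical here, so content is unchanged:
print(new_content == content)
```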
Wasted almost a day on this ...
What's left?
These I think are crucial before it's ready to use:
There is an issue with `linalg.svd`. Gotta track this down. Hopefully easy. These are polishing updates to work on after we get something working:
`spin lint` captures these, and they can be manually adjusted. Fixing this could be done with `black` for generated code and `textwrap` for generated text, but I'm still not sure about generated output. Using `spin lint` is simple for now. PRs #102, #103, and #104 are also connected to this task.
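As a sketch of the `textwrap` idea for generated prose (the 75-column width here is an assumption, not necessarily NumPy's exact docstring limit):

```python
import textwrap

generated = ("This explanatory sentence produced by the model runs well past "
             "the line length that docstring style guidelines typically allow.")
# Rewrap to the target width; breaks only at whitespace.
wrapped = textwrap.fill(generated, width=75)
print(all(len(line) <= 75 for line in wrapped.splitlines()))
```

Generated code would go through `black` instead, since rewrapping code at arbitrary whitespace would break it.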
As I'm wrapping up automation, I have some new thoughts.
Right now we have examples, lots of them, that we could add into the codebase. Whether we add them or not was not the goal for this project; rather, our goal was to get a proof of concept showing how it can be done. We can do it now.
Rather than inject these examples into the codebase, why not wait? We might as well inject examples for the functions that are missing examples, but we can postpone mass example inclusion for now. We have a POC. Now we can refine the prompt(s) and see if there is buy-in from the maintainers.
I do think we should go through and clean a few modules up, with branches fully ready to be included in the main namespace, just so we know how much time that will take and we can showcase an example of the whole process. My thoughts there are:
This means that each branch would have 2 commits. The AI-gen commits will most likely not pass tests.
I think it's better not to push to a branch instantly; rather, leave the changes uncommitted so that it's simple to see what's been changed. I'll close this as done.
Description:
With the generated example logs created (almost 1000), we need a way to automate processing them. This entails multiple things.
Note that some examples suggest an idea for an example, but the actual example is garbage. For these, I'd like a special note that suggests human intervention to preserve the idea.
In other places, the way to call a function has preferred settings: for example, starting with
rng = np.random.default_rng()
and then going from there.
Acceptance Criteria:
tools/review/reviewtools.py). Algorithmically create a report that easily shows old/new for quick tech lead review.
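One possible way to build such an old/new report is a unified diff via the stdlib `difflib` (a sketch, not the actual review tooling; the docstrings below are made up for illustration):

```python
import difflib

old_doc = "Examples\n--------\n>>> np.abs(-1)\n1\n"
new_doc = ("Examples\n--------\n>>> np.abs(-1)\n1\n"
           ">>> np.abs([-1, 2])\narray([1, 2])\n")

# Lines added by the example injection show up prefixed with "+".
report = "\n".join(difflib.unified_diff(
    old_doc.splitlines(), new_doc.splitlines(),
    fromfile="old_docstring", tofile="new_docstring", lineterm=""))
print(report)
```

Concatenating one such diff per modified function would give a tech lead a single skimmable file per module.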