UpgrAIder is a tool for automatically updating outdated code snippets, specifically those that use deprecated library APIs. The underlying technique relies on a Large Language Model (hence the "AI" in the name), augmented with information retrieved from release notes. More details about the project can be found in this presentation.
Note that UpgrAIder represents an early exploration of the above technique, and has been made available in open source as a basis for research and exploration.
git clone <this repo>
Install dependencies:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python setup.py develop
Create environment variables:
Create a .env file to hold the required environment variables (including SCRATCH_VENV):
cat > .env <<EOL
OPENAI_API_KEY=...
OPENAI_ORG=...
SCRATCH_VENV=<absolute path to a folder that already has a venv we can activate>
EOL
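Before running anything that calls the model, it can help to confirm that the .env file defines all three variables. The following is a minimal sketch, not part of UpgrAIder: the helper names parse_env_text and missing_vars are hypothetical, and the parser only handles plain KEY=VALUE lines.

```python
# Hypothetical helper, not part of the UpgrAIder code base.
REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_ORG", "SCRATCH_VENV")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(values: dict[str, str]) -> list[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


# Example with an incomplete file: the org is empty and SCRATCH_VENV is absent.
example = "OPENAI_API_KEY=abc\nOPENAI_ORG=\n"
print(missing_vars(parse_env_text(example)))
```

To check a real file, read it with open(".env", encoding="utf-8") and pass its contents to parse_env_text.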
To populate the database with the information of the available release notes for each library, run python src/upgraider/populate_doc_db.py
Note that this is a one-time step (unless you add libraries or release notes). The libraries folder contains information for all current target libraries, including the code examples we evaluate on. Each library folder contains a library.json file that specifies two versions: the base version, which is the library version available around the training date of the model (~ May 2022), and the current version of the library. The base version tells us which release notes to consider (those published after it), while the current version is the one we want to use for our experiments.
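As a rough illustration of how the base and current versions bound the release notes of interest, here is a minimal sketch. It is not the project's code: the key names baseVersion and currentVersion, the library name, and the release list are all hypothetical, and it compares version numbers rather than release dates.

```python
import json


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.5.0' into (1, 5, 0); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def versions_to_consider(library_info: dict, release_versions: list[str]) -> list[str]:
    """Keep releases newer than the base version, up to and including the current one."""
    base = parse_version(library_info["baseVersion"])
    current = parse_version(library_info["currentVersion"])
    selected = [v for v in release_versions if base < parse_version(v) <= current]
    return sorted(selected, key=parse_version)


# Hypothetical library.json content; the real key names may differ.
library_info = json.loads(
    '{"name": "examplelib", "baseVersion": "1.4.2", "currentVersion": "2.0.1"}'
)
releases = ["1.4.0", "1.4.2", "1.5.0", "1.10.0", "2.0.1", "2.1.0"]
print(versions_to_consider(library_info, releases))  # ['1.5.0', '1.10.0', '2.0.1']
```

Comparing parsed tuples rather than raw strings keeps 1.10.0 ordered after 1.5.0.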
Right now, each library folder already contains the release notes between the base and current library version. These were manually retrieved; in the future, it would be useful to create a script that automatically retrieves release notes for a given library.
The populate_doc_db.py script looks for release note sections with certain keywords related to APIs and/or deprecation. It then creates a DB entry with an embedding of the content of each item in those sections.
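The idea of keyword-based section filtering followed by embedding each item can be sketched as follows. This is an illustration under several assumptions, not the actual implementation: the keyword list is hypothetical, keywords are matched against section headings only, the release notes are assumed to use markdown headings and bullet items, and a bag-of-words counter stands in for a real embedding model. The API names in the sample notes are made up.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list; the project's actual keywords are not given here.
KEYWORDS = ("deprecat", "api", "removal", "breaking")


def relevant_items(release_notes: str) -> list[str]:
    """Collect bullet items from sections whose heading mentions a keyword."""
    items, keep = [], False
    for line in release_notes.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            heading = stripped.lstrip("#").strip().lower()
            keep = any(keyword in heading for keyword in KEYWORDS)
        elif keep and stripped.startswith(("-", "*")):
            items.append(stripped[1:].strip())
    return items


def toy_embedding(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


notes = """\
## Deprecations
- `load_data` is deprecated; use `read_data` instead.
- `fetch` is deprecated.
## Bug fixes
- Fixed a crash in `load_data` on empty input.
"""

# One "DB entry" per item: the item text plus its embedding.
db = [(item, toy_embedding(item)) for item in relevant_items(notes)]

# Retrieve the entry most similar to an outdated code snippet.
query = toy_embedding("result = load_data(path)")
best = max(db, key=lambda entry: cosine(query, entry[1]))
print(best[0])
```

In this sketch the bug-fix item is never stored, because its section heading matches no keyword; only the two deprecation items become entries that can later be retrieved.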
src/upgraider/fix_code_examples.py is the file responsible for updating the code examples of a single library. Run python src/upgraider/fix_lib_examples.py --help to see the required command-line arguments. To run a single example, make sure to specify --examplefile; otherwise, it will run on all the examples available for that library.
Run python src/upgraider/run_experiment.py --outputDir <absolute path of output folder>
This will attempt to run upgraider on all code examples available for all libraries in the libraries folder. The output data and reports will be written to outputDir.
To create a markdown report summarizing the results, use the src/benchmark/parse_results.py script, passing the output directory you wrote results to above. For example: python src/benchmark/parse_reports.py --outputdir output/
The run_experiment workflow allows you to run a full experiment on the available libraries. It produces a markdown report of the results. Note that you need to configure your repository with two repository secrets: OPENAI_API_KEY and OPENAI_ORG.
python -m pytest
Experimental (not currently used any more): To find differences between two versions of an API, you can run python src/apiexploration/run_api_diff.py, which will use the library version info in the libraries folder.
This project is licensed under the terms of the MIT open source license. Please refer to MIT for the full terms.
UpgrAIder is a research prototype and is not officially supported. However, if you have questions or feedback, please file an issue and we will do our best to respond.