snadi / UpgrAIder

Upgrade deprecated/outdated code using LLMs and release notes
MIT License

UpgrAIder

UpgrAIder is a tool for automatically updating outdated code snippets, specifically those that use deprecated library APIs. The underlying technique relies on a Large Language Model (hence the "AI" in the name), augmented with information retrieved from release notes. More details about the project can be found in this presentation.

Note that UpgrAIder represents an early exploration of the above technique, and has been made available in open source as a basis for research and exploration.

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python setup.py develop

Running

Populating the DB

To populate the database with the release-note information available for each library, run

python src/upgraider/populate_doc_db.py

Note that this is a one-time step (unless you add libraries or release notes). The libraries folder contains information for all current target libraries, including the code examples we evaluate on. Each library folder contains a library.json file that specifies two versions: the base version, which is the library version available around the training date of the model (~May 2022), and the current version of the library. The base version determines which release notes to consider (those released after that date), while the current version is the one we target in our experiments.
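To illustrate how the base and current versions bound the release notes of interest, here is a minimal sketch. The key names base_version and current_version, and the helper functions, are hypothetical; check an actual library.json file for the real field names before relying on this.

```python
import json
import tempfile
from pathlib import Path


def parse_version(text):
    """Turn a dotted version such as '1.24.3' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def load_library_info(library_dir):
    """Read library.json from a library folder (key names are assumed, not confirmed)."""
    with open(Path(library_dir) / "library.json", encoding="utf-8") as handle:
        return json.load(handle)


def relevant_releases(info, release_versions):
    """Keep releases after the base version, up to and including the current one."""
    base = parse_version(info["base_version"])        # hypothetical key
    current = parse_version(info["current_version"])  # hypothetical key
    return [v for v in release_versions if base < parse_version(v) <= current]


# Demo with a throwaway library folder and made-up version numbers.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "library.json").write_text(
        json.dumps({"base_version": "1.4.2", "current_version": "2.0.1"}),
        encoding="utf-8",
    )
    info = load_library_info(tmp)

print(relevant_releases(info, ["1.4.0", "1.4.2", "1.5.0", "2.0.0", "2.0.1", "2.1.0"]))
# prints ['1.5.0', '2.0.0', '2.0.1']
```

The sketch only handles purely numeric dotted versions; pre-release suffixes such as rc1 would need a proper version parser.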

Right now, each library folder already contains the release notes between the base and current library versions. These were manually retrieved; in the future, it would be useful to create a script that automatically retrieves release notes for a given library.

The populate_doc_db.py script looks for sections with certain keywords related to APIs and/or deprecation. For each item in those sections, it creates a DB entry containing an embedding of the item's content.
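The keyword-based selection described above can be sketched roughly as follows. This is not the code in populate_doc_db.py: the keyword list, the assumption that release notes use markdown-style headings and bullets, and the function names are all illustrative. The embedding and database steps are left out; each returned item is what would subsequently be embedded and stored.

```python
import re

# Hypothetical keyword list; the keywords used by the real script may differ.
SECTION_KEYWORDS = ("deprecat", "api", "removal", "breaking")


def split_sections(release_notes):
    """Split markdown-style release notes into (heading, body_lines) pairs."""
    sections = []
    heading, body = None, []
    for line in release_notes.splitlines():
        match = re.match(r"^#+\s+(.*)", line)
        if match:
            if heading is not None:
                sections.append((heading, body))
            heading, body = match.group(1).strip(), []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        sections.append((heading, body))
    return sections


def extract_items(release_notes, keywords=SECTION_KEYWORDS):
    """Return bullet items found under headings that mention any keyword."""
    items = []
    for heading, body in split_sections(release_notes):
        if not any(keyword in heading.lower() for keyword in keywords):
            continue
        for line in body:
            stripped = line.strip()
            if stripped.startswith(("- ", "* ")):
                items.append(stripped[2:].strip())
    return items


notes = """# Release 2.0
## Deprecations
- `foo()` is deprecated; use `bar()` instead.
- The `baz` argument will be removed in 3.0.
## Bug fixes
- Fixed a crash on empty input.
"""

for item in extract_items(notes):
    print(item)
```

Only the two items under "Deprecations" are printed here; the "Bug fixes" section is skipped because its heading matches none of the keywords.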

Updating a single code example

src/upgraider/fix_lib_examples.py is the file responsible for this. Run python src/upgraider/fix_lib_examples.py --help to see the required command-line arguments. To run a single example, make sure to specify --examplefile; otherwise, it will run on all the examples available for that library.

Running a full experiment

Run python src/upgraider/run_experiment.py --outputDir <absolute path of output folder>. This will attempt to run UpgrAIder on all code examples available for all libraries in the libraries folder. The output data and reports will be written to outputDir.
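Because --outputDir expects an absolute path, a small wrapper can resolve a relative folder name before launching the experiment. The helper below is a hypothetical convenience, not part of the repository; it only builds the command and assumes it will be run from the repository root.

```python
import sys
from pathlib import Path


def build_experiment_command(output_dir):
    """Build the run_experiment.py invocation with an absolute --outputDir."""
    absolute_dir = Path(output_dir).expanduser().resolve()
    return [
        sys.executable,
        "src/upgraider/run_experiment.py",
        "--outputDir",
        str(absolute_dir),
    ]


command = build_experiment_command("output")
print(" ".join(command))
# To actually launch it from the repository root:
#   import subprocess
#   subprocess.run(command, check=True)
```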

To create a markdown report summarizing the results, run the src/benchmark/parse_reports.py script, passing the output directory you wrote results to above. For example: python src/benchmark/parse_reports.py --outputdir output/

Using GitHub Actions to run experiments

The run_experiment workflow allows you to run a full experiment on the available libraries. It produces a markdown report of the results. Note that you need to configure your repository with two repository secrets: OPENAI_API_KEY and OPENAI_ORG.
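When reproducing the workflow locally, it can help to verify that both values are set before starting a long run. The check below is a minimal sketch that assumes the secrets are exposed as environment variables with the same names; the helper itself is hypothetical and not part of the repository.

```python
import os

REQUIRED_SECRETS = ("OPENAI_API_KEY", "OPENAI_ORG")


def missing_secrets(env=None):
    """Return the names of required secrets that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


# Demo with an explicit mapping instead of the real environment.
print(missing_secrets({"OPENAI_API_KEY": "sk-example"}))
# prints ['OPENAI_ORG']
```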

Running Tests

python -m pytest

Extra Functionality

Experimental and no longer in use: to find differences between two versions of an API, you can run

python src/apiexploration/run_api_diff.py

which will use the library version info in the libraries folders.

License

This project is licensed under the terms of the MIT open source license. Please refer to MIT for the full terms.

Maintainers

Support

UpgrAIder is a research prototype and is not officially supported. However, if you have questions or feedback, please file an issue and we will do our best to respond.