APIs have become a pervasive component of software, yet a core challenge for developers is identifying and using existing ones. Doing so requires either a deep understanding of the API landscape or access to high-quality documentation and usage examples. The former is infeasible, and the latter is often limited in practice.
CodeScholar (📝 Paper: Preprint) is a tool that generates idiomatic code examples for
query APIs (single and multiple). It finds idiomatic examples for APIs by searching a large
corpus of code and growing program graphs idiomatically, guided by a neural model.
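To make the idea of growing a program graph from a seed API more concrete, here is a toy sketch in Python. It is not CodeScholar's implementation: the graph, the corpus, and the names `grow_from_seed`, `TOY_GRAPH`, `TOY_CORPUS`, and `corpus_frequency` are all hypothetical, and a simple corpus-frequency count stands in for the neural model that guides the real search.

```python
from typing import Callable, Dict, FrozenSet, List, Set

# Toy undirected graph over program elements (hypothetical, for illustration only).
Graph = Dict[str, Set[str]]


def grow_from_seed(
    graph: Graph,
    seed: str,
    score: Callable[[FrozenSet[str]], float],
    max_size: int,
    beam_width: int,
) -> List[FrozenSet[str]]:
    """Grow connected node sets from `seed`, one node per step, keeping the
    `beam_width` highest-scoring candidates after each step."""
    beams: List[FrozenSet[str]] = [frozenset([seed])]
    for _ in range(max_size - 1):
        candidates: Set[FrozenSet[str]] = set()
        for sub in beams:
            for node in sub:
                for neighbor in graph.get(node, set()):
                    if neighbor not in sub:
                        candidates.add(sub | {neighbor})
        if not candidates:
            break  # nothing left to grow into
        # Highest score first; ties broken alphabetically for determinism.
        beams = sorted(candidates, key=lambda s: (-score(s), sorted(s)))[:beam_width]
    return beams


TOY_GRAPH: Graph = {
    "json.load": {"open", "print"},
    "open": {"json.load", "with"},
    "with": {"open"},
    "print": {"json.load"},
}

# Each "program" is reduced to the set of elements it uses.
TOY_CORPUS = [
    {"with", "open", "json.load"},
    {"with", "open", "json.load", "print"},
    {"open", "json.load"},
]


def corpus_frequency(sub: FrozenSet[str]) -> float:
    """Placeholder score: how many toy programs contain every element of `sub`."""
    return sum(1 for program in TOY_CORPUS if sub <= program)


if __name__ == "__main__":
    for idiom in grow_from_seed(TOY_GRAPH, "json.load", corpus_frequency, 3, 2):
        print(sorted(idiom), corpus_frequency(idiom))
```

In this toy setting the most frequent three-element growth from `json.load` is the `with` / `open` / `json.load` combination, which matches the intuition that an idiomatic usage is one that recurs across many programs.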
python search.py --dataset <dataset_name> --seed json.load
# clone the repository
git clone git@github.com:tart-proj/codescholar.git
# cd into the codescholar directory
cd codescholar
# install basic requirements
pip install -r requirements-dev.txt
# install pytorch-geometric requirements: run only one of the two commands below
pip install -r requirements-pyg.txt    # GPU
pip install -r requirements-torch.txt  # CPU
# install codescholar
pip install -e .
Starting services
./services.sh start
Indexing
./services.sh index <dataset_name>
Searching
# run the codescholar query (say np.mean) using /search/search.py
python search.py --dataset <dataset_name> --seed np.mean
The search command also accepts the following optional arguments:
--min_idiom_size <int> # minimum size of idioms to be saved
--max_idiom_size <int> # maximum size of idioms to be saved
--max_init_beams <int> # maximum beams to initialize search
--stop_at_equilibrium # stop search when diversity = reusability of idioms
Note: more configuration options are listed in /search/search_config.py
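When running many queries, it can help to assemble the search command programmatically. The sketch below is a minimal example that uses only the flags listed above; the helper name `build_search_command` and the dataset name `my_dataset` are hypothetical, and the command is assumed to be run from the directory that contains search.py.

```python
import shlex
from typing import List, Optional


def build_search_command(
    dataset: str,
    seed: str,
    min_idiom_size: Optional[int] = None,
    max_idiom_size: Optional[int] = None,
    max_init_beams: Optional[int] = None,
    stop_at_equilibrium: bool = False,
) -> List[str]:
    """Return the argument list for one search.py run (hypothetical helper)."""
    cmd = ["python", "search.py", "--dataset", dataset, "--seed", seed]
    if min_idiom_size is not None:
        cmd += ["--min_idiom_size", str(min_idiom_size)]
    if max_idiom_size is not None:
        cmd += ["--max_idiom_size", str(max_idiom_size)]
    if max_init_beams is not None:
        cmd += ["--max_init_beams", str(max_init_beams)]
    if stop_at_equilibrium:
        cmd.append("--stop_at_equilibrium")  # boolean flag, takes no value
    return cmd


if __name__ == "__main__":
    cmd = build_search_command(
        "my_dataset", "np.mean", max_init_beams=50, stop_at_equilibrium=True
    )
    # Print the command for inspection; to execute it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the search directory.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if a dataset name or seed ever contains unusual characters.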
Setup services
./services.sh start
./services.sh index <dataset_name>
Start server and application
cd codescholar/apps
./app.sh start
View details about the app using: ./app.sh show
Refer to the training README for a detailed description of how to train CodeScholar.
Refer to the evaluation README for a detailed description of how to reproduce the evaluation results reported in the paper.