This repository contains the data and inference code of the NeurIPS 2023 (Datasets and Benchmarks track) paper "CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion."
First, extract the data and set up the environment:
tar -xvJf data/crosscodeeval_data.tar.xz -C data/
pip install -r requirements.txt
bash scripts/build_treesitter.sh
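Each task is distributed as a JSONL file (e.g., ./data/python/line_completion_rg1_unixcoder_cosine_sim.jsonl after extraction), one example per line. The field names below (prompt, groundtruth, crossfile_context) are illustrative assumptions; inspect one line of the real file before relying on them:

```python
import json
import tempfile

# Minimal, self-contained sketch: write one toy example in the JSONL
# shape we assume, then read it back the way an inference script might.
example = {
    "prompt": "def add(a, b):\n    return ",  # in-file prefix (assumed field name)
    "groundtruth": "a + b",                   # reference completion (assumed)
    "crossfile_context": {"text": "# from utils.py\ndef add(a, b): ..."},  # assumed
}

with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write(json.dumps(example) + "\n")
    path = f.name

with open(path) as f:
    samples = [json.loads(line) for line in f if line.strip()]

print(len(samples))                # number of examples loaded
print(samples[0]["groundtruth"])   # reference completion for the first example
```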
Our evaluation consists of two steps: generation and metrics calculation.
For open-source models such as StarCoder and DeepSeek-Coder, we recommend using vLLM for fast, distributed inference on CrossCodeEval.
export gpus=2
export model=bigcode/starcoder2-3b
export language=python
export task=line_completion_rg1_unixcoder_cosine_sim
export output_dir=./tmp/crosscodeeval_testrun/
python scripts/vllm_inference.py \
--tp $gpus \
--task $task \
--language $language \
--model $model \
--output_dir $output_dir \
--use_crossfile_context
For additional arguments, e.g., cross-file context length and sampling top_p, please see python scripts/vllm_inference.py --help.
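Conceptually, --use_crossfile_context prepends the retrieved cross-file snippets to the in-file prefix before generation. A minimal sketch of that assembly, assuming the snippets are rendered as comments (the exact template in the repository may differ):

```python
def build_prompt(crossfile_snippets, infile_prefix, comment_prefix="# "):
    """Prepend retrieved cross-file context, rendered as comment lines,
    to the in-file prefix that the model must complete."""
    context_lines = []
    for snippet in crossfile_snippets:
        for line in snippet.splitlines():
            context_lines.append(comment_prefix + line)
    return "\n".join(context_lines) + "\n" + infile_prefix

prompt = build_prompt(
    ["def helper(x):\n    return x * 2"],  # retrieved from another file
    "result = helper(",                     # in-file prefix to complete
)
print(prompt)
```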
OpenAI models are accessible through an API. You may use the following script:
export model=gpt-3.5-turbo-0125
export language=python
export task=line_completion_rg1_unixcoder_cosine_sim
export output_dir=./tmp/crosscodeeval_openai_testrun/
python scripts/openai_inference.py \
--task $task \
--language $language \
--model $model \
--output_dir $output_dir \
--use_crossfile_context
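The inference scripts cap how much cross-file context goes into the prompt. A rough sketch of that truncation, using whitespace tokens as a simplifying stand-in for the model's actual tokenizer:

```python
def truncate_context(context: str, max_tokens: int) -> str:
    """Keep at most max_tokens whitespace-delimited tokens of context.
    The real scripts would count tokens with the model's own tokenizer;
    whitespace splitting here is only an approximation."""
    tokens = context.split()
    return " ".join(tokens[:max_tokens])

print(truncate_context("import os import sys from utils import helper", 4))
```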
After obtaining the generations, we can calculate the final metrics:
export language=python
export ts_lib=./build/${language}-lang-parser.so
export task=line_completion_oracle_unixcoder_cosine_sim
export prompt_file=./data/${language}/${task}.jsonl
export output_dir=./tmp/crosscodeeval_testrun/
python scripts/eval.py \
--prompt_file $prompt_file \
--output_dir $output_dir \
--ts_lib $ts_lib \
--language $language \
--only_compute_metric
@inproceedings{ding2023crosscodeeval,
title={CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion},
author={Yangruibo Ding and Zijian Wang and Wasi Uddin Ahmad and Hantian Ding and Ming Tan and Nihal Jain and Murali Krishna Ramanathan and Ramesh Nallapati and Parminder Bhatia and Dan Roth and Bing Xiang},
year={2023},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
url={https://arxiv.org/pdf/2310.11248.pdf}
}
Please feel free to email us (email addresses are in the paper), or submit an issue in this repo.
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.