bigcode-project / bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Multilingual evaluation benchmarks #1
Closed. loubnabnl closed this 2 years ago.

loubnabnl commented 2 years ago:
Added:
Code generation (few-shot generation with BLEU evaluation; see the scoring sketch after this list): Concode (Java), Spider (SQL), CoNaLa (Python)
Code summarization (few-shot generation with BLEU evaluation): code-to-text benchmark from CodeXGLUE (Java, JavaScript, PHP, Python, Go and Ruby)
Classification tasks (fine-tuning): complexity prediction (Java), clone detection (Java), defect detection (C)
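For the generation and summarization tasks above, the reported metric is corpus-level BLEU over the model's few-shot generations. The snippet below is a minimal sketch of that scoring step only, using `sacrebleu` as an assumed stand-in for the harness's own metric code; the example strings are hypothetical, and dataset loading and few-shot prompting are omitted.

```python
# Minimal sketch (not the harness's actual implementation): scoring model
# generations against reference solutions with corpus-level BLEU.
# Assumes the `sacrebleu` package is installed.
import sacrebleu

# Hypothetical (generation, reference) pairs standing in for few-shot outputs
# on a task such as CoNaLa (code generation) or code-to-text (summarization).
generations = [
    "def add(a, b):\n    return a + b",
    "Returns the sum of two integers.",
]
references = [
    "def add(x, y):\n    return x + y",
    "Return the sum of the two input integers.",
]

# sacrebleu expects a list of hypothesis strings and a list of reference
# streams (a single stream here, parallel to the hypotheses).
bleu = sacrebleu.corpus_bleu(generations, [references])
print(f"BLEU: {bleu.score:.2f}")
```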