bigcode-project / bigcodebench

BigCodeBench: Benchmarking Code Generation Towards AGI
https://bigcode-bench.github.io/
Apache License 2.0

Problem Categories #31

Open normster opened 1 month ago

normster commented 1 month ago

Hi,

In the report some GPT-4 annotated categories of problems were shown. Would it be possible to share these categories?

Thanks!

terryyz commented 1 month ago

Hi @normster,

Sorry for forgetting to upload the categories. You should be able to see them here. Note that the problem categories are based on the manually labeled library domain categories.

You may also notice that lib2domain.json contains more libraries than the ones covered by BigCodeBench. That is because the list is based on the libraries used in BigCodeBench, ODEX, and DS-1000.
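As a rough illustration of how problem categories could be derived from library domain labels, here is a minimal sketch. It assumes lib2domain.json is a flat JSON object mapping a library name to a single domain string; the actual file layout, the domain names, and the task IDs below are all hypothetical and are not taken from the repository.

```python
import json
from collections import Counter

# Hypothetical excerpt. The real lib2domain.json may use different
# keys, domain names, or nesting.
LIB2DOMAIN_JSON = """
{
  "pandas": "Computation",
  "numpy": "Computation",
  "matplotlib": "Visualization",
  "requests": "Network",
  "os": "System"
}
"""


def task_domains(libs, lib2domain):
    """Return the sorted, de-duplicated domains for a task's libraries.

    Libraries missing from the mapping are skipped rather than raising.
    """
    return sorted({lib2domain[lib] for lib in libs if lib in lib2domain})


lib2domain = json.loads(LIB2DOMAIN_JSON)

# Hypothetical tasks, each listed with the libraries it uses.
tasks = {
    "task_a": ["pandas", "matplotlib"],
    "task_b": ["requests", "os"],
    "task_c": ["numpy", "unknown_lib"],
}

# A task can fall into several domains if it uses several libraries.
domains_per_task = {
    task_id: task_domains(libs, lib2domain) for task_id, libs in tasks.items()
}

# Count how many tasks touch each domain.
domain_counts = Counter(
    domain for domains in domains_per_task.values() for domain in domains
)

for task_id, domains in domains_per_task.items():
    print(task_id, domains)
print(domain_counts.most_common())
```

Because the mapping covers libraries from several benchmarks, looking up only the libraries a given task uses keeps the extra entries from affecting the result.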

Cheers

terryyz commented 1 month ago

Ah, I realised that you may be looking for the categorisations in the preference selection phase. I'll share them with you shortly.

terryyz commented 1 month ago

Hi @normster, please check the zip file under https://github.com/bigcode-project/bigcodebench-annotation/blob/main/data_collection/r0.zip, which should cover all the scripts and raw data.

terryyz commented 1 month ago

FYI, I plan to classify the BigCodeBench tasks into several application categories and release the breakdown analysis shortly.