bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0

[WIP] adding ShaderEval tasks #97

Closed Vipitis closed 10 months ago

Vipitis commented 1 year ago

Hey, first PR for me here:

This adds ShaderEval task1, which essentially just implements the task as it is in the EvaluationSuite: return completion. The task is meant only as a "proof of concept", as there are several issues with it. I plan to introduce more tasks to this benchmark soon and to make them better overall. I also have several questions.

some differences that should not impact the results:

concerns I hope to address:

Vipitis commented 1 year ago

Converted to draft, as development of the next tasks has started on this branch. I will try to add the other tasks when they are ready. I don't plan to change task1 but might improve its implementation.

Vipitis commented 10 months ago

Closed as part of a general cleanup. I will open a new draft PR in a few weeks that will hopefully provide a better implementation as well as documentation. The current plan is to have the data and implementation finished by the end of this year.