seketeam / EvoCodeBench

An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories
Apache License 2.0

Environment setup issues #4

Open DrozdikGleb opened 3 months ago

DrozdikGleb commented 3 months ago

Hello, thank you for the open dataset and the very useful paper.

I've encountered some issues with setting up the environment. In some projects, installing the dependencies from requirements.txt is not enough, so some samples fail their tests due to an incorrectly configured environment. For example, nlm-ingestor requires downloading dictionaries from nltk; camp_zipnerf breaks because newer versions of scipy.linalg no longer provide the tril function; and ollama is missing the pytest_httpserver and pillow packages. After fixing some of these issues, pass@1 for gpt-4 increased from 20.73% (as reported in your paper) to 26.54%.
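For anyone hitting the same failures, here is a minimal fix-up sketch of what I ran after `pip install -r requirements.txt`. The SciPy version pin and the specific NLTK corpora are my own guesses from debugging, not an official setup, so adjust per repository:

```python
# Sketch of per-project environment fixes; package names and the
# SciPy pin are assumptions from my debugging, not the benchmark's setup.
import subprocess
import sys

def pip_install(*packages):
    # Install into the interpreter currently running this script
    subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])

# camp_zipnerf: scipy.linalg.tril was removed in recent SciPy releases,
# so pin an earlier version that still provides it
pip_install("scipy<1.13")

# ollama: test-time dependencies absent from requirements.txt
pip_install("pytest_httpserver", "pillow")

# nlm-ingestor: NLTK data is fetched at runtime, not via pip;
# the exact corpora needed may differ
import nltk
nltk.download("punkt")
nltk.download("stopwords")
```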

I also wanted to ask why EvoCodeBench does not consider app-specific imports, siblings, and similar-name files, as was done in DevEval. It seems that at least app-specific imports should be useful.

LJ2lijia commented 3 months ago

Thank you for your suggestion. We are building a Docker image that contains a complete runtime environment, and you are welcome to share your configured environment! Besides, due to a limited computing budget, we leave the introduction of cross-file contexts (e.g., imports, siblings, and similar-name files) to future work.