Status: Open. DrozdikGleb opened this issue 3 months ago.
Thank you for your suggestion. We are preparing a Docker image that contains a complete running environment, and you are welcome to share your configured environment as well. Besides, due to limited computing budgets, we leave the introduction of cross-file contexts (e.g., imports, siblings, and similarly named files) to future work.
Hello, and thank you for the open dataset and the very useful paper.
I've encountered some issues when setting up the environment. In some projects, installing the dependencies from the requirements file is not enough, so some samples fail their tests because the environment is set up incorrectly. For example, nlm-ingestor requires downloading dictionaries from NLTK; in camp_zipnerf, the new version of scipy.linalg lacks the tril function; and ollama is missing pytest_httpserver and pillow. After fixing some of these issues, pass@1 for GPT-4 increased from 20.73% (as reported in your paper) to 26.54%.
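For reference, the per-project fixes above can be sketched as follows. This is only a hedged sketch: the exact NLTK corpus names and the SciPy version pin are assumptions on my side, not taken from the projects, so adjust them as needed.

```python
# Per-project shell commands to run after installing requirements.
# Corpus names and version pins are assumptions; adjust per project.
SETUP_COMMANDS = {
    "nlm-ingestor": [
        "pip install nltk",
        # NLTK data files must be downloaded separately from the package;
        # the specific corpora needed here are an assumption.
        "python -m nltk.downloader punkt stopwords",
    ],
    "camp_zipnerf": [
        # scipy.linalg.tril was removed in recent SciPy releases,
        # so pin an older version that still provides it.
        "pip install 'scipy<1.13'",
    ],
    "ollama": [
        # Test dependencies missing from the requirements file.
        "pip install pytest_httpserver pillow",
    ],
}

# Alternatively, code that still imports scipy.linalg.tril can use a
# small shim, since numpy.tril has the same semantics:
import numpy as np

try:
    from scipy.linalg import tril  # present only in older SciPy
except ImportError:
    from numpy import tril

# tril zeroes out all entries above the main diagonal.
lower = tril(np.array([[1, 2], [3, 4]]))  # -> [[1, 0], [3, 4]]
```

The shim avoids pinning SciPy at all, which may be preferable when other dependencies require a newer release.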
I also wanted to ask why EvoCodeBench does not consider app-specific imports, sibling files, and similarly named files, as was done in DevEval. It seems that at least app-specific imports should be useful.