giganticode / run_bug_run

The RunBugRun dataset of executable bugs

It seems that the unit tests do not match the code #2

Open old-kai opened 1 year ago

old-kai commented 1 year ago

For example, take this entry from the Python test split, in python_test0.jsonl:

{"id":6584,"buggy_submission_id":3590,"fixed_submission_id":3591,"problem_id":"p00000","user_id":"u009980501","buggy_code":"for i in range(1, 10):\n for j in range(1, 10):\n print(str(i) +\"\" +str(j) +\"=\" + str(ij))","fixed_code":"for i in range(1, 10):\n for j in range(1, 10):\n print(str(i) +\"x\" +str(j) +\"=\" + str(i*j))","labels":["literal.string.change","call.arguments.change","expression.operation.binary.change","io.output.change"],"change_count":1,"line_hunks":1}

The corresponding unit test in the tests_all.jsonl file is:

{"id":6584,"problem_id":"p02860","input":"1\nz","output":"No"}

But this unit test obviously does not fit the Python code example. Am I misunderstanding the dataset?

furunkel commented 1 year ago

Hi, to match bugs with test cases, use the problem_id field rather than id (as you can see, the problem_id values in your example don't match). Please note that I'm currently revising the whole dataset, so the split is likely to change in the future.