assert-bot opened this issue 6 years ago
Out of curiosity, how many PRs did Repairnator submit that got rejected, if any? (@miraculixx on Medium)
Three PRs were unanswered, two were rejected.
I don't want to offend you, but would it be more appropriate to call it NPEnator? And most of the time an NPE is just a sign of a problem; you cannot just cover up the NPE while the problem is still lurking in the software. But still, nice try :wink:
The fix in number 8 is broken, and has the potential to cause a new NPE further down.
I'm curious as to why the bot thought that returning null would be a good idea, when it should have returned the empty string?
I just tried to answer your remarks below, but for more rationale about the strategy on NPEs I invite you to read the paper on NPEFix, which is the tool we use in Repairnator: Dynamic Patch Generation for Null Pointer Exceptions Using Metaprogramming
I don't want to offend you, but would it be more appropriate to call it NPEnator?
No offense taken :) The goal of Repairnator is to run several repair tools to try different strategies. The fact is, it's easier to fix an NPE than other kinds of defects, which is why most of our accepted PRs are NPE fixes.
And most of the time an NPE is just a sign of a problem; you cannot just cover up the NPE while the problem is still lurking in the software.
Actually we can. As you said, there's no generic rule for NPEs: sometimes it's indeed a sign that something wrong happened before, sometimes the null value at that place is expected. Our goal here is only to try an automated approach to fix the NPEs that occur in the unit tests of the software. So we synthesize a fix that makes the whole unit test suite pass and then we propose it to the developers: their job is then to review our PR and to assess that the fix is indeed good.
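To make this concrete, here is a minimal, purely illustrative sketch of the kind of test-suite-adequate guard a tool like NPEFix can synthesize for a failing dereference. The class and method names are invented for illustration and are not taken from any of the actual PRs:

```java
// Hypothetical example of a test-suite-adequate NPE patch.
// The User/DisplayService names are made up for illustration.
class User {
    private final String name;
    User(String name) { this.name = name; }
    String getName() { return name; }
}

class DisplayService {
    // Original code: throws a NullPointerException when user (or its name)
    // is null, which makes one of the project's unit tests fail.
    String displayNameOriginal(User user) {
        return user.getName().trim();
    }

    // One candidate patch: guard the dereference and return an alternative
    // value chosen so that the whole unit test suite passes again.
    String displayNamePatched(User user) {
        if (user == null || user.getName() == null) {
            return "";
        }
        return user.getName().trim();
    }
}
```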
The fix in number 8 is broken, and has the potential to cause a new NPE further down.
As I just said, for the bot the software is considered repaired as long as the whole unit test suite passes with the fix. It's the job of the software maintainer to assess that the fix is indeed good. And in the case of number 8, the maintainers accepted it.
I'm curious as to why the bot thought that returning null would be a good idea, when it should have returned the empty string?
Actually it might have proposed another patch that returns an empty string: for each bug we can get a bunch of patches. We generally submitted the first one to avoid spamming the maintainers with PRs.
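For illustration, here is a hypothetical sketch of how two candidate patches for the same NPE can differ; the names are made up and not taken from PR number 8. Returning null keeps the failing test green but may only move the NPE to a caller, while returning a default value avoids that:

```java
// Hypothetical sketch of two alternative candidate patches for the same NPE.
// The PathFormatter name and methods are invented for illustration.
class PathFormatter {
    // Candidate A: return null when the input is null. The failing test
    // passes, but a caller that dereferences the result may hit a new NPE.
    String formatReturningNull(String path) {
        if (path == null) {
            return null;
        }
        return path.trim();
    }

    // Candidate B: return a neutral default value instead. Both candidates
    // make the test suite green; the bot generally submitted the first
    // one generated rather than ranking them.
    String formatReturningEmpty(String path) {
        if (path == null) {
            return "";
        }
        return path.trim();
    }
}
```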
I think what you're doing is interesting, but you have a long way to go. (You probably realise this yourselves.)
As for having multiple patches and selecting the best one: if the bot itself cannot properly determine which fix is best, offering several candidates could actually be more helpful. Getting a mail saying "I'm a bot and I propose one of these solutions to your current NPE in testing" might actually be useful.
Then again, your goal was not to provide free technical support to GitHub projects, but to prove that your AI can write "human-level" quality fixes. However, I'm not sure I agree that your bot passed. Let's go back to the number 8 fix. Yes, the maintainer approved it. However, I argue that the maintainer should not have done so, and the only thing you have proven here is that some maintainers are sloppy, inexperienced or inattentive and accept sub-quality patches.
I also call into question that "it's the job of the software maintainer to assess that the fix is indeed good". It is not! Of course, the maintainer has the final say on accepting patches, but the understood convention is that the submitter of a patch has good reason to believe it's a good patch. This means that the work of the maintainer is to confirm that the submitter's assessment of the fix is correct, not to be the sole person to perform that assessment. If that were the generally agreed-upon job of the maintainer, most if not all of your now "accepted" patches would not have been accepted.
So, sorry to rain on your parade, but I do not believe you have proved what you claim to have proved. That being said, it was an interesting (albeit slightly unethical) experiment. And with further improvements to the AI, at some point you will probably be able to produce human-level quality fixes. But this is not it.
I think what you're doing is interesting, but you have a long way to go.
You're right, this is why it's research.
Of course, the maintainer has the final say on accepting patches, but the understood convention is that the submitter of a patch has good reason to believe it's a good patch.
Agreed. Collaborative software development on GitHub is full of unspoken conventions such as this one. How bot-based contributions can fit into this social framework is a full-blown research question. I hope that future work in the Repairnator project will contribute to answering it (we have an ongoing proposal with @xLeitix related to this).
@surli You reminded me of some "automatic software testing" tools I came across years ago. They could only do NPE or boundary testing. As long as the bot does not know what the software is expected to do (i.e., the expected result), it cannot fix meaningful bugs.
All the activity of Luc Esape / Repairnator is public; you can browse its dashboard.
Here is the list of pull requests made by Repairnator:
https://github.com/donnelldebnam/CodeU-Spring-2018-29/pull/58 (June 25 2018, mistake)