Derek-Jones / ESEUR-book

Issue handling for Evidence-based Software Engineering: based on the publicly available data
http://www.knosof.co.uk/ESEUR/

Cloning code smell #12

Open LarsAsplund opened 3 years ago

LarsAsplund commented 3 years ago

I really enjoy seeing open research based on publicly available data! Well done.

I have some comments on the claim that research has shown that code cloning isn't the bad practice it is said to be.

It is true that if a bug is found in one clone, you need to be aware of the other clones or the bug will remain. This is a maintenance burden and one of the reasons cloning is considered a bad practice. In general, any update to a clone raises the problem of being aware of the other instances.
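To make the burden concrete, here is a minimal hypothetical sketch (the handlers and the email check are invented for illustration): the same validation logic is cloned into two places, a fix lands in one of them, and the bug lives on in the other.

```python
# Hypothetical sketch: the same validation logic cloned into two handlers.
# A fix (rejecting an empty local part) was applied to handle_signup, but
# whoever fixed it was unaware of the clone in handle_invite.

def handle_signup(email: str) -> bool:
    local, _, domain = email.partition("@")
    if not local:          # the fix: reject addresses like "@example.com"
        return False
    return "." in domain

def handle_invite(email: str) -> bool:
    local, _, domain = email.partition("@")
    return "." in domain   # unfixed clone: "@example.com" still passes
```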

A code snippet that is frequently cloned suggests that the snippet has a very distinct functionality, and that code readability would be improved by refactoring it into a well-named function/method. Code cloning is also considered a bad smell because it indicates a lack of code structure.
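Sketching that refactoring on the same hypothetical snippet: the duplicated check moves into one well-named function, so a future fix lands everywhere at once.

```python
def is_valid_email(email: str) -> bool:
    """Single home for the duplicated check; one fix covers every caller."""
    local, _, domain = email.partition("@")
    return bool(local) and "." in domain

def handle_signup(email: str) -> bool:
    return is_valid_email(email)

def handle_invite(email: str) -> bool:
    return is_valid_email(email)
```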

From what I understand, the referenced papers show a correlation between cloned code and lower bug frequency. Is that correlation really because of cloning, or is it because of reuse? The more code is reused, the more mature it becomes. This holds true even if code is reused through code sharing. I don't think they've made such a comparison, but I admit I didn't read the papers thoroughly. Correlation, but not necessarily causation.

Derek-Jones commented 3 years ago

Do clones have distinct functionality?

Clone detection works on sequences of tokens, not functionality. There is no requirement that the sequence of tokens be capable of being extracted and placed in a function.
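For illustration only, a naive sketch of token-sequence matching (not the algorithm of any particular tool): hash a sliding window of tokens and report windows that occur more than once. Nothing forces a matched window to line up with anything that could be extracted into a function.

```python
import io
import tokenize
from collections import defaultdict

# Token kinds that carry no content for matching purposes.
SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
        tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER}

def token_clones(source: str, window: int = 8) -> dict:
    """Return token windows that appear more than once in the source."""
    toks = [t.string
            for t in tokenize.generate_tokens(io.StringIO(source).readline)
            if t.type not in SKIP]
    positions = defaultdict(list)
    for i in range(len(toks) - window + 1):
        positions[tuple(toks[i:i + window])].append(i)
    return {seq: at for seq, at in positions.items() if len(at) > 1}
```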

There are probably some sequences that could be placed in a function. I am not aware of any data on the interesting question of how many clones are easily separable into functions.

The term code readability is a meaningless marketing term.

All kinds of badness are attributed to cloning, using arguments that are invariably based on ego and bluster, i.e., no evidence.

The argument against cloning is based on the faults that are experienced. Where is the evidence showing that the alternatives are less prone to fault experiences? Refactoring is claimed as a solution without any evidence that fewer mistakes will be made. It's just that the evidence for the mistakes is not available.

Yes, the data shows a correlation, not a causation.

Showing causation is a difficult route. An easier route might be to show that refactoring really was a viable, cost effective alternative (rather than wave arms and claim it).

Points to data most welcome.

LarsAsplund commented 3 years ago

Do clones have distinct functionality?

No, functionality was a bad word for it. Clones provide value or they wouldn't be reused. That value can be functionality, but also other things, such as data structures.

Clone detection works on sequences of tokens

They do, but the code duplication of concern when discussing code smells is more than that. The syntactically similar code found by these tools often comes from copy-paste, but a clone can also be developed independently by someone not aware that the code has already been written elsewhere. Such clones are often syntactically different despite being functionally identical. Clone detection tools also tend to generate false positives: syntactically similar code that is significantly different when you look at its purpose.
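A hypothetical pair of examples for both failure modes: two functions with identical behaviour but different syntax (a clone that token matching misses), and two functions with nearly identical token sequences but different purposes (a likely false positive).

```python
# Functionally identical, syntactically different: token matching misses this pair.
def total_loop(xs):
    acc = 0
    for x in xs:
        acc += x
    return acc

def total_builtin(xs):
    return sum(xs)

# Nearly identical token sequences, different purposes: a likely false positive.
def mean_height(rows):
    return sum(r["height"] for r in rows) / len(rows)

def mean_weight(rows):
    return sum(r["weight"] for r in rows) / len(rows)
```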

The term code readability is a meaningless marketing term.

How did you come to that conclusion? The research I've seen typically claims that readability has value (e.g. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.175.8720&rep=rep1&type=pdf). Maintaining code means that you need to understand it. Hard-to-read code creates a higher cognitive load, which limits what we can easily understand.

The argument against cloning is based on the faults that are experienced.

I think the referenced papers are too focused on bugs. Code duplication is not just about bugs (https://www.informit.com/articles/article.aspx?p=457502&seqNum=5). For me it's primarily about maintenance/modularity/readability.

Where is the evidence showing that the alternatives are less prone to fault experiences? Refactoring is claimed as a solution without any evidence that fewer mistakes will be made. It's just that the evidence for the mistakes is not available.

I'm not sure that there is such evidence (I haven't looked) and I was hoping that the referenced papers would have made that analysis. Refactoring is not without risks, and sometimes it's recommended not to do it. It would have been nice to see some results in that area.

I acknowledge that code with clones has a lower bug frequency, but what does that really tell us? Let's assume that all newly written code has a bug frequency of f bugs per line of code (a small number, hopefully) and that all reused code is very mature and has no bugs (in reality one would expect a lower bug frequency, but not zero). If I have access to such a mature code snippet and use it repeatedly until half of my code consists of that clone, my bug frequency drops to 0.5f. If I then refactor and put that snippet in a function, the bug frequency goes up again. Not because I have more bugs, but because my code size is reduced.
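The arithmetic of that thought experiment, as a sketch with made-up numbers:

```python
f = 0.01          # assumed bug frequency of newly written code (bugs per line)
new_lines = 1000  # freshly written lines, carrying all the bugs
clone_len = 50    # length of the mature, assumed bug-free snippet
copies = 20       # cloned until half the code base is that snippet

bugs = new_lines * f                          # 10 bugs, all in the new code
cloned_lines = clone_len * copies             # 1000 lines, zero bugs by assumption
print(bugs / (new_lines + cloned_lines))      # 0.005 = 0.5f with the clones

# Refactor: keep one copy of the snippet plus one call line per former clone.
refactored_lines = new_lines + clone_len + copies
print(bugs / refactored_lines)                # ~0.0093, back toward f
```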

Yes, the data shows a correlation, not a causation.

Showing causation is a difficult route.

It is, but correlations without interpretation can be very misleading.

An easier route might be to show that refactoring really was a viable, cost effective alternative (rather than wave arms and claim it).

I'm not sure it would be easier but it would definitely be better. The important thing is to also look at the cost of maintenance, not only the cost of bugs.

Derek-Jones commented 3 years ago

I analyse the data from the readability paper you cite. The authors did not specify what readability was, they simply took it to be whatever the students believed it to be (as indicated by their ratings). The agreement between student ratings improved as students spent more time at university, i.e., their views started to converge to something (but nobody knows what).

I have Diomidis Spinellis's book you cite, which contains interesting opinions. Diomidis has since done some interesting empirical work, but not directly related to this area (at least that I can recall).

I agree the issue is maintainability, not fault reports. But fault report data is available, so that is what researchers count. Maintainability is another marketing term that people use in a way that applies to what they are researching. Maintenance is actually changing the code to adjust to a changed world. If the world did not change there would be no need for maintenance.

LarsAsplund commented 3 years ago

The authors did not specify what readability was,

They did actually specify readability as "a human judgment of how easy a text is to understand". That may not be all that bad, but it would have been better to have a more quantitative measure.

I agree the issue is maintainability, not fault reports. But fault report data is available, so that is what researchers count.

Producing papers for the sake of producing papers. I agree with what you said at the beginning of your book: without sufficient funding we can't really expect much.

If the world did not change there would be no need for maintenance.

With a very connected world, it's possible to deploy updated software continuously. That enables companies to respond quickly to the ever-changing requirements of reality. It used to be that change was considered a failure caused by bad requirements and design phases. Today the ability to manage change is considered a competitive edge. Maintenance becomes important.

LarsAsplund commented 3 years ago

Maybe we're getting off topic. I'm sure there are examples where evidence-based research proves the common opinion wrong. I'm not sure the research showing that cloning is great is one of those examples.

Derek-Jones commented 3 years ago

The definition "a human judgment of how easy a text is to understand" is useless. Everybody has to give the same answer for the same code, or at least vary in predictable ways.

I'm not saying that cloning is great, just reporting the result (there has been a multi-year thread of research papers saying it was bad, with no evidence to back this up).

LarsAsplund commented 3 years ago

The definition "a human judgment of how easy a text is to understand" is useless. Everybody has to give the same answer for the same code, or at least vary in predictable ways.

Even if they had designed their study to come up with more objective numbers for readability, we could not expect consistent numbers. If a code snippet is based on a design pattern, it is easy to read for someone experienced with that pattern, but it may be very tricky for someone who's not. Participants can have experience with different patterns, so code that is readable for one is not for the other, and vice versa. The best you can hope for is to find code properties that make code less readable for most people.

I'm not saying that cloning is great, just reporting the result (there has been a multi-year thread of research papers saying it was bad, with no evidence to back this up).

What your book is saying is that there is no evidence supporting that cloning is a bad practice, and it then makes the argument that the opposite is true by referencing papers that address bug frequency. What I'm saying is that bug frequency isn't the full reason for the do-not-clone practice. In addition, I find that the papers can't support the claims they make. Since your book is about evidence-based software engineering, I would look for another make-the-reader-interested example.

Derek-Jones commented 3 years ago

I know of no evidence that cloning is bad (for some definition of bad). Show me the data and I will be happy to discuss it.

I am just discussing the available data.

LarsAsplund commented 3 years ago

I didn't really object to the claim that there is no evidence to support that cloning is bad practice. I hadn't looked into the research so I didn't know.

I have the refactoring book you're referring to, and it says:

Number one in the stink parade is duplicated code. If you see the same code structure in more than one place, you can be sure that your program will be better if you find a way to unify them.

Not a shred of evidence, but it points to the central question. Is cloning/code duplication worse than sharing/unifying code?

The papers you referenced do not address that question and cannot be used to claim that the opposite is true, i.e. cloning is a good practice.

I understand that clone detection tools, and a comparison between cloned and non-cloned code, make it much easier to do research on large open-source projects. However, if the comparison is irrelevant, there will not be any good conclusions. Also, clone detection tools tend to produce false positives, which adds to the problem.

I did a quick search for papers that avoid the faulty comparison and also look at specific classes of clones that are less likely to be false positives. These papers show that the problems associated with clones are real and significant:

http://swat.polymtl.ca/~foutsekh/docs/Barbour-JSME.pdf addresses the maintenance problem and shows how the presence of unsynchronized clones in real code significantly increases the number of bugs.

https://www.usenix.org/legacy/event/osdi04/tech/full_papers/li_z/li_z.pdf demonstrates a tool that detects a specific class of programmer mistakes associated with cloning. The tool found 49 such mistakes in Linux; 28 were verified to be real bugs, and the remaining 21 turned out not to be bugs only by coincidence.
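The class of mistake is easy to picture; a hypothetical sketch (the function is invented, not taken from the paper): a block is copy-pasted and one identifier rename is missed.

```python
def clamp_point(x, y, lo, hi):
    if x < lo:
        x = lo
    if x > hi:
        x = hi
    # Block below pasted from the block above, renaming x to y...
    if y < lo:
        y = lo
    if y > hi:
        x = hi   # bug: this "x" was never renamed to "y"
    return x, y
```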

I've yet to find a paper investigating the central question. Refactoring also has its problems, and without a proper comparison we cannot know what the best approach is. Most likely it depends on the situation. For example, if you clone code as the starting point for new work that you know will end up significantly different, there is no point in sharing that code just to please your clone detection tool.