Was a correction made as a result of the code review?

smt9020 commented 10 years ago

According to Alberto's paper, there are a total of 413,110 patch sets for the 159,254 code reviews we scraped. This gives us an average of 2.71 patch sets per code review. However, 51% of the reviews have a single patch set, while the remaining 49% has multiple. If a reviewer puts a comment, especially one containing "PTAL" (please take another look) and then another patch set is added, this means that a vulnerability was potentially caught.

To check for this, you will need to compare the time stamps on the patch sets and the time stamps on the comments. If a comments time stamp is in between two patch sets, then ideally it was because a vulnerability was discussed in that comment.

Create a list of the review numbers where this happens so we can see how frequently we're catching something versus just a simple update.

andymeneely commented 10 years ago

@bspates will be secondary on this one

andymeneely commented 10 years ago

Do we have anything else to go on for this than dates? Maybe I revised my patch set but not in respond to the feedback? Some examples to talk about would be nice.

It would be interesting to classify a given code review as a single patch set or multiple, then aggregate over a filepath's history. Maybe "% of Single Patch Set Reviews" would be an indicator.

bspates commented 10 years ago

How likely is it that a PTAL to relates directly to security? Is it more likely to relate to code style, basic coding mistakes, or algorithmic changes?

cketant commented 10 years ago

Also, I'm assuming that by "correction" we are referring to the fixing of the bug that was causing the vulnerability?

Also, I think the best way to determine this would be to draw a correlation instead of a hard conclusion.

andymeneely commented 10 years ago

@bspates Yes, PTAL can apply to a multitude of things beyond security. But, we are analyzing the effect of certain metrics on vulnerabilities in particular. One result might be: "Vulnerable filepaths on average have 60% single-patchset reviews while neutral files on average have 40%."

@cck9672 Yes, these will be correlations.

We don't know when the vulnerabilities were introduced into the system, but we do know that files have been inspected dozens or even hundreds of times and the vulnerabilities were still missed. So what's different about the code review history of vulnerable filepaths as opposed to neutral filepaths?

andymeneely commented 10 years ago

Did we come up with any metrics here? Or are we stuck?

cketant commented 10 years ago

Yeah a metric for this question is pretty difficult to generate and hard establishing some kind of correlation between the two. I would like to follow up in meeting to see if there is something I am not seeing.

cketant commented 10 years ago

While I believe this question to be very important for our research, I am inclined to put this on ice for the next iteration for this research. I have not been able to identify a strong enough metric associated with providing an answer for this question as well as I do not want to expend anymore time on it.

Some thoughts on potential metrics for this from myself include the following:

Looking at the developers who are participating on the Code Review (CR). If the developer is considered "security experienced" and the current vulnerability is related to the one that the developer has worked on then there exist a possibility that the CR is bug fixing.
Looking at the number of participants (not necessarily developers) on the CR.
Looking at the amount of back and forth dialogue, interaction, general back-and-forth on CR.
If we could take an assessment (using the metrics we have identified up until now) of all the neutral filepaths and compare that assessment to the vulnerable filepaths, then we could possibly identify a metric.
Compare the activity surrounding netural filepaths vs vulnerable filepaths.

andymeneely / chromium-history

Was a correction made as a result of the code review? #71