andymeneely / chromium-history

Scripts and data related to Chromium's history

Vulnerability Misses on Code Reviews #247

Open sso7159 opened 8 years ago

sso7159 commented 8 years ago

Joint effort with @kbaumzie

A code review is "vuln_missed" if one or more files in the CR are fixed for a vulnerability in a later CR within the next 6 months.
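A minimal sketch of that definition, for discussion only. The `Review` and `VulnFix` structs and their fields are hypothetical stand-ins for the real tables, and counting one miss per later vulnerability-fixing review (rather than per file) is an assumption.

```ruby
require 'date'
require 'set'

# Hypothetical in-memory records; the real data lives in database tables.
Review  = Struct.new(:issue, :created, :files)
VulnFix = Struct.new(:issue, :created, :files)

# Counts later vulnerability-fixing reviews that touch at least one file of
# `review` within six calendar months of its creation date.
# Assumption: one miss per fixing review, not per shared file.
def vuln_misses(review, fixes)
  files    = review.files.to_set
  deadline = review.created >> 6 # Date#>> shifts forward by whole months
  fixes.count do |fix|
    fix.issue != review.issue &&
      fix.created > review.created &&
      fix.created <= deadline &&
      fix.files.any? { |f| files.include?(f) }
  end
end

# Boolean form of the same flag.
def vuln_missed?(review, fixes)
  vuln_misses(review, fixes).positive?
end
```

Keeping the boolean as a thin wrapper over the count means the two values cannot disagree.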

Task 1 - Flag Code Reviews

Task 2 - Run correlations on "vuln_missed" and "vuln_misses" against columns in the Code Reviews table.

Task 3 - Move data from developer_snapshots into code_reviews with the following considerations:

  1. The developer_snapshots table represents a developer in the time period of one year.
  2. With proper integration with the code_reviews table, we can represent how this code review 'fits into' the developer's network.
  3. The code_reviews table will now have data relating to all participants' work in the last year.

Task 4 - Run correlations on "flagged_vulnerable" and "num_flagged_vulnerable" against the new columns in the Code Reviews table. (To be determined)

andymeneely commented 8 years ago

Let's continue the terminology with vuln_misses. The boolean should be called vuln_missed and the numerical one should be vuln_misses.

kbaumzie commented 8 years ago

Added new metrics to the code_review table: vuln_missed and vuln_misses. Modified code_review_analysis.rb to include total churn, which gives the total number of modifications in the review.
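For reference, a rough sketch of how a total churn figure could be computed. The `PatchSetFile` struct and its `num_added` / `num_removed` fields are hypothetical names, not necessarily the ones in code_review_analysis.rb.

```ruby
# Hypothetical per-file record; field names are assumptions for illustration.
PatchSetFile = Struct.new(:filepath, :num_added, :num_removed)

# Churn for one file: lines added plus lines removed.
def file_churn(file)
  file.num_added + file.num_removed
end

# Total churn for a review: the sum of per-file churn over its files.
def total_churn(files)
  files.sum { |f| file_churn(f) }
end
```

An empty file list yields a total churn of 0.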

Progress on Task 1: Joining cvenums with the code_reviews table and on to patch_set in order to analyze patch_set_files.

Progress on Task 2: Created cr_analysis_results.rb to print the correlations of our new metrics, vuln_misses and vuln_missed, against other metrics in the code_reviews table: non_participating_reviews, total_reviews_with_owner, owner_familiarity_gap, and total_sheriff_hours. I also included churn, the new metric that @sso7159 and I made in our last session, which returns the total number of lines added and removed from a file.
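A sketch of what one such correlation could look like in plain Ruby. The thread does not say which coefficient cr_analysis_results.rb uses, so Spearman's rank correlation here is an assumption, and the helper names are hypothetical.

```ruby
# 1-based ranks of the values; tied values share the mean of their positions.
def ranks(values)
  sorted = values.each_with_index.sort_by { |v, _| v }
  result = Array.new(values.size)
  i = 0
  while i < sorted.size
    j = i
    j += 1 while j + 1 < sorted.size && sorted[j + 1][0] == sorted[i][0]
    avg = (i + j) / 2.0 + 1
    (i..j).each { |k| result[sorted[k][1]] = avg }
    i = j + 1
  end
  result
end

# Pearson correlation; returns NaN when either column has no variance.
def pearson(xs, ys)
  n  = xs.size.to_f
  mx = xs.sum / n
  my = ys.sum / n
  cov = xs.zip(ys).sum { |x, y| (x - mx) * (y - my) }
  sx  = Math.sqrt(xs.sum { |x| (x - mx)**2 })
  sy  = Math.sqrt(ys.sum { |y| (y - my)**2 })
  return Float::NAN if sx.zero? || sy.zero?
  cov / (sx * sy)
end

# Spearman's rho is the Pearson correlation of the two rank vectors.
def spearman(xs, ys)
  pearson(ranks(xs), ranks(ys))
end
```

With columns pulled from code_reviews as arrays, `spearman(vuln_misses_column, churn_column)` would give one coefficient per metric pair. For the boolean vuln_missed, mapping true/false to 1/0 first is one option.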

kbaumzie commented 8 years ago

Progress on Task 3: Plan to pull values from the developer snapshots table per developer within a given start and end date. Interesting metrics to include from the developer snapshots table are: closeness, betweenness, sheriff_hours, vuln_misses_1yr, vuln_fixes_owned, perc_vuln_misses, and degree. Sam plans to create new columns in the code reviews table that combine the developer snapshot metrics with each developer's participation in the review (avg_part_closeness/sheriff_hours/betweenness/missed_1yr/etc.). Once this is completed, I plan to run correlations between the new columns in the code reviews table and the flagged metrics we created in Task 1: vuln_misses and vuln_missed.
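One possible shape for a column like avg_part_closeness, sketched under assumptions: the `Snapshot` struct, the metrics hash, and the half-open `[start_date, end_date)` period are hypothetical, and participants with no matching snapshot are simply skipped.

```ruby
require 'date'

# Hypothetical snapshot record: one developer over one period, with a hash of
# metric values such as :closeness or :sheriff_hours.
Snapshot = Struct.new(:dev_id, :start_date, :end_date, :metrics)

# Finds the snapshot whose period contains `date` for the given developer.
# Assumption: periods are half-open, [start_date, end_date).
def snapshot_for(snapshots, dev_id, date)
  snapshots.find do |s|
    s.dev_id == dev_id && s.start_date <= date && date < s.end_date
  end
end

# Averages one metric over the review's participants. Participants without a
# snapshot (or without that metric) are skipped; returns nil if none remain.
def avg_participant_metric(snapshots, participant_ids, review_date, metric)
  values = participant_ids
           .map { |id| snapshot_for(snapshots, id, review_date) }
           .compact
           .map { |s| s.metrics[metric] }
           .compact
  return nil if values.empty?
  values.sum.to_f / values.size
end
```

Returning nil rather than 0 for reviews with no usable snapshots keeps missing data from being read as a real zero in the later correlations.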

kbaumzie commented 8 years ago

[image attachment]