andymeneely / chromium-history

Scripts and data related to Chromium's history

When should be our cutoff dates? #102

Closed andymeneely closed 10 years ago

andymeneely commented 10 years ago

I'd like to do next-release validation of our experiments, which means we need 1-3 cutoff dates for our data. These dates need to (a) make sense in terms of Chromium releases, and (b) have significant security data both before and after them.

So, please research:

andymeneely commented 10 years ago

These links might help: http://www.chromium.org/developers/calendar http://en.wikipedia.org/wiki/Chromium_browser

Looks like it's rapid releases every 2-3 months. We'll need to narrow down significant releases to just a few somehow.

smt9020 commented 10 years ago

Release Dates:

This is obviously not narrowed down enough, but maybe a good starting point. Major releases spread out over the life of the project.

andymeneely commented 10 years ago

Looks like major releases are about once a year in recent years.

Why aren't releases like 12.0 considered major releases? What's the difference? Chrome?

One thing we need to do is to get the vulnerability data fixed up (#104) so we can see how many vulnerabilities were fixed at each of those releases - we want some before and some after.
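Once the data is fixed, counting fixes per release could look roughly like the sketch below. It assigns each fix date to the first release on or after it. The method name `fixes_per_release` and the dates in the usage lines are made up for illustration; they are not real Chromium release dates or project data.

```ruby
require 'date'

# Count how many fixes landed in each release window.
# A fix is assigned to the first release dated on or after the fix;
# fixes newer than the last release are counted under :after_last_release.
# (Hypothetical helper, not part of the chromium-history code base.)
def fixes_per_release(release_dates, fix_dates)
  sorted = release_dates.sort
  counts = Hash.new(0)
  fix_dates.each do |fix|
    release = sorted.find { |r| r >= fix }
    counts[release || :after_last_release] += 1
  end
  counts
end

# Placeholder dates for illustration only.
releases = [Date.new(2010, 3, 1), Date.new(2010, 9, 1)]
fixes = [Date.new(2010, 1, 15), Date.new(2010, 2, 20),
         Date.new(2010, 5, 5), Date.new(2010, 12, 1)]
fixes_per_release(releases, fixes).each { |release, n| puts "#{release}: #{n}" }
```

A candidate cutoff is only useful if the counts on both sides of it are reasonably large.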

andymeneely commented 10 years ago

Ok, so I've written up a query for visualizing when the fixes occurred over time. It builds a histogram of code review created dates. The bin labels are in Unix epoch seconds, so they're hard to read, but as you can see the vuln fixes occur pretty consistently over time. We do have that one gap in the middle.

By the way, this is pre-fix of #104, so this will change once that data is fixed. I'll put it into run:stats so it builds nightly.

# In the rails console
dates = CodeReview.joins(:cvenums).order(:created).pluck(:created)
# Bucket the created times (epoch seconds) into 50 equal-width bins
low, high = dates.first.to_f, dates.last.to_f
agg = Aggregate.new(low, high, (high - low) / 50)
dates.each { |d| agg << d.to_f }
puts agg

And here's the histogram

    value |------------------------------------------------------| count
1220474528 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@                          |    28
1223736677 |@@@                                                   |     3
1226998825 |@@@@@@@@@                                             |     9
1230260973 |@@@@@@@@@@@@                                          |    12
1233523121 |@@@@@@@@@@@@@@@@@@@@@@@@@                             |    25
1236785269 |@@@@@@@@@@@@                                          |    12
1240047417 |@@                                                    |     2
          ~
1246571713 |@@                                                    |     2
1249833861 |@@@@@                                                 |     5
1253096009 |@@@@@@                                                |     6
1256358158 |@@@@                                                  |     4
1259620306 |@@@@@@@@@@                                            |    10
1262882454 |@@@@@@@@                                              |     8
1266144602 |@@@@@@                                                |     6
1269406750 |@@@@@@@@@@@@@@@                                       |    15
1272668898 |@@                                                    |     2
1275931046 |@@@@@@@@                                              |     8
1279193194 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@             |    41
1282455342 |@@@@@@@@@@@@@@@                                       |    15
1285717490 |@@@@@@@@@@@@@                                         |    13
1288979638 |@@@@@@@@@@@@@@@@@@                                    |    18
1292241787 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                    |    34
1295503935 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                           |    27
1298766083 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   |    51
1302028231 |@@@@@@@@@@@@@@@@@@                                    |    18
1305290379 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   |    35
1308552527 |@@@@@@@@@@@@@@@@@@@@@@@                               |    23
1311814675 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                     |    33
1315076823 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@                          |    28
1318338971 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                        |    30
1321601119 |@@@@@@@@@@@@@@                                        |    14
1324863268 |@@@@@@@@@@@@@@@@                                      |    16
1328125416 |@@@@@@@@@@@@@@@@@                                     |    17
1331387564 |@@@@@@@@@@@@@@@@@@@@@@@@                              |    24
1334649712 |@@@@@@@@@@@@@@                                        |    14
1337911860 |@@@@@@@@@@@@@@@                                       |    15
1341174008 |@@@@@@@@@@@@@@@@@                                     |    17
1344436156 |@@@@@@@@@@@@@@@@@                                     |    17
1347698304 |@@@@@@@@@@@@@@@@@@@@@                                 |    21
1350960452 |@@@@@@@@@@@@                                          |    12
1354222600 |@@@@@@@@@@@@@@@                                       |    15
1357484748 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                      |    32
1360746897 |@@@@@@@@@@@@@@@@@@@@@@                                |    22
1364009045 |@@@@@@@@@@@@@@                                        |    14
1367271193 |@@@@@@@@@@@@@@@@@@@@@@@@                              |    24
1370533341 |@@@@@@                                                |     6
1373795489 |@@@@@@@@@                                             |     9
1377057637 |@@@@@@@@@@@@@@@@@                                     |    17
1380319785 |@@@@@@@@                                              |     8
    Total |------------------------------------------------------|   837
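The bin labels in the histogram are seconds since the Unix epoch. Here is a small sketch for turning one back into a calendar date (the helper name `epoch_to_date` is just for illustration):

```ruby
# Convert a histogram bin label (seconds since the Unix epoch)
# into a UTC calendar date string.
def epoch_to_date(seconds)
  Time.at(seconds).utc.strftime('%Y-%m-%d')
end

puts epoch_to_date(1220474528) # first bin  → 2008-09-03
puts epoch_to_date(1380319785) # last bin   → 2013-09-27
```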

andymeneely commented 10 years ago

In our discussion on Friday, we talked about how the monthly builds all seem to be treated as normal releases. I'm not sure how to statistically handle running our analysis about three dozen times without losing statistical significance and while dealing with autocorrelation. I need to think about how to handle this more.
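For the repeated-analysis part of that concern, one standard option (not something we have settled on) is a multiple-comparison correction such as Holm-Bonferroni. The sketch below adjusts a list of p-values, one per release; the method name `holm_adjust` and the example p-values are illustrative assumptions. It does nothing about autocorrelation between adjacent releases.

```ruby
# Holm-Bonferroni step-down adjustment for a list of p-values.
# Returns adjusted p-values in the same order as the input.
# (Illustrative sketch; does not address autocorrelation.)
def holm_adjust(p_values)
  m = p_values.length
  # Pair each p-value with its original position, smallest first.
  ordered = p_values.each_with_index.sort_by { |pv, _| pv }
  adjusted = Array.new(m)
  running_max = 0.0
  ordered.each_with_index do |(pv, original_index), rank|
    # The k-th smallest p-value (rank k-1) is multiplied by (m - k + 1), capped at 1.
    candidate = [(m - rank) * pv, 1.0].min
    # Enforce monotonicity: adjusted values never decrease along the ordering.
    running_max = [running_max, candidate].max
    adjusted[original_index] = running_max
  end
  adjusted
end

# Made-up p-values, one per hypothetical release.
p holm_adjust([0.01, 0.04, 0.03])
```

With about three dozen releases, the smallest p-value would be multiplied by roughly 36, which is exactly the loss of significance described above.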

andymeneely commented 10 years ago

For the workshop paper, I'm thinking we go with one release. We can make it a feature for next year to expand the analysis to multiple releases, but for now we can split the data down the middle. The midpoint of our time range looks to be 28 January 2011, so that will be our cutoff date.
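A minimal sketch of that split, assuming the created times are available as Ruby `Time` objects (for example, from the `pluck(:created)` query earlier in this thread). The helper names and the four sample timestamps are placeholders, not project data.

```ruby
# Midpoint in time between the earliest and latest timestamp.
def time_midpoint(times)
  first, last = times.minmax
  Time.at((first.to_f + last.to_f) / 2.0).utc
end

# How many timestamps fall before the cutoff, and how many on or after it.
def split_counts(times, cutoff)
  before, after = times.partition { |t| t < cutoff }
  { before: before.size, after: after.size }
end

# Placeholder timestamps for illustration only.
times = [Time.utc(2010, 1, 1), Time.utc(2010, 6, 1),
         Time.utc(2011, 3, 1), Time.utc(2012, 1, 1)]
cutoff = time_midpoint(times)
puts cutoff.strftime('%Y-%m-%d')  # → 2011-01-01
puts split_counts(times, cutoff)
```

Note that the midpoint of the time range is not the same as the median fix date, so the two halves will generally hold different numbers of fixes.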