andymeneely / chromium-history

Scripts and data related to Chromium's history

About 4,156 BugCommits are still dangling #196

Closed andymeneely closed 9 years ago

andymeneely commented 9 years ago

Based on the latest build, we've got 8,358 bugs dangling. We know that 4,202 bugs led to a 404 or 403 HTTP error. So now we've got 4,156 bugs not accounted for. Let's get those bugs, or figure out why we can't download them.

andymeneely commented 9 years ago

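Never mind! It looks like every bug is accounted for. The outer join returned 8,358 rows, but many of those were duplicate bug IDs; once I added a DISTINCT, I came to exactly 4,202 bugs, the same as the number of bugs in the error log. So we're good. Here's my Rails console history in case you want to see how I did that.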

irb(main):002:0> require 'csv'
=> true
irb(main):004:0> accounted_for = []
=> []
irb(main):007:0> csv
=> <#CSV io_type:File io_path:"../realdata/bugs/error_log.csv" encoding:UTF-8 lineno:4203 col_sep:"," row_sep:"\n" quote_char:"\"">
irb(main):008:0> csv.first
=> nil
irb(main):009:0> csv = CSV.open('../realdata/bugs/error_log.csv', headers: true)
=> <#CSV io_type:File io_path:"../realdata/bugs/error_log.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"" headers:true>
irb(main):010:0> csv.headers
=> true
irb(main):013:0> csv.each {|row| accounted_for << row[0].to_i}
=> nil
irb(main):014:0> accounted_for
irb(main):015:0> accounted_for.size
=> 4202
irb(main):016:0> accounted_for.uniq.size
=> 4202
irb(main):018:0> query = "SELECT commit_bugs.bug_id FROM bugs RIGHT OUTER JOIN commit_bugs ON bugs.bug_id=commit_bugs.bug_id WHERE bugs.bug_id IS NULL"
=> "SELECT commit_bugs.bug_id FROM bugs RIGHT OUTER JOIN commit_bugs ON bugs.bug_id=commit_bugs.bug_id WHERE bugs.bug_id IS NULL"
irb(main):019:0> ActiveRecord::Base.connection.execute query
=> #<PG::Result:0x000000040f3fa8>
irb(main):020:0> Hirb.enable
=> true
irb(main):021:0> ActiveRecord::Base.connection.execute query
=> #<PG::Result:0x00000004087d58>
irb(main):022:0> rs = ActiveRecord::Base.connection.execute query
=> #<PG::Result:0x00000004092d70>
irb(main):023:0> rs.to_a
irb(main):024:0> outer_join = []; rs.each {|row| outer_join << row['bug_id'] }
=> #<PG::Result:0x00000004092d70>
irb(main):025:0> outer_join
irb(main):026:0> outer_join.size
=> 8358
irb(main):027:0> outer_join.uniq.size
=> 4202
irb(main):028:0> query = "SELECT DISTINCT commit_bugs.bug_id FROM bugs RIGHT OUTER JOIN commit_bugs ON bugs.bug_id=commit_bugs.bug_id WHERE bugs.bug_id IS NULL"
=> "SELECT DISTINCT commit_bugs.bug_id FROM bugs RIGHT OUTER JOIN commit_bugs ON bugs.bug_id=commit_bugs.bug_id WHERE bugs.bug_id IS NULL"
irb(main):029:0> rs = ActiveRecord::Base.connection.execute query
=> #<PG::Result:0x0000000592ff40>
irb(main):030:0> rs.to_a.size
=> 4202
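
The session above compares counts (4,202 distinct dangling IDs against 4,202 logged errors). A set difference would make the same check explicit by confirming that the two lists contain the same IDs, not just the same number of them. The following is a minimal sketch of that idea, not what was actually run: the helper names `error_log_ids` and `reconcile`, the sample CSV header names, and the sample IDs are all made up for illustration. It assumes, as in the session, that the bug ID is the first column of the error log and that the query rows may come back as strings.

```ruby
require 'csv'
require 'set'

# Hypothetical helper: read the first column of an error-log CSV
# (with a header row) as integer bug IDs.
def error_log_ids(csv_text)
  CSV.parse(csv_text, headers: true).map { |row| row[0].to_i }
end

# Hypothetical helper: compare the bug IDs returned by the outer join
# (possibly repeated, possibly strings) against the logged IDs.
def reconcile(dangling_ids, logged_ids)
  dangling = dangling_ids.map(&:to_i).to_set
  logged   = logged_ids.to_set
  {
    # Rows that disappear once duplicates are collapsed (what DISTINCT removes).
    duplicate_rows: dangling_ids.size - dangling.size,
    # Dangling bugs with no entry in the error log.
    unaccounted: (dangling - logged).to_a.sort,
    # Logged errors that are not dangling in the database.
    logged_but_not_dangling: (logged - dangling).to_a.sort
  }
end

# Made-up sample data: three logged errors, five join rows with repeats.
sample_csv    = "bug_id,status\n101,404\n102,403\n103,404\n"
dangling_rows = %w[101 101 102 103 103]

report = reconcile(dangling_rows, error_log_ids(sample_csv))
puts report.inspect
```

With the real data, `dangling_rows` would be the `outer_join` array from the session and the logged IDs would be `accounted_for`; an empty `unaccounted` list would mean every dangling bug has a matching error-log entry.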