Closed aakashsur closed 1 year ago
Done with the following Ruby script:
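require 'csv'

# Accepted cell values: the integers 0 through 100, as strings.
valid = (0..100).map(&:to_s)
# Names of the columns that hold soldier counts.
keys = (1..13).map { |i| "Castle #{i}" }

Dir.glob("castle*.csv").each do |fname|
  table = CSV.read(fname, headers: true)
  # Take headers from the file itself so an all-invalid file does not crash.
  headers = table.headers
  out = []
  table.map(&:to_h).each_with_index do |row, i|
    castles = row.select { |k, _| keys.include?(k) }
    invalid = castles.reject { |_, v| valid.include?(v) }
    total = castles.values.map(&:to_i).sum
    if !invalid.empty? || total != 100
      # Report the dropped row: file, zero-based row index, total, castle values.
      p [fname, i, total, castles]
    else
      out << row
    end
  end
  # Rewrite the file in place, keeping only the valid rows.
  CSV.open(fname, 'w') do |csv|
    csv << headers
    out.each { |o| csv << headers.map { |h| o[h] } }
  end
end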
Looks like there are some invalid rows in the data:
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-3.csv#L238
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-3.csv#L818
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-3.csv#L1030
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-4.csv#L182
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-4.csv#L278 (O instead of 0)
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-4.csv#L498
https://github.com/fivethirtyeight/data/blob/master/riddler-castles/castle-solutions-4.csv#L853
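A note on the "O instead of 0" case: a sum check alone may not notice it, because Ruby's String#to_i returns 0 for a non-numeric string such as "O", so the total can still come out to 100. Here is a minimal sketch of why checking the raw strings matters. The sample row is made up for illustration and is not taken from the data:

```ruby
# Hypothetical row: the value for "Castle 2" is the letter O, not the digit 0.
row = { "Castle 1" => "50", "Castle 2" => "O", "Castle 3" => "50" }

# Accepted cell values: "0" through "100".
valid = (0..100).map(&:to_s)

# String#to_i yields 0 for "O", so the total is still 100 and the sum check passes.
total = row.values.map(&:to_i).sum

# The string check flags the cell, because "O" is not one of the accepted values.
bad_cells = row.reject { |_, v| valid.include?(v) }

puts total
p bad_cells
```

So a row like this only shows up if each cell is compared against the allowed strings, not just summed.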
There are also invalid rows where the number of soldiers does not add up to 100. Here are my counts: 38 invalid rows from the first war, 30 from the second, 142 from the third, and 72 from the fourth.
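For anyone who wants to reproduce counts like these without modifying the files, here is a rough sketch. The helper name `count_invalid_rows` and the sample data are my own assumptions, and this is not necessarily how the numbers above were produced. It treats a row as invalid if any castle cell is not an integer string from 0 to 100, or if the castle cells do not sum to 100:

```ruby
require 'csv'

VALID = (0..100).map(&:to_s)
CASTLE_KEYS = (1..13).map { |i| "Castle #{i}" }

# Hypothetical helper: returns how many data rows fail either check.
def count_invalid_rows(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h).count do |row|
    castles = row.select { |k, _| CASTLE_KEYS.include?(k) }
    bad_value = castles.values.any? { |v| !VALID.include?(v) }
    bad_value || castles.values.map(&:to_i).sum != 100
  end
end

# Made-up sample: one valid row, one with the letter O, one summing to 90.
sample = <<~CSV
  Castle 1,Castle 2,Castle 3
  50,25,25
  50,O,50
  40,40,10
CSV

puts count_invalid_rows(sample)
```

To run it over the real files, pass each file's contents, for example `count_invalid_rows(File.read(fname))` inside a `Dir.glob("castle*.csv")` loop.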