assaf / vanity

Experiment Driven Development for Ruby
http://vanity.labnotes.org
MIT License
1.55k stars, 269 forks

Fix 'base 17' typo, correct id bucketing #308

Closed (bazzargh closed this 7 years ago)

bazzargh commented 7 years ago

`MD5.hexdigest` provides a flat (but noisy) distribution of hex digits, but a typo converts them back to integers with base 17 instead of base 16. This adds a very slight bias to the results, increasing the size of some low-order buckets. For example, using a range instead of MD5 to provide evenly distributed input, `(0..9999999).each_with_object(Hash.new(0)) { |d, h| h[d.to_s(16).to_i(17) % 100 < 5] += 1 }` returns `{true=>500025, false=>9499975}`, whereas parsing with base 16 would put exactly 500000 inputs in the `true` bucket. It's only a slight loading of the dice, and it would be very hard to spot in real experiments because it is lost in the noise of the MD5 distribution. Spotted via code review, not tests.
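To make the effect easier to reproduce, here is a minimal sketch of the two parsing bases side by side. The helper names `bucket_for` and `count_below` are hypothetical and are not part of Vanity's API; the bucket count of 100 and cutoff of 5 simply mirror the one-liner above.

```ruby
require "digest/md5"

# Hypothetical helper (not Vanity's API): map an identity to one of `buckets`
# buckets by parsing its MD5 hex digest as an integer in `parse_base`.
# Base 16 is the correct choice for hex digits; base 17 reproduces the typo.
def bucket_for(identity, buckets: 100, parse_base: 16)
  Digest::MD5.hexdigest(identity.to_s).to_i(parse_base) % buckets
end

# Hypothetical helper: feed evenly spaced integers through the same
# hex -> integer -> modulo pipeline and count how many land below `cutoff`.
def count_below(inputs, parse_base:, buckets: 100, cutoff: 5)
  inputs.count { |d| d.to_s(16).to_i(parse_base) % buckets < cutoff }
end

# With base 16 the hex round trip is lossless, so the split is exact.
puts count_below(0...1600, parse_base: 16) # => 80

# With base 17 the same digits map to a different integer: "10" is 17, not 16.
puts 16.to_s(16).to_i(17) # => 17
```

Running `count_below(0..9_999_999, parse_base: 17)` should reproduce the skewed count quoted above, though it takes a few seconds.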

phillbaker commented 7 years ago

Thanks!