MD5.hexdigest provides a flat (but noisy) distribution of hex digits, but a typo converted them back to integers with base 17 instead of base 16. Every hex digit is also a valid base-17 digit, so nothing fails; the wrong base just adds a very slight bias to the results, increasing the size of some low-order buckets. E.g. this calculation, using a range instead of MD5 to provide the evenly distributed input: (0..9999999).each_with_object(Hash.new(0)) {|d, h| h[d.to_s(16).to_i(17) % 100 < 5] += 1 } => {true=>500025, false=>9499975} (an unbiased split would be exactly 500000 / 9500000). It's only a slight loading of the dice, and would be very hard to spot in real experiments because it's lost in the noise of the MD5 distribution. Spotted via code review, not tests.
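For anyone who wants to reproduce the comparison, here is a minimal sketch that runs the same round trip with both bases side by side. The helper names bucket_for and low_bucket_count are hypothetical (they are not taken from the patched code), and the 100-bucket / bottom-5 split simply mirrors the calculation above.

```ruby
require 'digest'

# Hypothetical helper: map a key to a bucket in 0...100 via its MD5 hex digest.
# `base` is a parameter only so the typo (17) can be compared with the
# intended value (16); real code would hard-code 16.
def bucket_for(key, base: 16)
  Digest::MD5.hexdigest(key.to_s).to_i(base) % 100
end

# Count how many of the integers 0...n land in the bottom 5 of 100 buckets
# after being written as hex and parsed back with the given base.
def low_bucket_count(n, base)
  (0...n).count { |d| d.to_s(16).to_i(base) % 100 < 5 }
end

n = 1_000_000

# Base 16 is a lossless round trip, so this is exactly 5% of n (50000).
puts "base 16: #{low_bucket_count(n, 16)} of #{n}"

# Base 17 parses without error but maps to different integers, so the
# count can drift away from 5%.
puts "base 17: #{low_bucket_count(n, 17)} of #{n}"

# The same hex string gives different integers under the two bases.
puts "ff".to_i(16) # 255
puts "ff".to_i(17) # 270 (15 * 17 + 15)
```

Because String#to_i accepts any base from 2 to 36 and the digits 0-9 and a-f are all legal in base 17, the typo never raises or truncates, which is why a test that only checks "result is in 0...100" would not catch it.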