np.random.choice with replace=False allocates large amounts of memory linear with number of items to choose from. This PR replaces np.random.choice with Robert Floyd's No-Replacement Sampling algorithm which is linear with how many items we are choosing (which is generally a much smaller number than the items we can choose from for our use case).
-- This method is nearly 1000x faster when the population is greater than 10^9
-- The runtime of the simulation for vector_of_counts-4096-ln3-sequential on the smoke_test scenarios is about 3 seconds with this fix on a Google Pixelbook. Without this fix, the runtime is over 3 minutes.
np.random.choice with replace=False allocates large amounts of memory linear with number of items to choose from. This PR replaces np.random.choice with Robert Floyd's No-Replacement Sampling algorithm which is linear with how many items we are choosing (which is generally a much smaller number than the items we can choose from for our use case).
-- This method is nearly 1000x faster when the population is greater than 10^9 -- The runtime of the simulation for vector_of_counts-4096-ln3-sequential on the smoke_test scenarios is about 3 seconds with this fix on a Google Pixelbook. Without this fix, the runtime is over 3 minutes.