If you want to sample k items out of n, Fisher Yates shuffles your item array.
Hereunder is an efficient no shuffle implementation. Something better can probably be achieved with prefix sums and rank/select.
Theorem
We can sample k items out of n with k random draws and O(k log k) time, without modifying the items array.
Proof
TODO fix proof
Given n items, at draw i=1,2,...,n with a binary search tree containing all previously drawn items indices between 0 and n-1, pick a random integer x between 0 and n - i. Binary search for its corresponding index value as follows: start with p = 0 the number of predecessors of the current node, let l be the size of the left subtree of the current node, let v be the value of the current node, if x + p + l + 1 < v
the current node becomes the left child, otherwise the current node becomes the right child and p = p + l + 1. If there is no such child, insert x + p in the case of a left child, x + p + l + 1 in the case of a right child.
If x + p + l + 1 < v, then x cannot be in the right subtree of v. The item index corresponding to x would be x + p + l + 1 if it was the right child of v which is smaller than v. Pick any node c in the right subtree of v. Let g be the number of predecessors of c in the right subtree of v. The value of c is at least v + 1 + g because all indices are distinct. Suppose c has no right child. The item index of x as a right child of c is x + p + l + 1 + g + 1 < v + g + 1 <= c. Suppose c has no left child. The item index of x as a left child of c is x + p + l + 1 + g < v + g < v + g + 1 <= c.
If x + p + l + 1 > v, then x cannot be in the left subtree of v. If v has no left child, then l = 0 and the value of x as the left child of v would be x + p which is larger or equal to v (indices must be distinct). Pick any node c in the left subtree of v.
Let s be the number of successors of s in the left subtree of v. The value of c is at most v - s - 1 because all indices are distinct. Suppose c has no left child. The item index of x as a left child of c is x + p + l - s - 1 > v - s - 2 so x + p + l - s - 1 >= v - s - 1 = c.
If you want to sample k items out of n, Fisher Yates shuffles your item array. Hereunder is an efficient no shuffle implementation. Something better can probably be achieved with prefix sums and rank/select.
Theorem
We can sample k items out of n with k random draws and O(k log k) time, without modifying the items array.
Proof
TODO fix proof
Given n items, at draw i=1,2,...,n with a binary search tree containing all previously drawn items indices between 0 and n-1, pick a random integer x between 0 and n - i. Binary search for its corresponding index value as follows: start with p = 0 the number of predecessors of the current node, let l be the size of the left subtree of the current node, let v be the value of the current node, if x + p + l + 1 < v the current node becomes the left child, otherwise the current node becomes the right child and p = p + l + 1. If there is no such child, insert x + p in the case of a left child, x + p + l + 1 in the case of a right child.
If x + p + l + 1 < v, then x cannot be in the right subtree of v. The item index corresponding to x would be x + p + l + 1 if it was the right child of v which is smaller than v. Pick any node c in the right subtree of v. Let g be the number of predecessors of c in the right subtree of v. The value of c is at least v + 1 + g because all indices are distinct. Suppose c has no right child. The item index of x as a right child of c is x + p + l + 1 + g + 1 < v + g + 1 <= c. Suppose c has no left child. The item index of x as a left child of c is x + p + l + 1 + g < v + g < v + g + 1 <= c.
If x + p + l + 1 > v, then x cannot be in the left subtree of v. If v has no left child, then l = 0 and the value of x as the left child of v would be x + p which is larger or equal to v (indices must be distinct). Pick any node c in the left subtree of v. Let s be the number of successors of s in the left subtree of v. The value of c is at most v - s - 1 because all indices are distinct. Suppose c has no left child. The item index of x as a left child of c is x + p + l - s - 1 > v - s - 2 so x + p + l - s - 1 >= v - s - 1 = c.