Closed HesNobi closed 3 years ago
Hi, @HesNobi @keiohta
This is my quick test for random pickup.
The result shows:
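A minimal sketch of such a timing test, assuming the question is drawing a batch of unique indices from a large replay buffer (the buffer size and batch size below are illustrative, not the original benchmark's exact settings):

```python
import random
import timeit

import numpy as np

BUFFER_SIZE = 1_000_000  # assumed replay buffer size
BATCH_SIZE = 100         # assumed batch size


def sample_np_choice():
    # Legacy free function; replace=False permutes internally and is slow for large N.
    return np.random.choice(BUFFER_SIZE, BATCH_SIZE, replace=False)


def sample_random_sample():
    # Stdlib sampling without replacement over an index range.
    return random.sample(range(BUFFER_SIZE), BATCH_SIZE)


for fn in (sample_np_choice, sample_random_sample):
    t = timeit.timeit(fn, number=10)
    print(f"{fn.__name__}: {t:.4f}s for 10 calls")
```

Both functions return unique indices; on large buffers the stdlib version is typically orders of magnitude faster.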
Hi @ymd-h , thanks for measuring the time for random sampling!
I believe allowing duplication should be okay considering
If you two do not have opposing opinions, then I'll change the corresponding part.
Hi,
I've done some digging, and apparently using `random.sample` is much faster than `np.random.choice`.
Thanks for keeping this project alive.
I am going to close this issue since it has been acknowledged properly.
I've found this via @ymd-h 's blog article. Do you know if there's an official NumPy resource about this? I'd open an issue or PR if not.
I did additional investigation (after PR merged).
Unlike the legacy free function (i.e. `np.random.choice`), the recommended generator object method (i.e. `np.random.Generator.choice`) uses a heuristic algorithm.
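As a sketch of the recommended Generator API (available since NumPy 1.17; the seed and sizes here are arbitrary, chosen only for illustration):

```python
import numpy as np

# default_rng returns a Generator; its choice() selects its sampling
# strategy internally based on the draw size vs. the population size.
rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility

# Draw 100 unique indices from a population of 1e6 without replacement.
batch = rng.choice(1_000_000, size=100, replace=False)
print(batch[:5])
```

This is generally much faster than the legacy `np.random.choice(..., replace=False)` for small draws from large populations.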
In my opinion, we don't need to open a new issue at NumPy repository.
To determine the fastest method for us, we need additional study. (However, I think this study is low priority as long as the current implementation is sufficient.)
If anyone has a problem with the current implementation, please feel free to tell us.
Hi. After implementing pre-recorded expert data with a size of 1e6, I have realized that `np.random.choice` with `replace=False` is extremely slow, to the point of being unusable (batch size 100). I am wondering if it can be replaced with something faster.
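A hedged sketch of a workaround under these conditions, using `random.sample` to draw unique batch indices from a large offline dataset (the array names and shapes below are hypothetical placeholders, not the actual D4RL schema):

```python
import random

import numpy as np

# Hypothetical flat dataset of 1e6 transitions; shapes are illustrative.
N = 1_000_000
observations = np.zeros((N, 17), dtype=np.float32)
actions = np.zeros((N, 6), dtype=np.float32)


def sample_batch(batch_size=100):
    # random.sample draws unique indices quickly even for large N,
    # avoiding the slow np.random.choice(..., replace=False) path.
    idx = random.sample(range(N), batch_size)
    return observations[idx], actions[idx]


obs_b, act_b = sample_batch()
print(obs_b.shape, act_b.shape)
```

The index list from `random.sample` can be used directly for NumPy fancy indexing, so this slots into an existing sampling routine without changing the returned batch format.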
Thanks for the great project.
P.S. I am using the dataset from Berkeley's D4RL project.