Closed: mikygit closed this issue 2 years ago
Actually I was wrong, it is much faster when client and server share the same process.
Does this mean gRPC is the bottleneck? :-(
gRPC will only be a bottleneck when running at extremely high QPS (hundreds of thousands of QPS) and should not be a bottleneck for your setup.
Am I correct in saying that Reverb is reporting that 45k items/s are being sampled? But you are only seeing 1 item/s being sampled? It sounds like there may be a mismatch between what you and Reverb are calling an item (or perhaps you're not using some of the items returned by Reverb?). In Reverb an item is a "sampleable" unit. ML workloads often sample a batch of these (looks like 256 in your case) to take a single (SGD) step.
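To make the unit mismatch concrete, here is the arithmetic with the numbers from this thread (45k items/s reported by the server, batches of 256 on the client side): the same throughput looks very different depending on whether you count items or batches.

```python
# Server-side counter: individual items sampled per second (from server_info).
items_per_sec = 45_000
# The client samples batches of 256 items to take one SGD step.
batch_size = 256

# The same throughput expressed as client-side batches (SGD steps) per second.
batches_per_sec = items_per_sec / batch_size
print(f"{batches_per_sec:.0f} batches/s")  # ~176 batches/s, not 45k
```

So if the client counts one "sample" per batch of 256, a server-side rate of 45k items/s corresponds to roughly 176 client-side samples per second.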
I would also encourage you to look into using the TrajectoryDataset instead of Client.sample(), as the latter has higher Python overhead. This recommendation holds even if you're using PyTorch (we use it with JAX).
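A minimal sketch of the dataset-based sampling path, assuming a server reachable at localhost:8000 and a table named 'my_table' (both names are illustrative, adapt them to your setup; this needs a live Reverb server to run):

```python
import reverb

# Build a tf.data pipeline that samples from the table in the background.
# max_in_flight_samples_per_worker keeps requests in flight ahead of
# consumption, hiding gRPC round-trip latency.
dataset = reverb.TrajectoryDataset.from_table_signature(
    server_address='localhost:8000',
    table='my_table',
    max_in_flight_samples_per_worker=2 * 256)

# Batch on the TF side, then iterate. Each sample.data is a nest of
# tf.Tensors; call .numpy() on them to hand the arrays to PyTorch.
for sample in dataset.batch(256).take(64):
    batch = sample.data
```

Because the dataset prefetches and batches inside the TF runtime, the per-item Python overhead of repeated Client.sample() calls is avoided.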
Hi, thanks for your answer. I understand your point but still, in the end, sampling the data is faster when the sampler and the server share the same process (= without gRPC) than when they are in 2 separate processes. At least, that is the conclusion of my tests...
Are you saying it should not?
There will be some overhead when going through gRPC but this should only be a problem if you are pushing a lot of QPS. I would be surprised to see you being affected by this at 45k inserts/s, how many samples/s are you seeing?
From a client-side perspective, 64 samplings of 256 'items' (quadruplets) is ~8 times slower with gRPC (= when server and sampler do not share the same process) than without, for a constant server_info current_size of 4000.
Am I the only one having these numbers?
Is the data being added to the table in the background? Could sampling be slow due to the setting of the rate limiter, which could block sampling? What is the CPU usage of Reverb? If it is low, this is most likely a rate_limiter or network-throughput issue.
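For reference, here is a hedged sketch of the two rate-limiter configurations being contrasted (table and field values are illustrative, not from this thread):

```python
import reverb

# MinSize(1) only blocks sampling until the table holds at least one item;
# after that, samplers are never throttled.
table = reverb.Table(
    name='my_table',
    sampler=reverb.selectors.Uniform(),
    remover=reverb.selectors.Fifo(),
    max_size=1_000_000,
    rate_limiter=reverb.rate_limiters.MinSize(1))

# By contrast, SampleToInsertRatio couples sampling speed to insertion
# speed and WILL block samplers that run ahead of the writers:
#
# reverb.rate_limiters.SampleToInsertRatio(
#     samples_per_insert=32.0,
#     min_size_to_sample=1,
#     error_buffer=64)
```

If sampling stalls while Reverb's CPU usage is low, checking which of these two limiter styles the table uses is a quick first diagnostic.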
Closing this one as this is an old issue. Please reopen if this is still a problem.
Hello, I've been playing with Reverb for a few days and I noticed that sampling is terribly slow compared to insertions.
Although the server_info tells me that I'm sampling at a rate of approximately 45k items per sec, on the client side it's closer to 1 item per sec :-(
My client and server are running on 2 separate machines at the moment, but I also tried having both the client and the server on the same machine (and also in the same process), in vain. It's still very slow.
The stored data are quadruplets of numpy arrays (obs, actions, ...). The config of the server is the one found in the README, nothing fancy. The client retrieves data in batches of 256 num_samples, 64 times. I also tried to sample everything at once, with no effect. I'm using client.sample(table_name, num_samples) to sample. There are up to one million items in the table, but it is slow even with 2000 items. And I'm using PyTorch.
I'm a bit disappointed... Any ideas, recommendations?
Thanks!