Closed LukeLIN-web closed 1 year ago
What is the difference between 'auto_depatch' and 'fixed_depatch'?
Why do we need test_request_from_local_preparation in torch-quiver/examples/serving/reddit/reddit_serving.py?
The purpose of this function is to test the throughput and latency of the Quiver-serving system on a particular dataset by generating inputs with different batch sizes. From the system's performance under different PSGS values, we can determine the maximum PSGS (as mentioned in the paper) that still stays within a given latency bound. The same measurements also help decide whether to use the CPU or the GPU as the sampler.
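To make the idea concrete, here is a minimal sketch of that kind of profiling. It is not the code in reddit_serving.py: `profile_latency`, `max_psgs_within`, the `serve` callable, and the mapping from PSGS to batch are all hypothetical names chosen for illustration, and it assumes you already know the PSGS of each generated batch.

```python
import time
from typing import Callable, Dict, Optional, Sequence


def profile_latency(serve: Callable[[Sequence[int]], object],
                    batches: Dict[int, Sequence[int]]) -> Dict[int, float]:
    """Time one call of `serve` per batch.

    `batches` maps an (assumed, precomputed) PSGS value to the node IDs
    of a request batch with that PSGS. Returns seconds per PSGS value.
    """
    latency: Dict[int, float] = {}
    for psgs, batch in batches.items():
        start = time.perf_counter()
        serve(batch)
        latency[psgs] = time.perf_counter() - start
    return latency


def max_psgs_within(latency_by_psgs: Dict[int, float],
                    bound_s: float) -> Optional[int]:
    """Return the largest profiled PSGS whose latency is <= bound_s.

    Returns None if no profiled PSGS meets the bound.
    """
    within = [p for p, t in latency_by_psgs.items() if t <= bound_s]
    return max(within) if within else None
```

In practice you would probably repeat each measurement several times and look at a tail percentile rather than a single run, since one timing per batch is noisy.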
What is the difference between 'auto_depatch' and 'fixed_depatch'?
In auto_depatch mode, the system automatically decides whether to use the CPU or GPU as the graph sampler based on the PSGS of the current request batch and a preset threshold. In fixed_depatch mode, the system uses either the CPU or GPU as specified by the user.
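As a rough illustration of the two modes, the decision could look like the sketch below. This is not Quiver's actual implementation: `choose_sampler`, its parameters, and the mode strings "auto" and "fixed" are hypothetical, and the direction of the comparison (larger PSGS goes to the GPU) is an assumption made for the example.

```python
def choose_sampler(batch_psgs: float,
                   threshold: float,
                   mode: str = "auto",
                   fixed_device: str = "cpu") -> str:
    """Pick the sampler device for one request batch (illustrative only)."""
    if mode == "fixed":
        # Fixed mode: always use the device the user specified.
        return fixed_device
    if mode == "auto":
        # Auto mode: compare the batch's PSGS with the preset threshold.
        # Assumption: batches above the threshold are sampled on the GPU.
        return "gpu" if batch_psgs > threshold else "cpu"
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, with a threshold of 100, `choose_sampler(10, 100)` picks the CPU and `choose_sampler(1000, 100)` picks the GPU, while `choose_sampler(1000, 100, mode="fixed", fixed_device="cpu")` stays on the CPU regardless of PSGS.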
Thank you! It solves my problem.