Closed — linzehua closed this issue 5 years ago
If CPU inference meets the requirements (e.g. latency) and we do not aim to do batch inference, CPU would be preferable, as it is cheaper and has less overhead. Otherwise, we may go with GPU inference (especially in the case of batch inference).
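The rule of thumb above can be sketched as a tiny decision helper. This is a minimal illustration, not part of any library; the function name and flags are hypothetical:

```python
def choose_inference_device(cpu_meets_latency: bool, batch_inference: bool) -> str:
    """Pick an inference device per the heuristic above.

    Prefer CPU when it already meets the latency requirement and we are
    serving single requests (cheaper, less overhead); otherwise fall back
    to GPU, which pays off especially for batched inference.
    """
    if cpu_meets_latency and not batch_inference:
        return "cpu"
    return "gpu"

# Single-request serving where CPU latency is acceptable -> CPU.
print(choose_inference_device(cpu_meets_latency=True, batch_inference=False))
# Batch inference -> GPU, even if CPU latency would be acceptable.
print(choose_inference_device(cpu_meets_latency=True, batch_inference=True))
```

In a real deployment the two flags would come from profiling (measured p99 latency on CPU) and from the serving setup (whether requests are batched).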