YiLing28 closed this issue 4 years ago
streamer = Streamer(ManagedBertModel, batch_size=64, max_latency=0.1, worker_num=4, cuda_devices=(2, 3))
I basically wrote this following the flask_multigpu_example.py file.
Try running export CUDA_VISIBLE_DEVICES=2,3; that should solve your problem. That's what I did.
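For anyone who would rather set this from Python than from the shell, here is a minimal sketch. The helper name restrict_visible_gpus is hypothetical (it is not part of service-streamer), and the variable has to be set before any CUDA library initializes in the process. Note that CUDA renumbers the visible devices from 0, so physical GPUs 2 and 3 show up as logical devices 0 and 1. Whether cuda_devices should then use the physical or the renumbered indices depends on how the library assigns devices to workers, which this thread does not confirm.

```python
import os


def restrict_visible_gpus(physical_ids):
    """Hypothetical helper: limit this process to the given physical GPUs.

    Must be called before torch (or any other CUDA library) initializes CUDA,
    otherwise the setting has no effect on the already-created context.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in physical_ids)
    # CUDA renumbers visible devices starting from 0, so return the
    # logical indices that code in this process would see.
    return tuple(range(len(physical_ids)))


# Same effect as `export CUDA_VISIBLE_DEVICES=2,3` in the shell.
logical_ids = restrict_visible_gpus((2, 3))
print(os.environ["CUDA_VISIBLE_DEVICES"])
print(logical_ids)
```

Child processes started afterwards inherit the environment variable, unless they override it themselves.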