Closed rubby33 closed 4 years ago
In addition, I printed the corresponding logs, shown below:
run_forever begin... gpu_id: 1
run_forever begin... gpu_id: 0
run_forever begin... gpu_id: 0
ManagedModel gpud_id: 1
ManagedModel gpud_id: 0
ManagedModel gpud_id: 0
CUDA_VISIBLE_DEVICES: 1
[gpu worker: 3133 init model on gpu: 1
CUDA_VISIBLE_DEVICES: 0
CUDA_VISIBLE_DEVICES: 0
[gpu worker: 3134 init model on gpu: 0
[gpu worker: 3132 init model on gpu: 0
I added some print statements to the original ManagedModel. Different workers do call set_gpu_id with different values, but it does not seem to have any effect!
import os
from typing import List


class ManagedModel(object):
    def __init__(self, gpu_id=None):
        self.model = None
        self.gpu_id = gpu_id
        print("ManagedModel gpud_id:", self.gpu_id)
        self.set_gpu_id(self.gpu_id)

    @staticmethod
    def set_gpu_id(gpu_id=None):
        # Restrict this worker process to a single GPU.
        if gpu_id is not None:
            os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
            print("CUDA_VISIBLE_DEVICES:", os.environ["CUDA_VISIBLE_DEVICES"])

    def init_model(self, *args, **kwargs):
        raise NotImplementedError

    def predict(self, batch: List) -> List:
        raise NotImplementedError
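CUDA reads CUDA_VISIBLE_DEVICES when it is initialized in a process, so changing the variable after that point has no effect on that process. One way to rule out ordering problems while debugging is to put the variable into a child process's environment before its interpreter starts. The sketch below only illustrates that idea; `visible_devices_in_child` is a hypothetical helper and is not part of the Streamer library, and the thread does not confirm that ordering was the root cause here.

```python
import os
import subprocess
import sys


def visible_devices_in_child(gpu_id: int) -> str:
    """Start a fresh interpreter with CUDA_VISIBLE_DEVICES preset and return what it sees."""
    # Copy the parent environment and override only the one variable.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    child_code = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,  # requires Python 3.7+
        text=True,
        check=True,
    )
    return result.stdout.strip()


for gpu in (0, 1):
    print("child sees CUDA_VISIBLE_DEVICES =", visible_devices_in_child(gpu))
```

Because the variable is already set when the child starts, nothing the child imports can initialize CUDA before the restriction applies.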
Resolved.
Following the TextInfillingModel example code, I moved my BERT-based classification model into its own separate .py file as well.
The model is initialized as follows:
self.model.eval()
if torch.cuda.is_available():
    self.device = "cuda"
    print("model to cuda")
else:
    self.device = "cpu"
    print("model to cpu")
self.model.to(self.device)
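The ordering that this fix relies on can be sketched as follows: the worker sets CUDA_VISIBLE_DEVICES in its constructor, and the first CUDA-related call happens only later, inside init_model. This is a minimal sketch under assumptions; `LazyDeviceModel` is a hypothetical stand-in for a ManagedModel subclass, it loads no real model, and it falls back to "cpu" when torch is not installed.

```python
import os


class LazyDeviceModel:
    """Hypothetical stand-in for a ManagedModel subclass (illustration only)."""

    def __init__(self, gpu_id=None):
        # Set the device restriction first, before any CUDA call is made.
        if gpu_id is not None:
            os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
        self.model = None
        self.device = None

    def init_model(self):
        # The first CUDA query happens here, after the variable is set.
        try:
            import torch
            use_cuda = torch.cuda.is_available()
        except ImportError:
            use_cuda = False  # torch is not installed in this environment
        self.device = "cuda" if use_cuda else "cpu"
        return self.device


worker_model = LazyDeviceModel(gpu_id=0)
print("device:", worker_model.init_model())
```

With only one device visible to the process, the plain "cuda" device string refers to whichever GPU was assigned to that worker.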
First, the machine has two GPUs. I set worker_num=3 and cuda_devices=(0, 1); the code is as follows:
streamer = Streamer(SentenceManagedBertModel, batch_size=64, max_latency=0.1, worker_num=3, cuda_devices=(0, 1))
Problem description: after the service starts, all of the python processes are placed on GPU 0 and none on GPU 1. Very strange.
Thu Jun 18 14:25:51 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:02:00.0 Off |                  N/A |
| 40%   48C    P8    12W / 250W |   5222MiB / 11018MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:81:00.0 Off |                  N/A |
| 32%   44C    P8    24W / 250W |     10MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2980      C   python                                      1303MiB |
|    0      3132      C   ...iangwei/anaconda3/envs/py3.7/bin/python  1303MiB |
|    0      3133      C   ...iangwei/anaconda3/envs/py3.7/bin/python  1303MiB |
|    0      3134      C   ...iangwei/anaconda3/envs/py3.7/bin/python  1303MiB |
+-----------------------------------------------------------------------------+
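For comparison, the placement one would expect with worker_num=3 and cuda_devices=(0, 1) can be sketched as a round-robin mapping. This is an assumption about how the workers are distributed, not a quote of the Streamer implementation, although it agrees with the worker logs in this thread (workers 3132 and 3134 on gpu 0, worker 3133 on gpu 1). `assign_gpus` is a hypothetical helper.

```python
from typing import List, Sequence


def assign_gpus(worker_num: int, cuda_devices: Sequence[int]) -> List[int]:
    """Round-robin mapping from worker index to GPU id (illustrative only)."""
    if not cuda_devices:
        raise ValueError("cuda_devices must not be empty")
    return [cuda_devices[i % len(cuda_devices)] for i in range(worker_num)]


print(assign_gpus(3, (0, 1)))  # prints [0, 1, 0]
```

Under that assumption, two of the three worker processes should appear under GPU 0 and one under GPU 1 in nvidia-smi, rather than all of them under GPU 0.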