ShannonAI / service-streamer

Boosting your Web Services of Deep Learning Applications.
Apache License 2.0

Streamer workers are not correctly assigned across multiple GPUs #74

Closed rubby33 closed 4 years ago

rubby33 commented 4 years ago

First of all, the machine has two GPUs. I set worker_num=3 and cuda_devices=(0, 1); the code is as follows:

streamer = Streamer(SentenceManagedBertModel, batch_size=64, max_latency=0.1, worker_num=3, cuda_devices=(0, 1))

Problem description: after the service starts, all of the Python worker processes are placed on gpu 0 and none on gpu 1, which is strange.
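For reference, my understanding (an assumption about the implementation, but consistent with the gpu_id values printed in the log) is that service-streamer hands out cuda_devices round-robin, so worker i gets cuda_devices[i % len(cuda_devices)]. A small sketch of that expected assignment:

```python
# Sketch (assumption, not service-streamer's actual code) of the
# expected round-robin mapping from workers to GPUs.
def assign_gpus(worker_num, cuda_devices):
    return [cuda_devices[i % len(cuda_devices)] for i in range(worker_num)]

print(assign_gpus(3, (0, 1)))  # -> [0, 1, 0]
```

So with worker_num=3 over two cards, the expected layout is two workers on gpu 0 and one on gpu 1, not all three on gpu 0.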

Thu Jun 18 14:25:51 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:02:00.0 Off |                  N/A |
| 40%   48C    P8    12W / 250W |   5222MiB / 11018MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:81:00.0 Off |                  N/A |
| 32%   44C    P8    24W / 250W |     10MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2980      C   python                                     1303MiB  |
|    0      3132      C   ...iangwei/anaconda3/envs/py3.7/bin/python 1303MiB  |
|    0      3133      C   ...iangwei/anaconda3/envs/py3.7/bin/python 1303MiB  |
|    0      3134      C   ...iangwei/anaconda3/envs/py3.7/bin/python 1303MiB  |
+-----------------------------------------------------------------------------+

rubby33 commented 4 years ago

In addition, I printed the corresponding log output:

run_forever begin... gpu_id: 1
run_forever begin... gpu_id: 0
run_forever begin... gpu_id: 0
ManagedModel gpu_id: 1
ManagedModel gpu_id: 0
ManagedModel gpu_id: 0
CUDA_VISIBLE_DEVICES: 1
[gpu worker:  3133  init model on gpu: 1
CUDA_VISIBLE_DEVICES: 0
CUDA_VISIBLE_DEVICES: 0
[gpu worker:  3134  init model on gpu: 0
[gpu worker:  3132  init model on gpu: 0

These are print statements I added to the original ManagedModel. Each worker does receive a different gpu_id through set_gpu_id, but it appears to have no effect!
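A likely reason a correctly set gpu_id can still be ignored: CUDA_VISIBLE_DEVICES is only consulted when CUDA is first initialized in a process, and the value is then cached; assigning the variable afterwards is a no-op. A minimal sketch of that caching behavior, using a hypothetical FakeCuda stand-in (not a real torch or CUDA API):

```python
import os


class FakeCuda:
    """Hypothetical stand-in mimicking CUDA's behavior: the set of
    visible devices is read from the environment once, at first
    initialization, and cached for the life of the process."""

    def __init__(self):
        self._visible = None

    def init(self):
        # Only the value present at *first* init matters.
        if self._visible is None:
            self._visible = os.environ.get("CUDA_VISIBLE_DEVICES", "all")
        return self._visible


cuda = FakeCuda()
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
print(cuda.init())  # -> 0
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
print(cuda.init())  # -> still 0: changing the variable later has no effect
```

So if anything touches CUDA (for example a module-level model or torch call in the parent process) before a worker runs set_gpu_id, the later assignment is ignored and every worker inherits the same device.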

import os
from typing import List


class ManagedModel(object):
    def __init__(self, gpu_id=None):
        self.model = None
        self.gpu_id = gpu_id
        print("ManagedModel gpu_id:", self.gpu_id)
        self.set_gpu_id(self.gpu_id)

    @staticmethod
    def set_gpu_id(gpu_id=None):
        if gpu_id is not None:
            os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
            print("CUDA_VISIBLE_DEVICES:", os.environ["CUDA_VISIBLE_DEVICES"])

    def init_model(self, *args, **kwargs):
        raise NotImplementedError

    def predict(self, batch: List) -> List:
        raise NotImplementedError

rubby33 commented 4 years ago

Solved.

Following the TextInfillingModel example, I moved my BERT-based classification model into a separate .py file of its own.

Model initialization now works as follows: call self.model.eval() and choose the device inside init_model. The old self.model.to(self.device) call must not be used as before, otherwise all workers end up on the same gpu 0. The working version:

    if torch.cuda.is_available():
        self.device = "cuda"
        print("model to cuda")
    else:
        self.device = "cpu"
        print("model to cpu")

    self.model.to(self.device)

Note that the device string is "cuda", not "cuda:0": since each worker only sees one card through CUDA_VISIBLE_DEVICES, "cuda" resolves to that worker's own GPU.
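Putting the fix together, here is a minimal sketch of the worker model in its own file (my reconstruction, not the actual SentenceManagedBertModel; load_model is a hypothetical placeholder for the real BERT loading code, and in real use the class subclasses service_streamer's ManagedModel). The key point is that torch is only touched inside init_model, after the worker process has restricted CUDA_VISIBLE_DEVICES:

```python
# my_model.py -- the model lives in its own file, as in the
# TextInfillingModel example, so importing the web-server code
# never initializes torch/CUDA in the parent process.
from typing import List


class ManagedBertModel:
    """Sketch of the fixed worker model. In real use this subclasses
    service_streamer's ManagedModel; load_model() is a hypothetical
    stand-in for the actual BERT loading code."""

    def init_model(self):
        # torch is imported lazily: by the time a worker calls init_model,
        # set_gpu_id has already restricted CUDA_VISIBLE_DEVICES, so
        # "cuda" (not "cuda:0") maps to this worker's single visible GPU.
        try:
            import torch
            has_cuda = torch.cuda.is_available()
        except ImportError:  # keeps the sketch runnable without torch
            has_cuda = False
        self.device = "cuda" if has_cuda else "cpu"
        self.model = self.load_model()
        if self.model is not None:
            self.model.eval()
            self.model.to(self.device)

    def load_model(self):
        return None  # the real BERT loading code goes here

    def predict(self, batch: List) -> List:
        raise NotImplementedError
```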