Local GPU is too old, so even remote mode fails with an error

jacksong18 commented 3 years ago

My own computer is fairly old and PyTorch no longer supports its GPU, so I launched a Deep Learning AMI instance on AWS and got remote.py running there successfully:

$ python remote.py
remote server starting up on 0.0.0.0 port 14782
AI server starting up on 127.0.0.1 port 7479
wating for client.

But on the local machine PyTorch still fails right away, complaining that the graphics card's CUDA capability is too old.

(venv) E:\Dropbox\MajsoulAI>python main.py --remote_ip {my AWS IP}
E:\Dropbox\MajsoulAI\venv\lib\site-packages\torch\cuda\__init__.py:81: UserWarning:
    Found GPU0 Quadro K600 which is of cuda capability 3.0.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability that we support is 3.5.

  warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
Traceback (most recent call last):
  File "E:\Dropbox\MajsoulAI\main.py", line 747, in <module>
    MainLoop(isRemoteMode=True, remoteIP=args.remote_ip, level=level)
  File "E:\Dropbox\MajsoulAI\main.py", line 665, in MainLoop
    aiWrapper = AIWrapper()
  File "E:\Dropbox\MajsoulAI\main.py", line 71, in __init__
    super().__init__()
  File "E:\Dropbox\MajsoulAI\majsoul_wrapper\action\action.py", line 170, in __init__
    self.classify = Classify()
  File "E:\Dropbox\MajsoulAI\majsoul_wrapper\action\classifier.py", line 95, in __init__
    self.__call__(np.ones((32, 32, 3), dtype=np.uint8))  # load cache
  File "E:\Dropbox\MajsoulAI\majsoul_wrapper\action\classifier.py", line 102, in __call__
    _, predicted = torch.max(self.model(img), 1)
  File "E:\Dropbox\MajsoulAI\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\Dropbox\MajsoulAI\majsoul_wrapper\action\classifier.py", line 79, in forward
    x = self.pool(F.relu(self.conv1(x)))
  File "E:\Dropbox\MajsoulAI\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\Dropbox\MajsoulAI\venv\lib\site-packages\torch\nn\modules\conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "E:\Dropbox\MajsoulAI\venv\lib\site-packages\torch\nn\modules\conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: CUDA error: no kernel image is available for execution on the device

In remote mode, is there any way to skip this check?

747929791 commented 3 years ago

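PyTorch is only used by majsoul_wrapper, for image recognition and playing tiles; the AI itself does not use PyTorch. If CUDA can't be used, could you try changing device in majsoul_wrapper\action\classifier.py so it runs in pure CPU mode without CUDA?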
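
For anyone hitting the same error, here is a minimal sketch of what a CPU fallback could look like. It is not the actual code in classifier.py: the helper names pick_device and detect_device are hypothetical, and the 3.5 threshold is simply copied from the warning above, so it may differ for other PyTorch builds. The idea is to use CUDA only when a GPU is present and its compute capability is high enough, and to use the CPU otherwise.

```python
# Minimum compute capability, taken from the UserWarning in this thread.
# Assumption: other PyTorch builds may require a different minimum.
MIN_CAPABILITY = (3, 5)


def pick_device(cuda_available, capability, min_capability=MIN_CAPABILITY):
    """Return "cuda" only if a GPU is present and new enough, else "cpu".

    capability is a (major, minor) tuple, or None when no GPU was found.
    """
    if (
        cuda_available
        and capability is not None
        and tuple(capability) >= tuple(min_capability)
    ):
        return "cuda"
    return "cpu"


def detect_device():
    """Query PyTorch for the first GPU and decide which device to use."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed at all: nothing to run on a GPU anyway.
        return "cpu"
    if not torch.cuda.is_available():
        return "cpu"
    # get_device_capability returns (major, minor), e.g. (3, 0) for a Quadro K600.
    return pick_device(True, torch.cuda.get_device_capability(0))


if __name__ == "__main__":
    print(pick_device(True, (3, 0)))  # the GPU from this issue: too old
    print(detect_device())
```

The result can be passed to torch.device(...), and the model and input tensors moved to it with .to(device). If you only want the quick fix suggested above, hard-coding torch.device("cpu") should have the same effect on this machine.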

jacksong18 commented 3 years ago

That change fixed this problem, but now I've run into a different one. I'll open a separate issue for it.

747929791 / MajsoulAI #21