kevaday / alphazero-general

A fast, generalized, and modified implementation of DeepMind's distinguished AlphaZero in PyTorch.
MIT License
66 stars, 21 forks

Running cpuonly on windows gives "pinned memory requires CUDA" #2

Closed: EngrStudent closed this issue 3 years ago

EngrStudent commented 3 years ago

I don't have a fancy gpu on this computer.

I used these commands to create my conda environment:

conda create --name agz_kevaday
conda activate agz_kevaday
conda install pytorch torchvision torchaudio cpuonly -c pytorch
conda install -c anaconda numpy cython 
conda install -c conda-forge tensorboard tensorboardx choix

I navigated to the main alphazero-general directory and then executed this: python -m alphazero.envs.tictactoe.train

Here is the output, with user names redacted:

Because of batching, it can take a long time before any games finish.
------ITER 1------
Warmup: random policy and value
Traceback (most recent call last):
  File "C:\Users\_redacted_\Anaconda3\envs\agz_kevaday\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\_redacted_\Anaconda3\envs\agz_kevaday\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\envs\tictactoe\train.py", line 32, in <module>
    c.learn()
  File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\Coach.py", line 180, in learn
    self.generateSelfPlayAgents()
  File "C:\Users\_redacted_\Documents\Personal\alphazero-general\alphazero\Coach.py", line 226, in generateSelfPlayAgents
    self.input_tensors[i].pin_memory()
RuntimeError: Pinned memory requires CUDA. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason.  The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -INCLUDE:?warp_size@cuda@at@@YAHXZ in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols.  You can check if this has occurred by using link on your binary to see if there is a dependency on *_cuda.dll library.

Given the error, I think the code doesn't work on a CPU-only install: Coach.py calls pin_memory() in generateSelfPlayAgents, and the error says that pinned memory requires CUDA.

kevaday commented 3 years ago

Thank you for reporting this error. It seems I overlooked CPU-only support in certain places. I will fix this as soon as I can.

kevaday commented 3 years ago

@EngrStudent I just pushed a fix. Let me know if it works.
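
For anyone hitting the same error on a CPU-only install, here is a minimal sketch of one way to guard the call. It is not necessarily the fix that was pushed to the repository. The helper name maybe_pin and the tensor shapes are hypothetical; only torch.cuda.is_available() and Tensor.pin_memory() are real PyTorch calls.

```python
import torch


def maybe_pin(tensor: torch.Tensor) -> torch.Tensor:
    """Pin host memory only when CUDA is usable; otherwise return the tensor unchanged.

    Hypothetical helper for illustration, not taken from Coach.py.
    """
    if torch.cuda.is_available():
        # pin_memory() returns a tensor backed by page-locked memory.
        return tensor.pin_memory()
    # On a CPU-only build, skip pinning so no CUDA functionality is touched.
    return tensor


# Hypothetical stand-in for the per-agent input buffers (shapes are assumed).
input_tensors = [torch.zeros(4, 3, 3) for _ in range(2)]
input_tensors = [maybe_pin(t) for t in input_tensors]
```

Note that pin_memory() returns a tensor and does not pin in place, so the result has to be assigned back for the pinning to have any effect.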