andersbll / cudarray

CUDA-based NumPy
MIT License

./include/cudarray/common.hpp:95: an illegal memory access was encountered #77

AmmarkoV opened this issue 7 years ago

AmmarkoV commented 7 years ago

While trying to run the examples with "python test.py", I get the following output:

True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
False
True
False
True
True
(3.010663168391317, 3.009093)
(9.9961403059886251, 10.000256)
(0.49978761016193529, 0.49983668)
(0.28849205059485095, 0.28868541)
(9.9937232616873413, 10.004809)
(11.544129235300824, 11.543992)
Traceback (most recent call last):
  File "test.py", line 494, in <module>
    run()
  File "test.py", line 488, in run
    test_reduce()
  File "test.py", line 347, in test_reduce
    print(np.allclose(c_np, np.array(c_ca)))
  File "/usr/local/lib/python2.7/dist-packages/cudarray-0.1.dev0-py2.7-linux-x86_64.egg/cudarray/cudarray.py", line 42, in __array__
    self._data.to_numpy(np_array)
  File "cudarray/wrap/array_data.pyx", line 24, in cudarray.wrap.array_data.ArrayData.to_numpy (./cudarray/wrap/array_data.cpp:1791)
  File "cudarray/wrap/cudart.pyx", line 12, in cudarray.wrap.cudart.cudaCheck (./cudarray/wrap/cudart.cpp:970)
ValueError: an illegal memory access was encountered
terminate called after throwing an instance of 'std::runtime_error'
  what():  ./include/cudarray/common.hpp:95: an illegal memory access was encountered

I think this is the same issue as https://github.com/andersbll/neural_artistic_style/issues/48 and https://github.com/andersbll/neural_artistic_style/issues/21.

The nvidia-smi output is the following, so the GPU has 1.5 GB of memory in total; the images I am trying to run are 200x130 and 200x277.

Sat Feb  4 16:24:24 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 670M    Off  | 0000:01:00.0     N/A |                  N/A |
|  0%   50C    P0    N/A /  N/A |      4MiB /  1474MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+

Any help is welcome :)

AmmarkoV commented 7 years ago

Tried it again on a GTX 970 GPU with 4 GB RAM and got the same behaviour, even after resizing images/skrik.jpg to 288x400 and using a 300x200 input.

I get the exact same behaviour from python test.py.

At peak, this is what nvidia-smi prints: about 0.6 GB of 4 GB in use, so memory is not the issue.

Thu Feb 23 18:07:44 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970     Off  | 0000:01:00.0      On |                  N/A |
| 29%   66C    P2    90W / 163W |    638MiB /  4028MiB |     50%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1226    G   /usr/lib/xorg/Xorg                             125MiB |
|    0      2056    G   /usr/lib/firefox/plugin-container                2MiB |
|    0      2195    C   python                                         507MiB |
+-----------------------------------------------------------------------------+

GPU 0000:01:00.0: Detected Critical Xid Error

petrsmid commented 7 years ago

Reducing the size of the a_np array worked for me. Use e.g. size 256 instead of 1024: a_np = np.random.normal(size=(256,))