nguyen1j / py-fcm

Automatically exported from code.google.com/p/py-fcm

test_hdp.py throws threadSafeInit error when GPUs are enabled #25

GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1.  Comment out line 17 of test_hdp.py to disable the CPU setting (by the way,
I think the comment on that line has a typo: it says "use gpu")
2.  Uncomment line 18 of test_hdp.py to enable two GPU devices (my machine has 
2 Tesla GPUs)
3.  python test_hdp.py

Here is the output:
$ python test_hdp.py
starting GPU enabled MCMC
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dpmix/gpuworker.py", line 77, in <module>
    gutil.threadSafeInit(dev_num)
AttributeError: 'module' object has no attribute 'threadSafeInit'

What version of the product are you using? On what operating system?
This is on Ubuntu 12.04.  I believe I'm using the latest version of all of 
py-fcm's dependencies.  I installed gpustats from the latest code on github 
earlier today.

Original issue reported on code.google.com by erin.sim...@gmail.com on 5 Dec 2013 at 7:01

GoogleCodeExporter commented 8 years ago
What was device set to?  Device should be either an integer representing the
card, a list of integers, None (default to first device) or False (no GPU
support).  Commenting out line 17 would leave device undefined.
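
As a rough illustration of the accepted values listed above, here is a minimal sketch of how such a device argument could be normalized. The helper name normalize_device is hypothetical and is not part of py-fcm or dpmix; it only mirrors the four cases described in this comment.

```python
def normalize_device(device=None):
    """Return a list of GPU indices, or None when GPU support is disabled.

    Hypothetical helper for illustration only; not part of py-fcm or dpmix.
    """
    # bool is a subclass of int, so handle False/True before the int branch.
    if device is False:
        return None  # no GPU support
    if device is True:
        raise ValueError("device=True is ambiguous; use an int, a list, None or False")
    if device is None:
        return [0]  # default to the first device
    if isinstance(device, int):
        return [device]  # a single card
    devices = list(device)
    if not devices or not all(
        isinstance(d, int) and not isinstance(d, bool) for d in devices
    ):
        raise ValueError("device list must contain only integers: %r" % (device,))
    return devices
```

With this sketch, leaving device undefined in the script would still raise a NameError before the helper is ever called, which is a different failure from the AttributeError reported above.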

Also, what gpustats library did you install?  The latest version
(git@github.com:dukestats/gpustats.git) has the gpustats.util package.

Original comment by Jacob.Frelinger@gmail.com on 5 Dec 2013 at 10:19

GoogleCodeExporter commented 8 years ago
>What was device set to? 
device = [0,1]

> Also what gpustats library did you install?
I installed from https://github.com/dukestats/gpustats/archive/master.zip which 
I downloaded yesterday.  It contains gpustats/util.py.  I can import 
gpustats.util when running python from the console.

FWIW, here is the output of nvidia-xconfig --query-gpu-info:

Number of GPUs: 3

GPU #0:
  Name      : GeForce 9500 GT
  PCI BusID : PCI:1:0:0

  Number of Display Devices: 1

  Display Device 0 (CRT-1):
      EDID Name             : DELL 1702FP
      Minimum HorizSync     : 30.000 kHz
      Maximum HorizSync     : 80.000 kHz
      Minimum VertRefresh   : 56 Hz
      Maximum VertRefresh   : 76 Hz
      Maximum PixelClock    : 140.000 MHz
      Maximum Width         : 1280 pixels
      Maximum Height        : 1024 pixels
      Preferred Width       : 1280 pixels
      Preferred Height      : 1024 pixels
      Preferred VertRefresh : 60 Hz
      Physical Width        : 340 mm
      Physical Height       : 270 mm

GPU #1:
  Name      : Tesla C1060
  PCI BusID : PCI:2:0:0

  Number of Display Devices: 0

GPU #2:
  Name      : Tesla C1060
  PCI BusID : PCI:3:0:0

  Number of Display Devices: 0

Original comment by erin.sim...@gmail.com on 5 Dec 2013 at 10:45

GoogleCodeExporter commented 8 years ago
When you import gpustats.util, what does 'dir(gpustats.util)' return?
threadSafeInit should be there...
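
A slightly more targeted version of that check could look like the sketch below. The function name describe_module is made up for this example; it reports which file a module was actually loaded from and whether a given attribute exists, which helps when more than one copy of a package may be installed.

```python
import importlib


def describe_module(module_name, attribute):
    """Import module_name and return (file it was loaded from, attribute present?).

    Hypothetical helper for illustration; it uses only the standard library.
    """
    module = importlib.import_module(module_name)
    # __file__ shows which installed copy Python picked up; some built-in
    # modules have no __file__, hence the default of None.
    return getattr(module, "__file__", None), hasattr(module, attribute)


# Example for this issue (requires gpustats to be installed):
#   path, found = describe_module("gpustats.util", "threadSafeInit")
#   print(path, found)
```

If the reported path points somewhere other than the copy you just installed, an older installation is probably being imported instead.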

CC'ing Andrew Cron who wrote a bunch of the gpustats code.

Original comment by Jacob.Frelinger@gmail.com on 9 Dec 2013 at 6:08

GoogleCodeExporter commented 8 years ago
Thanks, that code helped me troubleshoot.  It looks like the gpustats that was
installed globally was an older version that does not have threadSafeInit. I
was assuming gpustats's setup.py would overwrite the existing installation, but
it doesn't appear to.  Sorry for the wild goose chase.
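
For anyone hitting the same problem, here is a small sketch for spotting a stale copy of a package. The helper find_package_copies is hypothetical; it walks the search path in order and lists every directory or .py file with the package's name. It is a simplification that ignores eggs, zip archives and .pth files.

```python
import os
import sys


def find_package_copies(package_name, search_path=None):
    """List candidate locations of package_name in search-path order.

    Hypothetical helper for illustration. The first entry is normally the
    copy that `import` uses; any later entries are shadowed by it.
    """
    if search_path is None:
        search_path = sys.path
    copies = []
    for entry in search_path:
        # An empty entry in sys.path stands for the current directory.
        base = os.path.join(entry or os.curdir, package_name)
        if os.path.isdir(base) and base not in copies:
            copies.append(base)
        module_file = base + ".py"
        if os.path.isfile(module_file) and module_file not in copies:
            copies.append(module_file)
    return copies


# Example for this issue:
#   for location in find_package_copies("gpustats"):
#       print(location)
```

More than one location in the result suggests that an older installation may be shadowing the newer one.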

For posterity, here's the output using the globally installed gpustats...

Python 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gpustats.util
>>> dir(gpustats.util)
['DeviceInfo', 'HALF_WARP', '__builtins__', '__doc__', '__file__', '__name__', 
'__package__', '_dev_attr', '_next_pow2', 'compute_shmem', 'drv', 'get_boxes', 
'next_multiple', 'np', 'pad_data', 'prep_ndarray', 'pymc_dist', 'random_cov', 
'tune_blocksize', 'unvech']

I was confused because I had an old version of gpustats installed, and "sudo 
python setup.py install" did not overwrite the package in /usr/local/lib:

 ~/python/gpustats-master$ sudo python setup.py install
[sudo] password for esimonds:
running install
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler 
options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler 
options
running build_py
running install_lib
running install_data
running install_egg_info
Removing /usr/local/lib/python2.7/dist-packages/gpustats-0.0.1.egg-info
Writing /usr/local/lib/python2.7/dist-packages/gpustats-0.0.1.egg-info
running install_clib
customize UnixCCompiler

I manually deleted the package in 
/usr/local/lib/python2.7/dist-packages/gpustats and reinstalled gpustats from 
github successfully:

$ sudo python ./setup.py install
running install
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler 
options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler 
options
running build_py
running install_lib
creating /usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/compat.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/pdfs.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/multigpu.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/__init__.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/sampler.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/codegen.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/util.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
copying build/lib.linux-x86_64-2.7/gpustats/kernels.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/compat.py to 
compat.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/pdfs.py to 
pdfs.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/multigpu.py to 
multigpu.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/__init__.py to 
__init__.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/sampler.py to 
sampler.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/codegen.py to 
codegen.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/util.py to 
util.pyc
byte-compiling /usr/local/lib/python2.7/dist-packages/gpustats/kernels.py to 
kernels.pyc
running install_data
creating /usr/local/lib/python2.7/dist-packages/gpustats/cufiles
copying gpustats/cufiles/support.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/cpustub.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/mvcaller.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/univcaller.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/transpose.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/sample_discrete_logged.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/sampleFromMeasureMedium.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
copying gpustats/cufiles/sample_discrete.cu -> 
/usr/local/lib/python2.7/dist-packages/gpustats/cufiles/
creating /usr/local/lib/python2.7/dist-packages/gpustats/tests
copying gpustats/tests/test_samplers.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats/tests/
copying gpustats/tests/test_pdfs.py -> 
/usr/local/lib/python2.7/dist-packages/gpustats/tests/
running install_egg_info
Writing /usr/local/lib/python2.7/dist-packages/gpustats-0.0.1.egg-info
running install_clib
customize UnixCCompiler

Now Python finds threadSafeInit:

Python 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gpustats.util
>>> dir(gpustats.util)
['DeviceInfo', 'GPUarray_order', 'GPUarray_reshape', 'HALF_WARP', 'LA', 
'SourceModule', '__builtins__', '__doc__', '__file__', '__name__', 
'__package__', '_dev_attr', '_get_transpose_kernel', '_next_pow2', 
'_transpose', 'clean_all_contexts', 'compute_shmem', 
'context_dependent_memoize', 'drv', 'get_boxes', 'get_cufiles_path', 
'gpuarray', 'info', 'next_multiple', 'np', 'pad_data', 'pad_data_mult16', 
'prep_ndarray', 'pycuda', 'random_cov', 'threadSafeInit', 'transpose', 
'tune_blocksize', 'unvech']

And now the demo code works properly:

$ head -20 test_hdp.py
import glob
import numpy as np
import numpy.random as npr
import fcm
import fcm.statistics as stats
import pylab
import copy

if __name__ == '__main__':

    # HDPGMM settings
    # IMPORTANT: burnin and niter are set to extremely low values for use as a test case on non-CUDA machines
    nclusts = 128 # upper limit for number of components in truncated DP
    niter = 2000 # number of iterations to use for posterior distribution (manusript uses 2000)
    burnin = 20000 # number of burnin iterations (manuscript uses 20000)

    #device = False # use gpu
    device = [2] # this seems to use GPU devices 2 and 3, which is what I want
    verbose = 10 # report every 10 MCMC steps
    seed = 201 # random number seed

$ python test_hdp.py
starting GPU enabled MCMC
-20000
-19990

$ nvidia-smi
Mon Dec  9 11:31:06 2013
+------------------------------------------------------+
| NVIDIA-SMI 5.319.37   Driver Version: 319.37         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 9500 GT     Off  | 0000:01:00.0     N/A |                  N/A |
|100%   41C  N/A     N/A /  N/A |       32MB /  1023MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla C1060         Off  | 0000:02:00.0     N/A |                  N/A |
| 35%   58C  N/A     N/A /  N/A |       55MB /  4095MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla C1060         Off  | 0000:03:00.0     N/A |                  N/A |
| 35%   57C  N/A     N/A /  N/A |       86MB /  4095MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

Thanks for your help, Jacob.  I'll go play with some real data now.

Original comment by erin.sim...@gmail.com on 9 Dec 2013 at 7:32

GoogleCodeExporter commented 8 years ago
Hooray! Closing bug!  Good luck with the real data!

Original comment by Jacob.Frelinger@gmail.com on 9 Dec 2013 at 7:34