maweigert / spimagine

GPU accelerated volume rendering / processing in Python
BSD 3-Clause "New" or "Revised" License

Compile errors in convolve2d kernel (on Macbook Pro Retina 2012) #9

VolkerH closed this issue 7 years ago

VolkerH commented 7 years ago

On my MacBook Pro Retina (2012) I receive build errors from the OpenCL compiler for the following statement in convolve_2d.cl (it occurs twice, at lines 88 and 127):

    float val = exp(-5.f*(ht-Nh/2.)*(ht-Nh/2.)/Nh/Nh);

When I change 2. to 2.f the convolve kernel compiles, but I'm not sure the results are correct.
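On the question of whether the results stay correct: in OpenCL C an unsuffixed literal such as `2.` has type double, so it promotes the whole argument of `exp` to double, while `2.f` keeps the expression in single precision. The mathematical value is unchanged; only the rounding differs. The following Python sketch emulates both variants to get a feel for the size of that difference. It is an emulation under assumptions, not the kernel itself: the range `ht = 0 .. Nh-1` is assumed, the helper names are made up for illustration, and the effect of `-cl-fast-relaxed-math` (which maps `exp` to a less precise native version) is not modelled.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def weight_double(ht, Nh):
    """The kernel expression evaluated in double precision (the `2.` variant)."""
    return math.exp(-5.0 * (ht - Nh / 2.0) * (ht - Nh / 2.0) / Nh / Nh)


def weight_single(ht, Nh):
    """The same expression with every intermediate rounded to float (the `2.f` variant)."""
    d = f32(ht - f32(Nh / 2.0))
    v = f32(f32(-5.0 * d) * d)
    v = f32(f32(v / Nh) / Nh)
    return f32(math.exp(v))


def max_abs_difference(Nh):
    """Largest deviation between the two variants over ht = 0 .. Nh-1 (assumed range)."""
    return max(abs(weight_double(ht, Nh) - weight_single(ht, Nh)) for ht in range(Nh))


if __name__ == "__main__":
    for Nh in (3, 5, 9, 31):
        print(Nh, max_abs_difference(Nh))
```

Under these assumptions the two variants agree to within single-precision rounding, which is far below what is visible in a rendered image.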

--------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-2681083cedf7> in <module>()
      6
      7 # render the data and returns the widget
----> 8 w = volshow(data)
      9
     10 # manipulate the render states, e.g. rotation and colormap

/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/gui/volshow.py in volshow(data, autoscale, stackUnits, blocking, cmap, raise_window)
    204        num = 1
    205
--> 206    window = volfig(num)
    207    logger.debug("volfig: %s s " % (time() - t))
    208    t = time()

/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/gui/volshow.py in volfig(num, raise_window)
     80        app.volfigs.pop(num)
     81    else:
---> 82        window = MainWidget()
     83
     84    #make num the last window

/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/gui/mainwidget.py in __init__(self, parent)
    101        self.initActions()
    102
--> 103        self.glWidget = GLWidget(self)
    104        self.glWidget.setTransform(self.transform)
    105

/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/gui/glwidget.py in __init__(self, parent, N_PREFETCH, **kwargs)
    103
    104        self.renderer = VolumeRenderer((spimagine.config.__DEFAULTWIDTH__,
--> 105                                        spimagine.config.__DEFAULTWIDTH__))
    106
    107        self.renderer.set_projection(mat4_perspective(60, 1., .1, 100))

/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/volumerender.py in __init__(self, size)
    112                                "-cl-fast-relaxed-math",
    113                                "-cl-unsafe-math-optimizations",
--> 114                                "-cl-mad-enable"])
    115        try:
    116            pass

/Users/volkerhilsenstein/anaconda3/lib/python3.6/site-packages/gputools/core/oclprogram.py in __init__(self, file_name, src_str, build_options, dev)
     38        self._kernel_dict = {}
     39        super(OCLProgram,self).__init__(self._dev.context,src_str)
---> 40        self.build(options = build_options)
     41
     42    def run_kernel(self, name, global_size, local_size,*args,**kwargs):

/Users/volkerhilsenstein/anaconda3/lib/python3.6/site-packages/pyopencl/__init__.py in build(self, options, devices, cache_dir)
    436                        self._context, self._source, options_bytes, devices,
    437                        cache_dir=cache_dir, include_path=include_path),
--> 438                    options_bytes=options_bytes, source=self._source)
    439
    440            del self._context

/Users/volkerhilsenstein/anaconda3/lib/python3.6/site-packages/pyopencl/__init__.py in _build_and_catch_errors(self, build_func, options_bytes, source)
    471        # Python 3.2 outputs the whole list of currently active exceptions
    472        # This serves to remove one (redundant) level from that nesting.
--> 473        raise err
    474
    475    # }}}

RuntimeError: clBuildProgram failed: BUILD_PROGRAM_FAILURE -

Build on <pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>:

In file included from <program source>:16:
In file included from /Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/volume_kernel.cl:13:
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/utils.cl:41:5: warning: no previous prototype for function 'intersectBox'
int intersectBox(float4 r_o, float4 r_d, float4 boxmin, float4 boxmax, float *tnear, float *tfar)
    ^
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/utils.cl:63:8: warning: no previous prototype for function 'mult'
float4 mult(__constant float* M, float4 v){
       ^
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/utils.cl:74:30: warning: type specifier missing, defaults to 'int'
__kernel void foo(__global * bar){}
                  ~~~~~~~~   ^
In file included from <program source>:16:
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/volume_kernel.cl:110:8: warning: unused variable 'entropy'
  uint entropy = (uint)( 6779514*length(orig) + 6257327*length(direc) );
       ^
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/volume_kernel.cl:279:8: warning: unused variable 'entropy'
  uint entropy = (uint)( 6779514*length(orig) + 6257327*length(direc) );
       ^
In file included from <program source>:18:
In file included from /Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/iso_kernel.cl:13:
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/convolve_2d.cl:14:7: warning: unused variable 'Ny'
  int Ny = get_global_size(1);
      ^
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/convolve_2d.cl:88:19: error: call to '__fast_relax_exp' is ambiguous
      float val = exp(-5.f*(ht-Nh/2.)*(ht-Nh/2.)/Nh/Nh);
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4496:22: note: expanded from macro 'exp'
    #define exp(__x) __fast_relax_exp(__x)
                     ^~~~~~~~~~~~~~~~
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:392:27: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float __OVERLOAD__ _name(float x) { return _native_name(x); } \
                          ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:393:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float2 __OVERLOAD__ _name(float2 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:394:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float3 __OVERLOAD__ _name(float3 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:395:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float4 __OVERLOAD__ _name(float4 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:396:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float8 __OVERLOAD__ _name(float8 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:397:29: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float16 __OVERLOAD__ _name(float16 x){ return _native_name(x); }
                            ^
In file included from <program source>:18:
In file included from /Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/iso_kernel.cl:13:
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/convolve_2d.cl:127:20: error: call to '__fast_relax_exp' is ambiguous
       float val = exp(-5.f*(ht-Nh/2.)*(ht-Nh/2.)/Nh/Nh);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4496:22: note: expanded from macro 'exp'
    #define exp(__x) __fast_relax_exp(__x)
                     ^~~~~~~~~~~~~~~~
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:392:27: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float __OVERLOAD__ _name(float x) { return _native_name(x); } \
                          ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:393:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float2 __OVERLOAD__ _name(float2 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:394:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float3 __OVERLOAD__ _name(float3 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:395:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float4 __OVERLOAD__ _name(float4 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:396:28: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float8 __OVERLOAD__ _name(float8 x) { return _native_name(x); } \
                           ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:4494:30: note: candidate function
    __CLFN_FD_1FD_FAST_RELAX(__fast_relax_exp, native_exp, __cl_exp);
                             ^
/System/Library/Frameworks/OpenCL.framework/Versions/A/lib/clang/3.2/include/cl_kernel.h:397:29: note: expanded from macro '__CLFN_FD_1FD_FAST_RELAX'
inline float16 __OVERLOAD__ _name(float16 x){ return _native_name(x); }
                            ^
In file included from <program source>:18:
In file included from /Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/iso_kernel.cl:14:
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/occlusion.cl:21:18: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
  for(int i = 0;i<number_points;++i){
                ~^~~~~~~~~~~~~~
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/occlusion.cl:56:18: warning: comparison of integers of different signs: 'int' and 'uint' (aka 'unsigned int')
  for(int i = 0;i<number_points;++i){
                ~^~~~~~~~~~~~~~
In file included from <program source>:18:
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/iso_kernel.cl:54:10: warning: unused variable 'direc0'
  float4 direc0, direc;
         ^
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/iso_kernel.cl:278:10: warning: unused variable 'direc0'
  float4 direc0, direc;
         ^
/Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/iso_kernel.cl:409:18: warning: comparison of integers of different signs: 'uint' (aka 'unsigned int') and 'int'
  for(uint i=0; i<=maxBisect; i++) {
                ~^ ~~~~~~~~~

(options: -I /Users/volkerhilsenstein/Dropbox/Volker/Work/GitHub/spimagine/spimagine/volumerender/kernels/ -D maxSteps=200 -cl-finite-math-only -cl-fast-relaxed-math -cl-unsafe-math-optimizations -cl-mad-enable -I /Users/volkerhilsenstein/anaconda3/lib/python3.6/site-packages/pyopencl/cl)
(source saved as /var/folders/s0/k5wgfwws0_72w111cmrwp5l40000gp/T/tmp1suge2st.cl)
VolkerH commented 7 years ago

I just realized that this might be a MacBook-specific problem, as there are both an Intel HD Graphics 4000 (1536 MB) and an NVIDIA GeForce GT 650M (1024 MB) on board. It appears that the OpenCL code should be compiled for the NVIDIA but gets compiled for the Intel HD Graphics 4000.

maweigert commented 7 years ago

Actually, it's a compiler-specific (OpenCL implementation) problem: some compilers are very sensitive to double/float casting (e.g. on my Mac here the compiler never throws an error, yet on Linux it does). The error you mentioned got fixed in one of the latest commits. Could you do a

pip install git+https://github.com/maweigert/spimagine

and see if the error persists? And which Python version are you using?

VolkerH commented 7 years ago

Thanks for the quick reply. To answer the last question first: I just installed the most current Anaconda to test spimagine as I seem to have some insurmountable problems installing PyQt5 on my Python 2.7 installation.
Details:

[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin

I just did the pip install from git as you suggested. No compiler errors. The compiler output is suppressed, though (I expect that many of the warnings I observed earlier are still generated), and there is a complaint about a missing .spimagine folder in my home directory.

The code snippet from the readme.md pasted into iPython produces a box with a gradient and I can interact with it. So it appears to be working.

However, OpenCL appears to use the Intel on-chip GPU rather than the more powerful NVIDIA, at least according to the output on the console. I'm not sure how I can force it to use the NVIDIA; I will google for it.


<pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>
/Users/vXXXXn/anaconda3/lib/python3.6/site-packages/pyopencl/cffi_cl.py:1476: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
[Errno 2] No such file or directory: '/Users/vXXXXn/.spimagine'
<pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>
Qt WebEngine seems to be initialized from a plugin. Please set Qt::AA_ShareOpenGLContexts using QCoreApplication::setAttribute before constructing QGuiApplication.
/Users/vXXXXn/anaconda3/lib/python3.6/site-packages/pyopencl/cffi_cl.py:1476: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
INFO:spimagine.gui.glwidget | saving frame as scene.png
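
The CompilerWarning in the output above names the environment variable PYOPENCL_COMPILER_OUTPUT. A minimal sketch of enabling it from within a Python session, assuming the variable is read when the kernels are built (so it is set here before anything OpenCL-related is imported):

```python
import os

# Ask pyopencl to print the full compiler output instead of the short
# CompilerWarning. Set before importing pyopencl / spimagine so that it
# is certainly in place when the first kernel is built.
os.environ["PYOPENCL_COMPILER_OUTPUT"] = "1"

# Import spimagine (or pyopencl) only after this point.
```

From a shell, exporting the variable before starting Python should have the same effect.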
VolkerH commented 7 years ago

I guess it works :)

image
maweigert commented 7 years ago

Cool :)

As for using the NVIDIA card instead of the integrated GPU:

  1. Check which platform/device number it is with

    clinfo

    (e.g. on my MacBook, the NVIDIA is the second card on the "Apple" platform).

  2. Create a file ~/.spimagine with the content

    id_platform = 0
    id_device = 1

    where "id_device" points to the GPU to be used. (On my MacBook I have an Intel Iris and an NVIDIA, in that order, so "id_device=0" would be the Iris and "id_device=1" the NVIDIA.)
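
The config file from step 2 can also be written and checked from Python. The sketch below is only an illustration: `write_config` and `read_config` are hypothetical helpers, the file is written to a temporary directory rather than the real home directory, and the parsing shown (prepending a dummy section so that `configparser` accepts a section-less file) is not necessarily how spimagine or gputools read the file.

```python
import configparser
import os
import tempfile

# Content suggested in this thread for ~/.spimagine.
CONFIG_TEXT = "id_platform = 0\nid_device = 1\n"


def write_config(path, text=CONFIG_TEXT):
    """Write the key = value settings to the given path."""
    with open(path, "w") as f:
        f.write(text)


def read_config(path):
    """Parse a section-less key = value file into a dict of ints."""
    with open(path) as f:
        text = f.read()
    parser = configparser.ConfigParser()
    # configparser requires a section header, so prepend a dummy one.
    parser.read_string("[main]\n" + text)
    return {key: int(value) for key, value in parser["main"].items()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, ".spimagine")
        write_config(path)
        print(read_config(path))
```

To use it for real, the path would be `os.path.expanduser("~/.spimagine")` instead of the temporary one.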

VolkerH commented 7 years ago

Just to follow up with your last suggestion. clinfo -l gives me this:

+-- Device #0: Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
 +-- Device #1: HD Graphics 4000
 `-- Device #2: GeForce GT 650M

Playing with different id_device numbers in ~/.spimagine

So id_device = 1 seems to be the best option on my machine.

maweigert commented 7 years ago

Yes. This is the current behaviour: any non-GPU device (i.e. the Intel Core CPU) is ignored, so id_device=x corresponds to the (x+1)-th GPU device as listed by clinfo. In your case id_device=1 is the second GPU, which is the NVIDIA. Furthermore, in your second example you get two different devices listed because spimagine imports gputools first, and the latter uses its own default GPU (in your case the HD 4000). You can change that by creating a .gputools file in your home directory with the same setting, id_device=1.
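
The indexing rule described here (skip non-GPU devices, then count from zero) can be sketched in plain Python. The device list mirrors the `clinfo -l` output posted above, with the device types filled in by hand; `select_gpu` is a hypothetical helper for illustration, not a function from spimagine or gputools.

```python
# Devices in the order reported by `clinfo -l` in this thread: (name, type).
DEVICES = [
    ("Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz", "CPU"),
    ("HD Graphics 4000", "GPU"),
    ("GeForce GT 650M", "GPU"),
]


def select_gpu(devices, id_device):
    """Return the name of the GPU that id_device refers to, ignoring non-GPU devices."""
    gpus = [name for name, kind in devices if kind == "GPU"]
    if not 0 <= id_device < len(gpus):
        raise IndexError(f"id_device={id_device} but only {len(gpus)} GPU(s) found")
    return gpus[id_device]


if __name__ == "__main__":
    for i in range(2):
        print(i, select_gpu(DEVICES, i))
```

With this list, `id_device=0` selects the HD Graphics 4000 and `id_device=1` the GeForce GT 650M, matching the explanation above.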

VolkerH commented 7 years ago

Ok that makes sense. However, even with a ~/.gputools file specifying the same id_device=1 I receive this output when pasting the sample code:

<pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>
/Users/vXXXn/anaconda3/lib/python3.6/site-packages/pyopencl/cffi_cl.py:1476: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
<pyopencl.Device 'GeForce GT 650M' on 'Apple' at 0x1022700>
maweigert commented 7 years ago

Hm, this is weird. Could you run the following from the command line and check the output?

python -c "from gputools import init_device;init_device(id_platform=0,id_device=1)"
VolkerH commented 7 years ago
python -c "from gputools import init_device;init_device(id_platform=0,id_device=1)"
<pyopencl.Device 'HD Graphics 4000' on 'Apple' at 0x1024400>
/Users/vXXXXn/anaconda3/lib/python3.6/site-packages/pyopencl/cffi_cl.py:1476: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
<pyopencl.Device 'GeForce GT 650M' on 'Apple' at 0x1022700>
maweigert commented 7 years ago

Strange, I cannot reproduce this. What does your config file (cat ~/.gputools) look like?

VolkerH commented 7 years ago
cat ~/.gputools
id_device = 1

I also tried with id_platform = 0 as the first line (identical to my .spimagine file) ... makes no difference

maweigert commented 7 years ago

Could you install the latest version and see if the problem persists?

pip install -U --no-deps git+https://github.com/maweigert/gputools@develop
VolkerH commented 7 years ago

Hi, this latest change seems to make a difference: it now reports the GeForce twice and no longer the HD 4000.

pip install -U --no-deps git+https://github.com/maweigert/gputools@develop
Collecting git+https://github.com/maweigert/gputools@develop
  Cloning https://github.com/maweigert/gputools (to develop) to /private/var/folders/s0/k5wgfwws0_72w111cmrwp5l40000gp/T/pip-zr2lresr-build
Installing collected packages: gputools
  Found existing installation: gputools 0.2.2
    Uninstalling gputools-0.2.2:
      Successfully uninstalled gputools-0.2.2
  Running setup.py install for gputools ... done
Successfully installed gputools-0.2.2
mac-almf4:pyflann volkerhilsenstein$ python -c "from gputools import init_device;init_device(id_platform=0,id_device=1)"
<pyopencl.Device 'GeForce GT 650M' on 'Apple' at 0x1022700>
/Users/volkerhilsenstein/anaconda3/lib/python3.6/site-packages/pyopencl/cffi_cl.py:1476: CompilerWarning: Non-empty compiler output encountered. Set the environment variable PYOPENCL_COMPILER_OUTPUT=1 to see more.
  "to see more.", CompilerWarning)
<pyopencl.Device 'GeForce GT 650M' on 'Apple' at 0x1022700>
maweigert commented 7 years ago

Ok! Actually, that was a bug in the PyPI version, introduced when porting the code to be Python 3 compatible (where the semantics of configparser changed). I'll update the PyPI distribution. Thanks for finding that out!