nbia-astro / skeletor

Parallel PIC code written in Python and based on the skeleton codes provided by PICKSC
GNU General Public License v3.0
1 stars 0 forks source link

The code doesn't compile with Clang (LLVM) #115

Closed tobson closed 7 years ago

tobson commented 7 years ago

The error mentioned in the first point in #111 actually has nothing to with OpenMP. I just tried compiling the code with LLVM 3.9's Clang (which supports OpenMP) and I got the same error. I suspect the code right now simply doesn't compile with Clang.

tobson commented 7 years ago

@tberlok Is this right? Have you recently compiled the code with Clang?

tberlok commented 7 years ago

Just checked and reproduced this error with LLVM clang v. 3.8.1. There seems to be some information online about issues with inline functions and clang, see

tobson commented 7 years ago

I suspect Cython is probably at fault here and GCC just happens to be a little more lenient than Clang. Where Clang throws an error,

skeletor/cython/push_and_deposit.c:1928:8: error: 'inline' can only appear on
      functions
static CYTHON_INLINE int (*__pyx_f_8skeletor_6cython_17particle_boundary...

GCC gives merely a warning,

skeletor/cython/push_and_deposit.c:1947:28: warning: variable '__pyx_f_8skeletor_6cython_17particle_boundary_calculate_ihole_cdef' declared 'inline'
 static CYTHON_INLINE int (*__pyx_f_8skeletor_6cython_17particle_boundary_calculate_ihole_cdef)(struct __pyx_t_8skeletor_6cython_5types_particle_t, __Pyx_memviewslice, struct __pyx_obj_8skeletor_6cython_5types_grid_t *, int, int); /*proto*/
tberlok commented 7 years ago

It compiles and the tests work if one removes the word 'inline' in the .pxd files (deposit.pxd, particle_boundary.pxd & particle_push.pxd). The question is then if push_and_deposit becomes much slower by doing so.

tobson commented 7 years ago

As in C++, we probably need to put the entire definition of the inline functions into pxd (i.e. header) files, not just their declaration, see http://cython.readthedocs.io/en/latest/src/tutorial/pxd_files.html

tobson commented 7 years ago

GCC was apparently right to give a warning: with the inline functions' body in pxd files, the code is almost twice as fast if my benchmarks on my iMac 5K are to be believed. The following are timings of tests/test_fastwave.py with nperiods=10.0 and OMP_NUM_THREADS=1.

Commit                 / Compiler
------------------------------------------------------------
0407070 (inline-pxd)   / Clang 3.9.1
real    0m19.396s   |   user  0m19.324s   |   sys   0m0.046s
real    0m19.335s   |   user  0m19.266s   |   sys   0m0.044s
real    0m19.345s   |   user  0m19.278s   |   sys   0m0.043s
------------------------------------------------------------
0407070 (inline-pxd)   / GCC 6.3.0
real    0m19.007s   |   user  0m18.929s   |   sys   0m0.050s
real    0m19.027s   |   user  0m18.960s   |   sys   0m0.044s
real    0m19.141s   |   user  0m19.069s   |   sys   0m0.047s
------------------------------------------------------------
5632aaa (master)       / GCC 6.3.0
real    0m33.926s   |   user  0m33.844s   |   sys   0m0.054s
real    0m34.274s   |   user  0m34.166s   |   sys   0m0.067s
real    0m33.893s   |   user  0m33.806s   |   sys   0m0.056s