CEED / libCEED

CEED Library: Code for Efficient Extensible Discretizations
https://libceed.org
BSD 2-Clause "Simplified" License

CeedOklPath Error On CU Summit #71

Closed: jeremylt closed this issue 5 years ago

jeremylt commented 6 years ago

On CU's Summit, using Intel compilers, I get the following error when running the Fortran q-function and operator tests, for both ocl and omp:

+CEED-OCCA error @ /home/jeth8984/libceed/backends/occa/ceed-occa-okl.c:58 CeedOklPath_Occa

Strangely, there is no similar error on ex1 for ocl or omp.

The full text of the error is here: Error Output. The list of loaded modules is here: Modules List.

This appears to be related to the issue discussed in PR #68.

jedbrown commented 6 years ago

Is this still open? If so, better tag someone involved in that component?

jeremylt commented 6 years ago

It is still open - I haven't figured this out yet. @v-dobrev , @thilinarmtb do you have any thoughts on what may be going wrong?

jedbrown commented 6 years ago

Cc: @camierjs

camierjs commented 6 years ago

@jeremylt, could you post the output of a failing test, with export DBG=1 set before your run? That should enable the verbose debug mode, which may help us track down the CeedOklPath errors.

jeremylt commented 6 years ago

Sorry for the delay - here is what I got.

camierjs commented 6 years ago

With the t20-qfunction C test, you get the absolute path:

[CeedOklPath] Current OKL is /home/jeth8984/libceed/tests/t20-qfunction.okl
[CeedOklPath] Final OKL is /home/jeth8984/libceed/tests/t20-qfunction.okl

but with the Fortran t20 and t30 tests and your compiler, you don't get any path (__FILE__ gives you only the filename):

[CeedOklPath] Current OKL is t20-qfunction-f.okl
[CeedOklPath] Could NOT stat this OKL file: t20-qfunction-f.okl

This leads the CeedOklPath function to try the other locations, first the OCCA cache and then the libceed path:

[CeedOklPath] Trying occa://ceed/t20-qfunction-f.okl
[CeedOklPath] Stating /home/jeth8984/.occa/libraries/ceed/t20-qfunction-f.okl
[CeedOklPath] Could NOT stat OCCA cache: /home/jeth8984/.occa/libraries/ceed...
[CeedOklPath] Trying fron libceed: /home/jeth8984/libceed/lib/okl/t20-qfunction-f.okl
CeedOklPath_Occa Cannot find OKL file!

As a workaround, you should try adding this kernel to the OCCA cache.

According to Intel's manual, their preprocessor does not add the full path to __FILE__.
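To make the lookup order in the debug output above concrete, here is a minimal C sketch of that kind of fallback: try the name as given, then a cache directory, then a library directory. This is not the actual CeedOklPath code; the function names `file_exists` and `resolve_okl` and their parameters are hypothetical and only illustrate why a bare filename fails unless the file sits in one of the fallback locations.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if stat() succeeds on `path`, 0 otherwise. */
static int file_exists(const char *path) {
  struct stat st;
  return stat(path, &st) == 0;
}

/* Hypothetical helper, NOT the libCEED implementation.
   Tries, in order: `name` as given, `<cache_dir>/<name>`, `<lib_dir>/<name>`.
   Writes the first candidate that exists to `out` and returns 0;
   returns -1 (with `out` emptied) if no candidate exists. */
static int resolve_okl(const char *name, const char *cache_dir,
                       const char *lib_dir, char *out, size_t out_len) {
  const char *dirs[] = {NULL, cache_dir, lib_dir};
  for (size_t i = 0; i < 3; i++) {
    int n;
    if (dirs[i]) n = snprintf(out, out_len, "%s/%s", dirs[i], name);
    else n = snprintf(out, out_len, "%s", name);
    if (n < 0 || (size_t)n >= out_len) continue; /* candidate did not fit */
    if (file_exists(out)) return 0;
  }
  if (out_len) out[0] = '\0';
  return -1;
}
```

With an absolute path from __FILE__ the first candidate already succeeds; with only `t20-qfunction-f.okl` the result depends on the current working directory and on what has been cached.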

jeremylt commented 6 years ago

Caching fixed the problem. Do we want to mark the issue as resolved or do we want to find a fix that does not involve caching? @jedbrown @camierjs

jedbrown commented 6 years ago

We want a solution that doesn't require manual caching. I don't know if that is a libCEED issue or an OCCA issue.

v-dobrev commented 6 years ago

In OCCA there is an abstract base class fileOpener that is used for searching for .okl files. It is quite convenient; I used it in MFEM to define a URL handler for the mfem-occa:// prefix. See url_handler.hpp and url_handler.cpp. Basically, this URL handler can use any prefix, and it will search a list of paths to find the specified file. The list of directories is defined by the library/application (e.g. libceed), and it will also add directories from an environment variable (the name of the variable is controlled by the library/application), if defined.
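The search strategy described here (a library-defined directory list, extended by an optional environment variable) can be sketched in plain C as follows. This does not use the OCCA fileOpener API or the MFEM url_handler code; `search_path_list`, `find_kernel`, and the colon-separated list format are assumptions made only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: look for `name` in each directory of a
   colon-separated `list`. Returns 0 and fills `out` on the first hit,
   -1 if `list` is NULL or no directory contains the file. */
static int search_path_list(const char *list, const char *name,
                            char *out, size_t out_len) {
  if (!list) return -1;
  const char *p = list;
  while (*p) {
    size_t len = strcspn(p, ":"); /* length of the current entry */
    if (len > 0) {
      struct stat st;
      int n = snprintf(out, out_len, "%.*s/%s", (int)len, p, name);
      if (n >= 0 && (size_t)n < out_len && stat(out, &st) == 0) return 0;
    }
    p += len;
    if (*p == ':') p++; /* skip the separator */
  }
  return -1;
}

/* Search the library's built-in directories first, then the directories
   named by the environment variable `env_var`, if it is set. */
static int find_kernel(const char *builtin_dirs, const char *env_var,
                       const char *name, char *out, size_t out_len) {
  if (search_path_list(builtin_dirs, name, out, out_len) == 0) return 0;
  return search_path_list(getenv(env_var), name, out, out_len);
}
```

The point of this design is that a bare filename can still be resolved without manual caching, as long as the library or the user supplies the directory that contains it.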

jeremylt commented 6 years ago

Even with #134, I can confirm that this error still occurs, for t400-f, t500-f and t501-f.

jeremylt commented 5 years ago

After investigating, this issue is a matter of the ifort compiler failing to provide the absolute path when intended, rather than a libCEED issue. (See https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-using-fpp-preprocessor-directives: only the filename is provided, rather than the name + path provided by GCC.) This affects any use of the macro __FILE__ in a Fortran file to provide the filepath to a QFunction for JiT, for the CUDA or OCCA backends.

Is it worth fighting Intel's nonstandard implementation and getting ifort to behave correctly for our test suite, or should we close this?

jedbrown commented 5 years ago

How about if our makefiles define -DSOURCE_DIR=$(@D), so that our macros can use it when __INTEL_COMPILER is defined?

jeremylt commented 5 years ago

Source and target differ for us, but I am making it work with -DSOURCE_DIR='"$(abspath $(<D))"'. For Fortran tests, the JiT source is actually in a separate file in the source dir, so I am going to use this for all of our Fortran files for any compiler.
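A rough C analogue of this approach is sketched below: the build passes the absolute source directory as a string macro, and the code joins it with the bare filename instead of relying on __FILE__. The actual tests are Fortran files run through the preprocessor, so this only illustrates the idea; the helper name `qfunction_source_path` and the "." fallback are hypothetical.

```c
#include <stdio.h>

/* SOURCE_DIR is expected to come from the build system, e.g.
   -DSOURCE_DIR='"$(abspath $(<D))"'. The fallback below is an assumption
   so that this sketch also compiles standalone. */
#ifndef SOURCE_DIR
#define SOURCE_DIR "."
#endif

/* Hypothetical helper: build "<SOURCE_DIR>/<file>" into `out`.
   Returns 0 on success, -1 if the result did not fit in `out_len` bytes. */
static int qfunction_source_path(const char *file, char *out, size_t out_len) {
  int n = snprintf(out, out_len, "%s/%s", SOURCE_DIR, file);
  return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

Because the directory comes from the makefile rather than from the compiler, the resulting path is the same whether or not the compiler's __FILE__ includes the directory.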

The only problem I have left is getting Nek to respect compiler macros defined in the flags that are passed in.