Closed jeremylt closed 5 years ago
Is this still open? If so, better tag someone involved in that component?
It is still open - I haven't figured this out yet. @v-dobrev , @thilinarmtb do you have any thoughts on what may be going wrong?
Cc: @camierjs
@jeremylt, could you produce an output of a failing test, setting export DBG=1 before your run? That should set the debug verbose mode, helping us perhaps with the CeedOklPath ones.
With the t20-qfunction C test, you have the abs path:
[CeedOklPath] Current OKL is /home/jeth8984/libceed/tests/t20-qfunction.okl
[CeedOklPath] Final OKL is /home/jeth8984/libceed/tests/t20-qfunction.okl
but with the fortran t20 and t30 tests and your compiler, you don't have any path (__FILE__ just gives you the filename):
[CeedOklPath] Current OKL is t20-qfunction-f.okl
[CeedOklPath] Could NOT stat this OKL file: t20-qfunction-f.okl
This leads the CeedOklPath function to try the other locations, the OCCA cache and the libceed path:
[CeedOklPath] Trying occa://ceed/t20-qfunction-f.okl
[CeedOklPath] Stating /home/jeth8984/.occa/libraries/ceed/t20-qfunction-f.okl
[CeedOklPath] Could NOT stat OCCA cache: /home/jeth8984/.occa/libraries/ceed...
[CeedOklPath] Trying fron libceed: /home/jeth8984/libceed/lib/okl/t20-qfunction-f.okl
CeedOklPath_Occa Cannot find OKL file!
You should try to OCCA cache this kernel as a work-around.
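The lookup order visible in the debug output above (the path as given, then the OCCA cache, then the libceed okl directory) can be sketched roughly as follows. This is only an illustration, not the actual CeedOklPath code: the function names `file_exists` and `find_okl` and the idea of passing the two directories as arguments are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if `path` names an existing regular file, 0 otherwise. */
static int file_exists(const char *path) {
  struct stat st;
  return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

/* Hypothetical helper (not the libCEED implementation).
 * Tries `okl` as given, then cache_dir/okl, then lib_dir/okl.
 * Writes the first path that exists into `out` and returns 0,
 * or empties `out` and returns -1 if none of the candidates exist. */
static int find_okl(const char *okl, const char *cache_dir,
                    const char *lib_dir, char *out, size_t out_len) {
  assert(okl && out && out_len > 0);
  const char *dirs[] = {NULL, cache_dir, lib_dir};
  for (size_t i = 0; i < 3; i++) {
    int n;
    if (i == 0) {
      n = snprintf(out, out_len, "%s", okl);      /* path as given */
    } else if (dirs[i]) {
      n = snprintf(out, out_len, "%s/%s", dirs[i], okl);
    } else {
      continue;                                   /* directory not set */
    }
    if (n < 0 || (size_t)n >= out_len) continue;  /* would not fit */
    if (file_exists(out)) return 0;
  }
  out[0] = '\0';
  return -1;
}
```

With a bare filename such as t20-qfunction-f.okl, the first candidate only succeeds if the test happens to run from the source directory, which is why the Fortran tests fall through to the two directory lookups.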
Looking at Intel's manual, their compiler doesn't add the full path to __FILE__.
Caching fixed the problem. Do we want to mark the issue as resolved or do we want to find a fix that does not involve caching? @jedbrown @camierjs
We want a solution that doesn't require manual caching. I don't know if that is a libCEED issue or an OCCA issue.
In OCCA there is an abstract base class fileOpener that is used for searching for .okl files. It is quite convenient; I used it in MFEM to define a url-handler for the mfem-occa:// prefix. See url_handler.hpp and url_handler.cpp. Basically, this url-handler can use any prefix and it will search a list of paths to find the specified file. The list of directories is defined by the library/application (e.g. libceed) and it will also add directories from an environment variable (the name of the variable is controlled by the library/application), if defined.
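To make the search-list idea concrete, here is a minimal sketch in plain C. It is not the OCCA fileOpener interface or the MFEM url-handler; the names `try_dir` and `resolve_in_search_path`, and the assumption that the environment variable holds a colon-separated list of directories, are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Builds "<dir>/<name>" in `out` and returns 1 if it can be opened
 * for reading, 0 otherwise. `dir` need not be NUL-terminated. */
static int try_dir(const char *dir, size_t dir_len, const char *name,
                   char *out, size_t out_len) {
  int n = snprintf(out, out_len, "%.*s/%s", (int)dir_len, dir, name);
  if (n < 0 || (size_t)n >= out_len) return 0;  /* would not fit */
  FILE *f = fopen(out, "r");
  if (!f) return 0;
  fclose(f);
  return 1;
}

/* Hypothetical resolver: searches the library-defined `dirs` first,
 * then each directory listed in the environment variable `env_var`
 * (assumed colon-separated). Returns 0 and fills `out` on success,
 * or empties `out` and returns -1 if the file is not found. */
static int resolve_in_search_path(const char *name,
                                  const char *const *dirs, size_t ndirs,
                                  const char *env_var,
                                  char *out, size_t out_len) {
  assert(name && out && out_len > 0);
  for (size_t i = 0; i < ndirs; i++)
    if (try_dir(dirs[i], strlen(dirs[i]), name, out, out_len)) return 0;

  const char *env = env_var ? getenv(env_var) : NULL;
  while (env && *env) {
    size_t len = strcspn(env, ":");             /* length of next entry */
    if (len > 0 && try_dir(env, len, name, out, out_len)) return 0;
    env += len;
    if (*env == ':') env++;                     /* skip the separator */
  }
  out[0] = '\0';
  return -1;
}
```

A library could register its own install directory in `dirs` and let users add the test source directory through the environment variable, which would avoid manual caching.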
Even with #134, I can confirm that this error still occurs, for t400-f, t500-f and t501-f.
After investigating, this issue comes from the ifort compiler not providing the absolute path where we expect one, rather than from libCEED. (See https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-using-fpp-preprocessor-directives: only the filename is provided, rather than the name plus path provided by GCC.) This affects any use of the macro __FILE__ in a Fortran file to provide the file path to a QFunction for JiT, for the CUDA or OCCA backends.
Is it worth fighting Intel's nonstandard implementation and getting ifort to behave correctly for our test suite, or should we close this?
How about if our makefiles define -DSOURCE_DIR=$(@D) and our macros can use it when __INTEL_COMPILER is defined?
Source and target differ for us, but I am making it work with -DSOURCE_DIR='"$(abspath $(<D))"'. For Fortran tests, the JiT source is actually in a separate file in the source dir, so I am going to use this for all of our Fortran files for any compiler.
The only problem I have left is getting Nek to respect compiler macros defined in the passed in flags.
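As a rough illustration of the SOURCE_DIR approach, the sketch below shows how a build-provided directory can replace the path part of __FILE__. It is written in C for brevity; the macro `QF_SOURCE_PATH`, the function `qf_source_path`, and the "." fallback are assumptions for this example, not the macros libCEED actually uses in its Fortran tests.

```c
#include <assert.h>
#include <stdio.h>

/* SOURCE_DIR is expected to come from the build, for example
 *   -DSOURCE_DIR='"$(abspath $(<D))"'
 * The fallback below exists only so this sketch compiles on its own. */
#ifndef SOURCE_DIR
#define SOURCE_DIR "."
#endif

/* Hypothetical macro: joins the build-provided directory and a file
 * name at compile time through string-literal concatenation. */
#define QF_SOURCE_PATH(file) SOURCE_DIR "/" file

/* Hypothetical runtime variant for file names that are not literals.
 * Returns 0 on success, -1 if `out` is too small. */
static int qf_source_path(const char *file, char *out, size_t out_len) {
  assert(file && out && out_len > 0);
  int n = snprintf(out, out_len, "%s/%s", SOURCE_DIR, file);
  return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

Because the directory comes from the makefile rather than from the compiler, the resulting path does not depend on whether __FILE__ expands to a full path or only to a filename.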
On CU's Summit, using Intel compilers, I get the following error when running the Fortran q-function and operator tests, for both ocl and omp.
+CEED-OCCA error @ /home/jeth8984/libceed/backends/occa/ceed-occa-okl.c:58 CeedOklPath_Occa
Strangely, there is no similar error on ex1 for ocl or omp.
Full text of the error is here: Error Output. Modules loaded are listed here: Modules List.
This appears to be related to the issue discussed in PR #68