JeffersonLab / qphix

QCD for Intel Xeon Phi and Xeon processors
http://jeffersonlab.github.io/qphix/

Full Clover does not work with QDP-JIT Build #116

Open bjoo opened 6 years ago

bjoo commented 6 years ago

The full clover packers from Twisted-Mass don't have pack functions qdp_pack_full_clover<>() in qdp_packer_qdpjit.h.

This prevents the RandomGauge utility from being built, and with it a slew of the tests. The workaround for now is to not build tests when building with QDP-JIT.

kostrzewa commented 6 years ago

Hmm, I guess that's unacceptable for you guys. Since you will almost certainly never require the "full" clover term, it would probably be best if the type did not exist if compilation of the twisted mass stuff is disabled. Not sure at this point how easy it is to achieve this.

For us, on the other hand, there's no urgent need to have QDP-packers for this data type, jit or regular, since the real tests of our operators are within tmLQCD.

Perhaps a cleaner workaround can be found, such that you can still build all the tests that you require. @martin-ueding do you remember if it would be relatively straightforward to do this?

bjoo commented 6 years ago

Well, for QDP-JIT builds we can disable the tests. For regular QDPXX, the random gauge still needed the full clover packer, tho in that case we did have a packer for it in qdp_packer_parscalar.h and so we could build by enabling the tm-clover and then the tests built. However, it would be nice if this could be cleaned up so that we could compile tests without having to worry about FullClover.

Best wishes, Balint


martin-ueding commented 6 years ago

I would think that this can be cleaned up such that one can build the tests without FullClover. I would guess that one has to add some more #ifdef into the code, especially RandomGauge. I can take a look at this in the next couple of days.
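A minimal sketch of what such an #ifdef guard could look like. The macro name EXAMPLE_BUILD_TM_CLOVER, the struct layouts and the class RandomGaugeSketch are all hypothetical stand-ins for illustration; they are not the actual QPhiX macro or types.

```cpp
#include <array>
#include <string>

// Hypothetical build switch; the real QPhiX macro name may differ.
// Pass -DEXAMPLE_BUILD_TM_CLOVER to compile the full-clover parts.

namespace example {

// Stand-in for the ordinary clover block that every build needs.
struct CloverBlock {
  std::array<double, 6> diag{};
};

#ifdef EXAMPLE_BUILD_TM_CLOVER
// Stand-in for the "full" clover block used only by the twisted-mass code.
// When the switch is off, the type does not exist at all.
struct FullCloverBlock {
  std::array<double, 36> block{};
};
#endif

// Stand-in for a RandomGauge-like test fixture.
class RandomGaugeSketch {
 public:
  CloverBlock clover;
#ifdef EXAMPLE_BUILD_TM_CLOVER
  FullCloverBlock full_clover;  // only present when the feature is enabled
#endif

  // Reports which members were compiled in.
  static std::string enabled_parts() {
#ifdef EXAMPLE_BUILD_TM_CLOVER
    return "clover+full_clover";
#else
    return "clover";
#endif
  }
};

}  // namespace example
```

With the switch left undefined, the fixture compiles without any reference to the full clover type, so no packer for it is needed in that configuration.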

bjoo commented 6 years ago

Hi Martin, That would be great. However, no hurry as we have workarounds. Many thanks and best wishes, Balint


martin-ueding commented 6 years ago

I think I am now at the point where I have this reproduced. On my Fedora laptop, there is no CUDA, so I needed to set up QMP, QDP-JIT and QPhiX on my institute workstation. It does not compile; the first error messages that I get are these:

In file included from /home/ueding/Lattice/Code/qphix/tests-gtest/../tests/clover_term.h:14:0,
                 from /home/ueding/Lattice/Code/qphix/tests-gtest/../tests/RandomGauge.h:3,
                 from /home/ueding/Lattice/Code/qphix/tests-gtest/random_gauge.cc:1:
/home/ueding/Lattice/Code/qphix/tests-gtest/../tests/./clover_term_llvm_w.h:579:36: error: 'JitFunction' does not name a type
 void function_make_clov_exec(const JitFunction &function,
                                    ^~~~~~~~~~~
/home/ueding/Lattice/Code/qphix/tests-gtest/../tests/./clover_term_llvm_w.h: In function 'void QPhiX::function_make_clov_exec(const int&, const RealT&, const U&, const U&, const U&, const U&, const U&, const U&, X&, Y&)':
/home/ueding/Lattice/Code/qphix/tests-gtest/../tests/./clover_term_llvm_w.h:602:25: error: request for member 'func' in 'function', which is of non-class type 'const int'
   jit_dispatch(function.func().at(0),
                         ^~~~
/home/ueding/Lattice/Code/qphix/tests-gtest/../tests/./clover_term_llvm_w.h:604:39: error: there are no arguments to 'getDataLayoutInnerSize' that depend on a template parameter, so a declaration of 'getDataLayoutInnerSize' must be available [-fpermissive]
                getDataLayoutInnerSize(),
                                       ^
/home/ueding/Lattice/Code/qphix/tests-gtest/../tests/./clover_term_llvm_w.h:604:39: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)

I will dig deeper into this later on.

bjoo commented 6 years ago

Hi Martin, You don’t need QUDA to compile QDP-JIT for CPU/KNL. Getting this working is a little tricky as it also involves building LLVM. The branch from Frank’s repository to use for CPUs is

git@github.com:fwinter/qdp-jit

branch: llvm-cpu-inner-loop-no11-qshift

It should compile with llvm-4.0.0. I have a modification in git@github.com:bjoo/qdp-jit (same branch name), which I have submitted a PR on and which can instead be built with either llvm-5.0.0 or a particular version of llvm6-trunk. Also, as usual, I can supply a build package with all of this stuff for straightforward building.

Best, B


martin-ueding commented 6 years ago

I now have CUDA with C on my workstation. I have used the master branch of fwinter/qdp-jit and installed that without extra configuration options.

But QPhiX does not compile with the following error:

In file included from /home/ueding/Lattice/Code/qphix/tests-gtest/../tests/clover_term.h:14:0,
                 from /home/ueding/Lattice/Code/qphix/tests-gtest/../tests/RandomGauge.h:3,
                 from /home/ueding/Lattice/Code/qphix/tests-gtest/random_gauge.cc:1:
/home/ueding/Lattice/Code/qphix/tests-gtest/../tests/./clover_term_llvm_w.h:579:36: error: 'JitFunction' does not name a type
 void function_make_clov_exec(const JitFunction &function,
                                    ^~~~~~~~~~~

The strange thing is that the only occurrence in the QPhiX codebase is within a class. But these free functions use it as a parameter nevertheless:

$ git grep -n JitFunction
tests/clover_term_llvm_w.h:579:void function_make_clov_exec(const JitFunction &function,
tests/clover_term_llvm_w.h:611:void function_make_clov_build(JitFunction &func,
tests/clover_term_llvm_w.h:776:  static JitFunction function;
tests/clover_term_llvm_w.h:836:void function_ldagdlinv_exec(const JitFunction &function,
tests/clover_term_llvm_w.h:861:void function_ldagdlinv_build(JitFunction &func,
tests/clover_term_llvm_w.h:1060:  static JitFunction function;
tests/clover_term_llvm_w.h:1120:void function_triacntr_exec(const JitFunction &function,
tests/clover_term_llvm_w.h:1148:void function_triacntr_build(JitFunction &func,
tests/clover_term_llvm_w.h:1490:  static JitFunction function;
tests/clover_term_llvm_w.h:1521:void function_apply_clov_exec(const JitFunction &function,
tests/clover_term_llvm_w.h:1549:void function_apply_clov_build(JitFunction &func,
tests/clover_term_llvm_w.h:1666:  static JitFunction function;

This identifier does not show up in QDP-JIT on the master branch.

What am I doing wrong?

kostrzewa commented 6 years ago

Can we not just disable any tests involving FullClover? We can't really test it from QDP anyway...

kostrzewa commented 6 years ago

i.e., we remove them from the compilation process entirely, at least for now...

martin-ueding commented 6 years ago

We can do this, but the RandomGauge class needs to be changed such that it does not require the FullCloverBlock.

Nevertheless, the above issue with JitFunction has nothing to do with the twisted mass code. It comes from the included file, which uses an identifier that I cannot find anywhere else. There is no warning about a failed #include, so I presume the identifier should be somewhere in the code, but it isn't.

Or it is something from LLVM, but then there is an #include missing.
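One way to tell a missing header apart from a missing declaration is to probe for the header at compile time. The sketch below uses the C++17 `__has_include` preprocessor check; the header name example_jit_function.h and the macro EXAMPLE_HAVE_JIT_FUNCTION are hypothetical and do not refer to real QDP-JIT or LLVM files.

```cpp
#include <string>

// Probe for a (hypothetical) header that would declare the missing type.
// __has_include is standard since C++17 and supported by GCC and Clang.
#if defined(__has_include)
#  if __has_include("example_jit_function.h")
#    include "example_jit_function.h"
#    define EXAMPLE_HAVE_JIT_FUNCTION 1
#  endif
#endif

namespace example {

// Reports whether the probed header was found on the include path.
inline std::string jit_function_status() {
#ifdef EXAMPLE_HAVE_JIT_FUNCTION
  return "header found";
#else
  return "header not found";
#endif
}

}  // namespace example
```

If the probe reports that the header is found but the type still does not name anything, the declaration itself is absent from the installed version, which would point at a branch or version mismatch rather than a missing #include.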