KratosMultiphysics / Kratos

Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel, multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos is BSD-licensed and written in C++ with an extensive Python interface.
https://kratosmultiphysics.github.io/Kratos/

Integrating Eigen in the Core #6126

Closed: philbucher closed this issue 4 years ago

philbucher commented 4 years ago

This is an issue about integrating the Eigen library in the Core. Main points that I can think of right now:

@KratosMultiphysics/technical-committee @armingeiser @oberbichler

armingeiser commented 4 years ago

+1. But note that Intel MKL Pardiso is not included in Eigen itself; still, the direct solvers from Eigen are not bad.

philbucher commented 4 years ago

Replying to the comment of @RiccardoRossi from #6111

@RiccardoRossi I don't fully understand why we make ourselves dependent on numpy. I mean, if I want to use the numpy features then of course I need to have it installed, but looking at "pybind11/numpy.h" I don't see any includes from numpy itself, hence the compilation should remain independent. Also, the features we would use it for, such as those discussed in #6111, are anyway not something to be used by a GUI. Maybe I am missing something...? Maybe we can discuss it sometime.

@oberbichler do you know more?

RiccardoRossi commented 4 years ago

the problem is that if you do

import numpy

in any Python script, then "runkratos" will not work. And if you don't do that, it is for nothing...

that is, in order to be able to import numpy anywhere, we first need to solve the issue of packaging it into runkratos (or find another way to package Kratos to go together with the GiD releases)

armingeiser commented 4 years ago

Including Eigen in the core is first of all decoupled from numpy. The solvers alone could be reason enough. The support for the Eigen/numpy conversion utilities could be switched off using a compiler flag.

https://github.com/KratosMultiphysics/Kratos/blob/master/kratos/python_scripts/scipy_conversion_tools.py

There is already a Python script in the core with import scipy, which from my understanding gives exactly the same packaging problems (if not more). That a utility is there, on the Python side, does not mean it has to be used at runtime.
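The "it's there but not used at runtime" point can be made concrete. A minimal sketch (the function name below is hypothetical; the real core script is scipy_conversion_tools.py): the heavy import lives inside the utility, so merely shipping or importing the module does not require scipy to be packaged, only actually calling the conversion does.

```python
def convert_to_scipy_csr(rows, cols, values, shape):
    """Convert COO-style triplets to a scipy CSR matrix, on demand.

    scipy is imported lazily: packaging tools that scan module-level
    imports will not pick it up, and Kratos runs fine without it as
    long as this utility is never called.
    """
    try:
        import scipy.sparse  # deferred: only needed when the utility is used
    except ImportError as err:
        raise ImportError(
            "This utility needs scipy; the rest of Kratos does not.") from err
    return scipy.sparse.csr_matrix((values, (rows, cols)), shape=shape)
```

This is the usual pattern for keeping an optional dependency optional: the cost is one try/except per utility, the benefit is that "runkratos" on a clean machine is unaffected.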

philbucher commented 4 years ago

> the problem is that if you do
>
> import numpy
>
> in any python script then the "runkratos" will not work. And if you don't do that it is for nothing...
>
> that is, in order to be able to import the numpy anywhere we need first to solve the issue of packaging it into the runkratos (or find another way to package kratos to go together with the GID releases)

Ok, now I understand what you mean. Of course we would limit the usage of numpy until the packaging issue is solved. But even before that, I think for new things (like exposing the system matrices) it would be very beneficial, and that will not be used by the GUI.

philbucher commented 4 years ago

It's not like we would start to rewrite things to be purely based on numpy... So far we have also managed without it.

pooyan-dadvand commented 4 years ago

I used to work with Eigen while implementing AMatrix, and it is a very complete and also fast library. Also, as AMatrix uses a very similar interface, most of the effort we did to incorporate AMatrix can benefit us here.

Meanwhile, I have some major concerns about using Eigen in the core:

  1. We are trying to remove uBLAS because of its complexity, dependencies and compilation time, only to adopt another big library with its own compilation dependencies/restrictions. I know it is faster, but it is not less complex.
  2. The codebase does not use C++11 features like move semantics. This makes it a very complex library to understand and to eventually debug (e.g. a memory leak or uninitialized values).
  3. It has a non-strict alignment problem with variables: basically, you cannot pass an Eigen vector or matrix by value to a function. It is true that we usually pass vectors by reference for efficiency reasons, but given our wide range of inexperienced developers I don't think we can count on all of them passing every vector by reference. Please note that passing by value results in a mysterious stack crash.
  4. Debugging an operation is very difficult due to the levels of template metaprogramming.

RiccardoRossi commented 4 years ago

ouch, this

https://eigen.tuxfamily.org/dox/group__TopicPassingByValue.html

is really terrible :-(

oberbichler commented 4 years ago

My comments, as someone who has been using Eigen for many years in daily work:

Ad 1: A complete linear algebra library will always be complex. uBLAS is completely outdated (= dead). It is overdue to switch to a modern library.

Ad 2: I've never had to debug Eigen.

Ad 3: AFAIK there is no other way to achieve this with SSE/C++11 with any library. If you don't want to use SSE then just switch it off. I've never had such problems in practice but since I am using C++17 I don't need to bother about such things.

Ad. 4: See 2

Eigen has a very large community. It is used in many big and commercial programs. That is the reason why it is well tested, has a complete feature set and is supported by many other libraries (e.g. pybind11). But of course there is always room for improvement. It is open-source and you can always bring in your own suggestions.

Of course you can also develop your own linear algebra library. But then you have to spend a lot of time and resources to develop all these features yourself. When I solve problems I do not want to deal with these basics.

RiccardoRossi commented 4 years ago

reposting this here (originally posted in the wrong issue)

I'll start by saying that I was originally very much in favor (aside from licensing doubts) of switching to Eigen; I also agree that uBLAS is a dead end. Also, I have been reading about both Eigen and Blaze.

Long story short, I am currently (much) less sure that it is a good idea than I was some time ago.

Here are a few details that I see as problematic (for a deep integration, more on this later):

1 - Eigen is essentially a C++03 lib. Examples of this are the lack of initializer lists (according to the docs), the problematic use of "auto" and, as I understand it, the lack of modern alignment mechanisms (C++11-17). I wonder if we would be jumping onto an already old lib.

2 - I see it as a blocker that Eigen disallows pass by value. Independently of whether it is a good idea or not (personally, I do not see anything wrong with doing it, if what you want is a copy), it is a feature currently used throughout the code. Without compiler support I do not see any way of fixing this systematically.

3 - I am not clear about OpenMP in Eigen. We would need OpenMP for sparse matrices but not for dense ones (which for us are pretty tiny). As I understand it, one can either activate it for all of Eigen or completely disable it.

4 - We would need a compatibility layer the same way we did for AMatrix. I am not sure of the implications/feasibility of that in the case of Eigen.

5 - I think that for maximal performance (and strictly optionally) Eigen internally uses BLAS for some kernels. I hate the idea of having the core depend on BLAS/LAPACK.

Having said this, and without entering into the pybind/numpy discussion for now, I think we have a few open possibilities with different levels of intrusiveness:

1 - (minimally intrusive) Leave it as now, but package Eigen by default (it would need to remain optional, and everything should work without it, as now).

2 - Use Eigen only for sparse linear algebra. This would imply changing the builder and solver, etc., but not all the rest of Kratos. On the bright side, Eigen sparse would always be available. Question: is there any alternative aside from Eigen? If 2 is the decision, in my opinion we should explicitly disallow using Eigen for dense matrices, to avoid it being mixed with the library we use for tiny matrices. We would then need our own library (or another lib) for tiny matrices.

3 - Use Eigen for everything (or another lib, say Blaze). If that is the decision, then we should also design a migration strategy, ideally with compiler support (runtime segfaults for correct code are not admissible...). Also, I would like my comments above to be answered...

Finally, about numpy/scipy: if anyone has an idea of how to write a "runkratos" that supports them on a clean computer, proposals are very welcome.

for now my only idea is to have a "runkratos.py" which looks something like

```python
from numpy import *
from scipy import *

exec( ... mainkratos.py... )
```

and to hope that Nuitka (or cx_Freeze) will correctly package the numpy and scipy dependencies, including BLAS and LAPACK (...shiver...)
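The wrapper idea above can be sketched a little more concretely with `runpy` from the standard library. This is only a hedged illustration of the same trick: the eager imports exist so that a freezer sees (and hopefully bundles) the heavy packages; whether Nuitka/cx_Freeze actually collect BLAS/LAPACK correctly is exactly the open question. "mainkratos.py" is a placeholder name from the comment above, not a real file.

```python
import runpy
import sys

def run(main_script):
    """Pre-import heavy packages, then hand control to the real script."""
    for package in ("numpy", "scipy"):
        try:
            # Eager import: a freezer scanning this file should pick up
            # the dependency and bundle it into the distribution.
            __import__(package)
        except ImportError:
            pass  # keep working on machines without the package
    # Execute the target script as if it were the main program;
    # returns the script's resulting global namespace.
    return runpy.run_path(main_script, run_name="__main__")

if __name__ == "__main__" and len(sys.argv) > 1:
    run(sys.argv[1])
```

Compared with `exec`, `runpy.run_path` gives the target script a proper `__main__` module namespace, which matters for scripts that use the usual `if __name__ == "__main__":` guard.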

RiccardoRossi commented 4 years ago

I also think we should consider this:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1385r4.pdf

with the related implementation here:

https://github.com/hatcat/linear_algebra?files=1

oberbichler commented 4 years ago

The question is whether you want to/can do everything yourself or take advantage of the open source community.

RiccardoRossi commented 4 years ago

Not exactly. The point is making a wise choice (one which doesn't badly break our existing code).

Do you have any solutions to my remarks about using Eigen?

oberbichler commented 4 years ago

ad 1: Alignment works fine. Every library becomes older and will die at some point. At the moment Eigen seems to be the best available library.

ad 2: I never had problems when I wanted to pass a vector by value.

ad 3: AFAIK it will never use OMP for small matrices. (Btw: is there OMP support in AMatrix?)

ad 4: No idea. I always keep it simple and use Eigen directly. In my opinion additional layers are not a good idea.

ad 5: There is no BLAS/LAPACK dependency.

Eigen is a great library. I'm sure there are alternatives. Someday something new will come along. I will not be the developer of the successor. I have enough other things to do. Just my personal opinion...

RiccardoRossi commented 4 years ago

Regarding 1, we need to make a choice for the next years... The std proposal looks at least interesting to me (the proposers are on the standard committee). Btw, we could even ask someone from Eigen or Blaze, or one of the people behind the std proposal, to participate in this conversation.

Regarding 2, the docs say that pass by value segfaults for stack-allocated matrices. I will check it on my own.

Regarding 3, googling a bit I found the following: "In the case your application is parallelized with OpenMP, you might want to disable Eigen's own parallization as detailed in the previous section". The problem indeed is that you DO NOT want OpenMP in our small dense matrices, but you do want it for large matrices (nested OpenMP is always painful).

Regarding 4, we need a wrapper to emulate the uBLAS interface; we simply have too much code to avoid it. Anyway, it is also a way of future-proofing. That is of course not an issue for toy projects (you can simply replace the interface by hand if the project is small or maintained by a single person), but it is for a library as big as Kratos.
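The compatibility-layer idea (keep the legacy call sites untouched while the backing library changes underneath) is language-agnostic; in Kratos it would be a C++ wrapper over Eigen, but the shape of it can be sketched in a few lines of Python. Every name below is hypothetical and purely illustrative.

```python
import math

class BackendVector:
    """Stands in for the new library's vector type (e.g. an Eigen vector)."""
    def __init__(self, data):
        self.data = list(data)

class CompatVector:
    """Adapter exposing the legacy (uBLAS-style) interface over the backend.

    Call sites keep indexing and taking sizes exactly as before; only
    this thin adapter knows about the new storage type.
    """
    def __init__(self, data):
        self._impl = BackendVector(data)

    def __getitem__(self, i):
        return self._impl.data[i]

    def __len__(self):
        return len(self._impl.data)

def norm_2(v):
    """Legacy-style free function, unchanged at every call site."""
    return math.sqrt(sum(v[i] ** 2 for i in range(len(v))))
```

The point of the sketch is the migration property Riccardo asks for: existing code that only uses the legacy interface (`norm_2`, indexing, sizes) never sees the backend, so the backend can be swapped, or instrumented during a migration, in one place.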

loumalouomega commented 4 years ago

I just want to mention that xtensor is numpy-inspired... https://github.com/xtensor-stack/xtensor

loumalouomega commented 4 years ago

> i also think we should consider this
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1385r4.pdf
>
> with the related implementation here
>
> https://github.com/hatcat/linear_algebra?files=1

Apart from xtensor, https://github.com/sgorsten/linalg is also quite complete and tested (C++11, header-only, etc.)

oberbichler commented 4 years ago

1 - The proposal will take some time to get into the standard, and Kratos still uses C++11…

2 - Of course you have restrictions when using other libraries. As long as you can do everything you need, I don't see any problem…

3 - Yes… And I think it is only used for large matrices. Nested parallelism is a problem of OMP, not of Eigen itself… Anyway: I have used Eigen together with OMP and TBB, always with very good performance.

4 …

xtensor is cool, but as the name says it is a tensor library (like numpy's ndarray): an overhead if you just use matrices and vectors.

I never used linalg.

Just my experience. I am off from this discussion now (for the next weeks…).

pooyan-dadvand commented 4 years ago

Just to clarify: we are not saying that Eigen is not a good library. In fact, it is one of the best and one of the most complete ones. The question is whether its features fit our requirements or are simply too big/complex for our case, and whether its cons are important to us or not. Our criteria are the same as when we were evaluating whether using Open Cascade would be overkill or not.

pooyan-dadvand commented 4 years ago

@KratosMultiphysics/technical-committee discussed this and makes the following remarks:

  1. There are three different layers of use of linear algebra: dense matrices and vectors, sparse matrices, and linear solvers.
  2. Using Eigen as a linear solver is very useful. This can make the EigenSolversApplication a core application, like the ExternalSolversApplication.
  3. For using Eigen as a sparse matrix library, we may use the same mechanism as the ExternalSolversApplication and maintain it in the application, to have more flexibility in compiling the core.
  4. For using Eigen as dense algebra, we should check the pros/cons of other options:
     a. AMatrix: the ad hoc library made for our use, but very limited
     b. Blaze: a new library with promising performance
     c. the C++ standard proposal: to be future-proof?

So at this stage the decision is:

  1. Add the Eigen source code to the external solver folder of the EigenSolversApplication.
  2. Compile the EigenSolversApplication by default (as a separate application, as it is).
  3. Use the solvers provided by Eigen instead of the ExternalSolversApplication.
  4. Remove the dependency on BLAS and LAPACK for our standard compilation and leave the ExternalSolversApplication as optional. Note that the long-term idea would be to completely deprecate and eventually remove that application, which would imply losing the FEAST solver in favor of the eigenvalue solver from the Eigen library. SuperLU and PaStiX would also be available through the Eigen interface.

loumalouomega commented 4 years ago

> 4. For using Eigen as dense algebra, we should check the pro/cons of other options:
>    a. Amatrix: The adhoc library for our use but very limited
>    b. Blaze: A new library with promising performance
>    c. CPP standard proposal: To be future proof?

Here there is a comparison of Blaze vs Eigen (at the end, also xtensor, etc.): https://bitbucket.org/blaze-lib/blaze/issues/266/blaze-vs-eigen

loumalouomega commented 4 years ago

> 4. Removing the dependency of blas and lapack for our standard compilation and leave the `ExternalSolversApplication` as optional. Note that the long term idea would be to completely deprecate and eventually remove such application, something that would imply losing FEAST solver in favor of the EigenValue solver from Eigen library. SuperLU and Pastix would be also available through Eigen interface.

I would consider the opportunity to update both solvers; in both cases we are using the legacy versions of PaStiX and SuperLU.

pooyan-dadvand commented 4 years ago

> 4. For using Eigen as dense algebra, we should check the pro/cons of other options:
>    a. Amatrix: The adhoc library for our use but very limited
>    b. Blaze: A new library with promising performance
>    c. CPP standard proposal: To be future proof?
>
> Here there is a comparison Blaze vs Eigen (at the end, also xtensor, etc...): https://bitbucket.org/blaze-lib/blaze/issues/266/blaze-vs-eigen

Interesting! However, this comparison is for medium-size matrices (size ~80). We should do the same with small ones, with sizes 3, 4, 6, 8, 12, 16, which are the ones we use most in our code.
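The kind of measurement Pooyan asks for can be sketched with a tiny timing harness. This is only an illustration of the methodology: a plain-Python matrix multiply stands in for the C++ candidates (Eigen, Blaze, AMatrix), which would of course be timed the same way with a C++ micro-benchmark; all names and the repeat count here are arbitrary.

```python
import timeit

def matmul(a, b):
    """Naive square matrix multiply; placeholder for the library under test."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bench(n, repeats=200):
    """Time `repeats` multiplies of an n-by-n matrix, return total seconds."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    return timeit.timeit(lambda: matmul(a, a), number=repeats)

# The small sizes Kratos actually uses in element-level computations.
for n in (3, 4, 6, 8, 12, 16):
    print(f"n={n:2d}: {bench(n):.4f} s")
```

For sizes this small, fixed per-call overhead (allocation, expression-template setup) usually dominates the flop count, which is exactly why a medium-size (~80) comparison does not transfer to the 3-16 range.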

RiccardoRossi commented 4 years ago

done, so closing