dealii / dealii

The development repository for the deal.II finite element library
https://www.dealii.org

Assessment of the difficulty in porting CPU architecture for dealii #16289

Open wangyuliu opened 11 months ago

wangyuliu commented 11 months ago

Hello everyone! I am working on a tool to assess the complexity of porting software to a new CPU architecture. It primarily focuses on porting to RISC-V, although the tool may also give an average estimate of the porting effort for other architectures. My focus is on the overall workload and difficulty of porting, both past and future, even if a project has already been ported. As part of my dataset, I have collected the dealii project, and I would like to gather community opinions to support my assessment. Based on scanning tools, the porting complexity is determined to be high, with a significant amount of code related to the CPU architecture in the project. Is this assessment accurate? Do you have any opinions on the personnel and time required? I look forward to your help and response.

peterrum commented 11 months ago

@wangyuliu Most of deal.II uses standard C++17 or C++20 features. As long as there are RISC-V compilers that support C++17/20, most of the code should compile without issues.

However, there are some optimized code paths that use intrinsics (SSE2, AVX, AVX-512 ISA): see https://github.com/dealii/dealii/blob/master/include/deal.II/base/vectorization.h . Users don't work with these directly, though, but rather with wrapper classes (similar to std::simd). This way we also support ARM NEON (https://github.com/dealii/dealii/blob/f1c6d6ed29af6f2fd62a613c40188e4b348ee31b/include/deal.II/base/vectorization.h#L6221) and AltiVec (https://github.com/dealii/dealii/blob/f1c6d6ed29af6f2fd62a613c40188e4b348ee31b/include/deal.II/base/vectorization.h#L5043). SIMD operations of RISC-V, if available, could be added here. Feel free to add such a class! I don't think it involves too much work; see https://github.com/dealii/dealii/pull/15923.
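To illustrate the wrapper-class idea, here is a minimal sketch. The class name `PackedDoubles` and the helper `axpy` are hypothetical and are not deal.II's actual API; the sketch only shows how user code can be written once against a fixed-width wrapper whose generic implementation is a plain loop. A platform-specific specialization (for example, one for RISC-V vector instructions) would presumably replace the loops with intrinsics while keeping the same interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical wrapper around N doubles. The generic version below uses
// plain loops; an architecture-specific specialization could swap in
// intrinsics without changing the interface seen by user code.
template <std::size_t N>
class PackedDoubles
{
public:
  PackedDoubles() : data{} {}

  // Broadcast one value to all lanes.
  explicit PackedDoubles(const double value) { data.fill(value); }

  double &operator[](const std::size_t lane)
  {
    assert(lane < N);
    return data[lane];
  }

  const double &operator[](const std::size_t lane) const
  {
    assert(lane < N);
    return data[lane];
  }

  PackedDoubles &operator+=(const PackedDoubles &other)
  {
    for (std::size_t i = 0; i < N; ++i)
      data[i] += other.data[i];
    return *this;
  }

  PackedDoubles &operator*=(const PackedDoubles &other)
  {
    for (std::size_t i = 0; i < N; ++i)
      data[i] *= other.data[i];
    return *this;
  }

private:
  std::array<double, N> data;
};

template <std::size_t N>
PackedDoubles<N> operator+(PackedDoubles<N> a, const PackedDoubles<N> &b)
{
  a += b;
  return a;
}

template <std::size_t N>
PackedDoubles<N> operator*(PackedDoubles<N> a, const PackedDoubles<N> &b)
{
  a *= b;
  return a;
}

// User code: written once against the wrapper, independent of the ISA.
template <std::size_t N>
PackedDoubles<N> axpy(const double a,
                      const PackedDoubles<N> &x,
                      const PackedDoubles<N> &y)
{
  return PackedDoubles<N>(a) * x + y;
}
```

With this design, calling `axpy(2.0, x, y)` on two `PackedDoubles<4>` objects computes `2 * x + y` lane by lane, and the calling code would not need to change if a vectorized specialization were added later.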

bangerth commented 11 months ago

@wangyuliu You raise an interesting question. Can you explain how your tool came up with the assessment that porting deal.II has high complexity? Like @peterrum already mentioned, deal.II has ways to support vector instructions, but the default code paths should compile without any changes at all on new platforms -- we think that deal.II should just compile with a C++17 compliant compiler on any platform you choose.

wangyuliu commented 11 months ago

Thank you very much for your answer above! Our tool found that deal.II contains a large amount of assembly code, intrinsic functions, and many conditional compilation blocks with macros, such as a code segment containing #if defined(KOKKOS_ENABLE_ASM) && defined(KOKKOS_ENABLE_ISA_X86_64) && \ !defined(_WIN32) && !defined(__CUDA_ARCH__). These code structures are likely to need modification when targeting a different CPU architecture, and we believe they are useful signals for automating the evaluation of porting difficulty. Based on the large amount of architecture-related code mentioned above, we believe that porting to a new CPU architecture is quite challenging. Is this correct?

bangerth commented 11 months ago

@wangyuliu I see, but it's not correct :-) Your code likely also looked into the bundled/ directory, where we have copies of the BOOST and Kokkos libraries that indeed have a lot of these #ifdef things. You may want to exclude the bundled/ directory when running your tool.

wangyuliu commented 11 months ago

Thank you very much for your suggestion. After excluding the other libraries, our tool did collect less architecture-related code, but there are still many intrinsic functions, so the tool predicts medium porting complexity. Do you think this is correct? I would like to have a more qualitative judgment. Or do you think it can only be assessed as simple? For comparison, here are some reference points: libtiff, libcurl, arrayfile, libtool, dhcp, etc. all have low porting complexity. Looking forward to your further answers.

bangerth commented 11 months ago

@wangyuliu I think it all comes down to what you mean by "porting". I am convinced that you can compile deal.II with essentially no modifications at all on new platforms. In fact, we do this kind of thing all the time (for example when Apple came out with ARM chips, or when one of us got access to one of the new ARM-based supercomputers in East Asia). This works essentially out of the box.

The reason you find these intrinsics in the code base of deal.II is that we optimize some things on common platforms. The way this works is by essentially having code such as

#if ...Intel platform with AVX512 ...
  use AVX512 intrinsics
#elif ...Intel platform with AVX ...
  use AVX intrinsics
#elif ...ARM platform with vector intrinsics...
  use ARM-style intrinsics
#else
  use a loop over the elements of an array, written in plain C++ and without intrinsics
#endif

This kind of scheme implies that you do not have to do very much if you want to port to a new platform: the plain C++ fallback applies, and you simply let the compiler do the work. But you can do a lot of work for a new platform if you think that these code paths are important to the performance of your code.
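As a concrete illustration of this scheme, here is a minimal, self-contained sketch. It is not code from deal.II; the function name `add_scaled` is made up for this example, and the macros tested (`__AVX512F__`, `__AVX__`, `__ARM_NEON`, `__aarch64__`) are the ones GCC and Clang typically predefine. On a platform that matches none of the branches, such as RISC-V, only the plain loop at the end is compiled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__AVX512F__) || defined(__AVX__)
#  include <immintrin.h>
#elif defined(__ARM_NEON) && defined(__aarch64__)
#  include <arm_neon.h>
#endif

// Hypothetical example function: computes y[i] += a * x[i] for all i.
void add_scaled(std::vector<double> &      y,
                const double               a,
                const std::vector<double> &x)
{
  assert(x.size() == y.size());
  const std::size_t n = y.size();
  std::size_t       i = 0;

#if defined(__AVX512F__)
  // Intel platform with AVX-512: process 8 doubles per iteration.
  const __m512d va = _mm512_set1_pd(a);
  for (; i + 8 <= n; i += 8)
    {
      const __m512d vx = _mm512_loadu_pd(&x[i]);
      const __m512d vy = _mm512_loadu_pd(&y[i]);
      _mm512_storeu_pd(&y[i], _mm512_add_pd(vy, _mm512_mul_pd(va, vx)));
    }
#elif defined(__AVX__)
  // Intel platform with AVX: process 4 doubles per iteration.
  const __m256d va = _mm256_set1_pd(a);
  for (; i + 4 <= n; i += 4)
    {
      const __m256d vx = _mm256_loadu_pd(&x[i]);
      const __m256d vy = _mm256_loadu_pd(&y[i]);
      _mm256_storeu_pd(&y[i], _mm256_add_pd(vy, _mm256_mul_pd(va, vx)));
    }
#elif defined(__ARM_NEON) && defined(__aarch64__)
  // 64-bit ARM platform with NEON: process 2 doubles per iteration.
  const float64x2_t va = vdupq_n_f64(a);
  for (; i + 2 <= n; i += 2)
    {
      const float64x2_t vx = vld1q_f64(&x[i]);
      const float64x2_t vy = vld1q_f64(&y[i]);
      vst1q_f64(&y[i], vaddq_f64(vy, vmulq_f64(va, vx)));
    }
#endif

  // Plain C++ without intrinsics: handles the remaining elements on the
  // platforms above, and all elements on any other platform.
  for (; i < n; ++i)
    y[i] += a * x[i];
}
```

The design point is that the final loop is always valid C++, so a new architecture gets a working build for free; adding another `#elif` branch is an optional performance optimization, not a prerequisite for porting.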

In other words, you could label deal.II as having medium or high porting complexity, but that implies that you want to do substantially more than is strictly necessary. If you just want to get things to run on a new platform, I believe that deal.II has quite low porting complexity.

Does this make sense?