ValeevGroup / tiledarray

A massively-parallel, block-sparse tensor framework written in C++
GNU General Public License v3.0

How to build project with your library? #151

Closed weilewei closed 5 years ago

weilewei commented 5 years ago

Hi,

I am trying to use your library in my own project, but I am running into errors. As a test, I first built the library in Release mode successfully (I think), then created a separate folder containing your test file demo.cpp. In this folder, I first run mpicxx demo.cpp -I /home/wwei/src/install/tiledarray_Release/include -L /home/wwei/src/install/tiledarray_Release/lib -ltiledarray -lMADworld -lopenblas -o demo to generate the executable, then run LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/home/wwei/src/install/tiledarray_Release/lib ./demo. The demo prints some of its results but then fails with Illegal instruction (core dumped) right after the results for a7 are shown.

However, inside your library environment, I can run the demo executable without error, which is a good sign for my Fedora-based cluster machine. Do you have any suggestions on how to build a project against your library? Any suggestions will help, especially on how to specify dependencies or which command lines I will need.

justusc commented 5 years ago

I am not sure where the illegal instruction is coming from. Typically this happens if your application or one of the libraries is compiled with CPU architecture flags that do not match the instructions supported by your machine. In the case of MPI (assuming you are running in a distributed memory environment), this could be the case if you are building on a machine that supports a different set of instructions than the compute nodes where it is run.

One possibility is openblas. I have had issues with that package in the past. If you are running on an Intel CPU, I recommend using MKL. If you are running in a data center, use the BLAS library that is provided by the system admins.

You need to be sure that you are compiling your application with the same compile flags that were used to compile the test binaries and libraries. You can inspect the compile flags when building TA with make VERBOSE=1. TA also includes a CMake config file, tiledarray-config.cmake if I remember correctly, that you can use in your own CMake project to set the compile and linker flags correctly. See the find_package documentation at https://cmake.org/cmake/help/v3.0/command/find_package.html.

weilewei commented 5 years ago

Hi, I can now build and run an independent project outside of your library. I can run the demo in another directory with some flags.

I have a question: how can I check whether the tiled array data structure is distributed across different machines/nodes when using mpirun? In other words, how can I verify that the partitioned sub-arrays are stored on different nodes and that the related operations are performed there? I attempted to use mpirun -n 2 ./demo, but it crashed. Correct me if I am misunderstanding your library.

Thanks for your support again.

justusc commented 5 years ago

The exact distribution of data will depend on the tiling specified in the TA::TiledRange provided to the array constructor and the number of processes. The data of an array is distributed by tiles, meaning tiles cannot be subdivided among nodes. The optimal size of the tiles depends on several factors. You typically get better CPU utilization with large tiles, but poor parallelization. You get better parallelization with small tiles, but tiles that are too small will lead to poor performance.
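To make the idea of a tiling concrete, here is a minimal sketch that computes tile boundaries along one dimension. The helper uniform_boundaries is hypothetical and not part of TiledArray. It assumes a tiling along one dimension can be described by a list of boundary indices, and it uses only the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a TiledArray function.
// Splits the index range [0, extent) into tiles of at most tile_size
// elements and returns the boundaries, e.g. {0, 4, 8, 10}.
// Precondition: tile_size > 0.
inline std::vector<std::size_t> uniform_boundaries(std::size_t extent,
                                                   std::size_t tile_size) {
  std::vector<std::size_t> bounds{0};
  for (std::size_t lo = 0; lo < extent; lo += tile_size) {
    // The last tile is truncated so it ends exactly at extent.
    bounds.push_back(std::min(lo + tile_size, extent));
  }
  return bounds;
}
```

Each pair of adjacent boundaries describes one tile, and a tile is the smallest unit that can be placed on a process.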

A good starting point is for each array to have a minimum of 4 tiles per core and ~10,000 elements per tile. This should yield acceptable performance. You will need to tune the tile sizes and number of tiles per array to achieve optimal performance for your application. The optimal tile size and number of tiles can also vary between computer systems.
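The two rules of thumb above can be turned into a rough starting estimate. The sketch below is only an illustration: suggest_tiling and TileSuggestion are hypothetical names, not TiledArray functions, and letting the tiles-per-core rule win when the two rules conflict is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical result type, not part of TiledArray.
struct TileSuggestion {
  std::size_t num_tiles;          // suggested total number of tiles
  std::size_t elements_per_tile;  // resulting average tile size
};

// Hypothetical helper: suggests a tile count from the two rules of thumb
// (about 10,000 elements per tile, at least 4 tiles per core).
inline TileSuggestion suggest_tiling(std::size_t total_elements,
                                     std::size_t num_cores,
                                     std::size_t target_tile_elements = 10000,
                                     std::size_t min_tiles_per_core = 4) {
  // Tiles needed to keep each tile near the target size (rounded up).
  const std::size_t by_size =
      (total_elements + target_tile_elements - 1) / target_tile_elements;
  // Tiles needed so that every core gets at least min_tiles_per_core tiles.
  const std::size_t by_cores = num_cores * min_tiles_per_core;
  // Take the larger count, but never more tiles than elements and never zero.
  std::size_t n = std::max(by_size, by_cores);
  n = std::min(n, total_elements);
  n = std::max<std::size_t>(n, 1);
  return {n, total_elements / n};
}
```

For example, 1,000,000 elements on 16 cores gives 100 tiles of 10,000 elements, while 100,000 elements on 16 cores gives 64 smaller tiles so that each core still has 4. Treat the result as a first guess to tune from, not as an optimum.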

You can determine which tiles are stored on a local node using the process map TA::DistArray::pmap() (see https://github.com/ValeevGroup/tiledarray/blob/master/src/TiledArray/dist_array.h#L647 and https://github.com/ValeevGroup/tiledarray/blob/master/src/TiledArray/pmap/pmap.h). The process map determines which tiles belong to which node. You can look at the TA::make_array() function to get an idea of how data is distributed: https://github.com/ValeevGroup/tiledarray/blob/master/src/TiledArray/conversions/make_array.h#L71. This function iterates over all local tiles in order to initialize them.
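As an illustration of what a process map does, the sketch below assigns tiles to processes in round-robin order. This is an assumed mapping chosen for simplicity, not necessarily the one TiledArray uses by default, and owner_of and local_tiles are hypothetical names rather than TiledArray functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical round-robin process map: tile ordinal i lives on rank
// i % nprocs. TiledArray's actual process maps may assign tiles differently.
// Precondition: nprocs > 0.
inline std::size_t owner_of(std::size_t tile_ordinal, std::size_t nprocs) {
  return tile_ordinal % nprocs;
}

// Lists the tile ordinals that the given rank would own under that map.
// Preconditions: nprocs > 0 and rank < nprocs.
inline std::vector<std::size_t> local_tiles(std::size_t num_tiles,
                                            std::size_t rank,
                                            std::size_t nprocs) {
  std::vector<std::size_t> tiles;
  for (std::size_t i = rank; i < num_tiles; i += nprocs) {
    tiles.push_back(i);
  }
  return tiles;
}
```

With 5 tiles and 2 processes, rank 0 would own tiles 0, 2, and 4 and rank 1 would own tiles 1 and 3. Printing the equivalent list on each rank of a real run is one way to confirm that the tiles are spread across processes.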

To perform operations on an array, the easiest way is to use the expression operators (+, -, *, expression functions, etc.). You can see good examples of this in https://github.com/ValeevGroup/tiledarray/blob/master/examples/fock/ta_k_build.cpp. TA also provides the TA::foreach() and TA::foreach_inplace() functions, which you can use to create a new array from one or two input arrays or to modify an array in place. I highly recommend using these functions and operators whenever possible, as they handle all of the parallelization for you. There is also a set of helper functions that handle all the boilerplate you need to initialize an array: https://github.com/ValeevGroup/tiledarray/blob/master/src/TiledArray/conversions/make_array.h. There are several other useful functions in the https://github.com/ValeevGroup/tiledarray/tree/master/src/TiledArray/conversions directory that you may want to look at. If these functions do not fit your needs, you can use them as a guide in writing your own code.