Mantevo / miniFE

MiniFE Finite Element Mini-Application
http://www.mantevo.org
GNU Lesser General Public License v3.0

x * y * z * 3^3 / ppn can exceed 2^31 on modern machines #10

Open rolfriesen opened 5 years ago

rolfriesen commented 5 years ago

Running on a 192 GB dual-socket machine, using the MPI + OMP version in miniFE_openmp_opt:

export OMP_NUM_THREADS=11
mpirun -n 4 -ppn 4 ./miniFE.x nx=682 ny=682 nz=682

throws an exception because nrows_max in CSRMatrix.hpp turns negative due to int overflow. packed_cols.reserve(nrows_max); doesn't like negative numbers ;-)

mpirun -n 4 -ppn 4 ./miniFE.x nx=680 ny=680 nz=680 # works

Unfortunately making MINIFE_GLOBAL_ORDINAL a long is not sufficient to address the issue.
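The overflow reported above can be checked with a few lines of 64-bit arithmetic. This is a minimal sketch, not miniFE code: the helper names are hypothetical, and it assumes one matrix row per mesh node (so (nx+1)(ny+1)(nz+1) rows in total, split evenly across ranks) and a reservation of 3^3 = 27 columns per row, as suggested by the issue title.

```cpp
#include <cstdint>
#include <limits>

// Assumption: one matrix row per mesh node, i.e. (nx+1)*(ny+1)*(nz+1) rows
// in total, divided evenly across the MPI ranks.
constexpr std::int64_t rows_per_rank(std::int64_t nx, std::int64_t ny,
                                     std::int64_t nz, std::int64_t ranks) {
    return (nx + 1) * (ny + 1) * (nz + 1) / ranks;
}

// Assumption: space for up to 27 (3^3) column entries is reserved per row.
constexpr std::int64_t reserved_entries(std::int64_t nx, std::int64_t ny,
                                        std::int64_t nz, std::int64_t ranks) {
    return rows_per_rank(nx, ny, nz, ranks) * 27;
}

// True if n cannot be represented in a 32-bit signed int.
constexpr bool overflows_int32(std::int64_t n) {
    return n > std::numeric_limits<std::int32_t>::max();
}
```

Under these assumptions, nx=ny=nz=682 on 4 ranks needs roughly 2.15 billion entries per rank, which is above 2^31 - 1, while nx=ny=nz=680 needs roughly 2.13 billion and still fits. That matches the failing and working runs in the report.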

maherou commented 5 years ago

Hi Rolf,

C++ only guarantees that long is at least 32 bits, and on some platforms it is exactly 32 bits. C++ long long is guaranteed to be at least 64 bits. Try setting MINIFE_GLOBAL_ORDINAL to long long and see if it helps.
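Because the width of long depends on the platform and compiler (for example, 64 bits on typical 64-bit Linux toolchains but 32 bits on 64-bit Windows), it may be worth checking the widths directly rather than assuming them. A minimal sketch, with bit_width_of as a hypothetical helper name:

```cpp
#include <climits>
#include <cstddef>

// Guaranteed by the C++ standard: long long has at least 64 bits.
static_assert(sizeof(long long) * CHAR_BIT >= 64,
              "long long must be at least 64 bits wide");

// Width of type T in bits on the current platform and compiler.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}
```

Printing bit_width_of<long>() and bit_width_of<long long>() on the target system shows whether switching an ordinal type from long to long long changes anything there.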

maherou commented 5 years ago

Also, if your local problem ends up being > 2.1B in size, you can change MINIFE_LOCAL_ORDINAL to long long also.

rolfriesen commented 5 years ago

On Wed May 1, 2019 06:36:52, Mike Heroux wrote:

> Also, if your local problem ends up being > 2.1B in size, you can change MINIFE_LOCAL_ORDINAL to long long also.

Hi Mike,

I'm using the Intel Parallel Studio XE 2018 Update 3 for Linux compilers. With that, long int and long long are both 8 bytes wide.

Unfortunately, changing MINIFE_LOCAL_ORDINAL to long long (or long int) causes compile errors because the function signature for init_matrix() no longer matches.

To fix this, some hard-coded ints in miniFE need to be changed to MINIFE_LOCAL_ORDINAL. I did create a miniFE version where I chased down all those instances and made it work.

However, it now consumes much more memory and runs 20% slower! I may have been a little overzealous with my int to MINIFE_LOCAL_ORDINAL conversion ;-) I won't have time in the next couple of weeks, but there might be a way to come up with a better fix by applying the substitution more judiciously and using type casts where appropriate.

Thanks,

Rolf
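One way the "more judicious" substitution mentioned above might look is to keep 32-bit local indices and widen only the quantities that can exceed 2^31, such as the total nonzero count and the row offsets. The following is a hedged sketch under that assumption. CsrSketch, LocalOrdinal, OffsetType, and max_nonzeros are hypothetical names, not miniFE's actual types or functions, and whether this recovers the memory and runtime cost described above is untested.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for miniFE's ordinal types (not the real definitions).
using LocalOrdinal = std::int32_t;  // row and column indices within one rank
using OffsetType   = std::int64_t;  // positions into the packed arrays

// Product of rows and columns per row, formed in 64 bits so it cannot wrap.
constexpr OffsetType max_nonzeros(LocalOrdinal nrows, LocalOrdinal ncols_per_row) {
    return static_cast<OffsetType>(nrows) * static_cast<OffsetType>(ncols_per_row);
}

struct CsrSketch {
    std::vector<OffsetType>   row_offsets;   // nrows + 1 entries, 64-bit
    std::vector<LocalOrdinal> packed_cols;   // one 32-bit index per nonzero
    std::vector<double>       packed_coefs;  // one value per nonzero

    // Reserve storage using the widened nonzero count.
    void reserve_space(LocalOrdinal nrows, LocalOrdinal ncols_per_row) {
        const OffsetType nnz_max = max_nonzeros(nrows, ncols_per_row);
        row_offsets.assign(static_cast<std::size_t>(nrows) + 1, 0);
        packed_cols.reserve(static_cast<std::size_t>(nnz_max));
        packed_coefs.reserve(static_cast<std::size_t>(nnz_max));
    }
};
```

In this layout only row_offsets grows to 8 bytes per entry; the much larger packed_cols array keeps 4-byte indices, which is valid as long as the number of local rows and columns stays below 2^31.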

