Dhondtguido / PaStiX4CalculiX

ccx & PaStiX #3

Closed 3rav closed 4 years ago

3rav commented 4 years ago

Hi,

I was able to compile ccx with PaStiX, but the program doesn't work properly in most of the configurations I tried:

A. PaStiX, no PARSEC, no StarPU, no Scotch (PastixSchedStatic): the program crashes with a warning (Windows pop-up)

B. PaStiX, no PARSEC, no StarPU, Scotch yes (PastixSchedStatic): same as A

C. PaStiX, no PARSEC, StarPU yes, no Scotch (PastixSchedStatic): same as A

D. PaStiX, no PARSEC, StarPU yes, no Scotch (PastixSchedStarPU):

Not reusing csc.
+-------------------------------------------------+
+     PaStiX : Parallel Sparse matriX package     +
+-------------------------------------------------+
  Version:                                   6.0.1
  Schedulers:
    sequential:                            Enabled
    thread static:                         Started
    thread dynamic:                       Disabled
    PaRSEC:                               Disabled
    StarPU:                                Started
  Number of MPI processes:                       1
  Number of threads per process:                 4
  Number of GPUs:                                0
  MPI communication support:              Disabled
  Distribution level:                     2D( 128)
  Blocking size (min/max):             1024 / 2048

  Matrix type:  General
  Arithmetic:   Float
  Format:       CSC
  N:            1841
  nnz:          105279

+-------------------------------------------------+
  Ordering step :
    Ordering method is: Scotch
[starpu][check_bus_config_file] No performance model for the bus, calibrating...
[starpu][check_bus_config_file] ... done
pastix_subtask_order: Ordering with Scotch requires to enable -DPASTIX_ORDERING_SCOTCH option
pastix_task_sopalin: All steps from pastix_task_init() to pastix_task_blend() have to be called before calling this function

E. PaStiX, no PARSEC, StarPU yes, no Scotch, no Metis (PastixSchedStarPU, PastixOrderMetis):

Not reusing csc.
+-------------------------------------------------+
+     PaStiX : Parallel Sparse matriX package     +
+-------------------------------------------------+
  Version:                                   6.0.1
  Schedulers:
    sequential:                            Enabled
    thread static:                         Started
    thread dynamic:                       Disabled
    PaRSEC:                               Disabled
    StarPU:                                Started
  Number of MPI processes:                       1
  Number of threads per process:                 4
  Number of GPUs:                                0
  MPI communication support:              Disabled
  Distribution level:                     2D( 128)
  Blocking size (min/max):             1024 / 2048

  Matrix type:  General
  Arithmetic:   Float
  Format:       CSC
  N:            1841
  nnz:          105279

+-------------------------------------------------+
  Ordering step :
    Ordering method is: Metis
pastix_subtask_order: Ordering with Metis requires -DPASTIX_ORDERING_METIS flag at compile time
pastix_task_sopalin: All steps from pastix_task_init() to pastix_task_blend() have to be called before calling this function

F. PaStiX, no PARSEC, StarPU yes, Scotch yes (PastixSchedStarPU): the program crashes without any warning (no Windows pop-up); the only output is:

Not reusing csc.
+-------------------------------------------------+
+     PaStiX : Parallel Sparse matriX package     +
+-------------------------------------------------+
  Version:                                   6.0.1
  Schedulers:
    sequential:                            Enabled
    thread static:                         Started
    thread dynamic:                       Disabled
    PaRSEC:                               Disabled
    StarPU:                                Started
  Number of MPI processes:                       1
  Number of threads per process:                 4
  Number of GPUs:                                0
  MPI communication support:              Disabled
  Distribution level:                     2D( 128)
  Blocking size (min/max):             1024 / 2048

  Matrix type:  General
  Arithmetic:   Float
  Format:       CSC
  N:            1841
  nnz:          105279

+-------------------------------------------------+
  Ordering step :
    Ordering method is: Scotch

G. PaStiX and Metis Working on Windows!

Not reusing csc.
+-------------------------------------------------+
+     PaStiX : Parallel Sparse matriX package     +
+-------------------------------------------------+
  Version:                                   6.0.1
  Schedulers:
    sequential:                            Enabled
    thread static:                         Started
    thread dynamic:                       Disabled
    PaRSEC:                               Disabled
    StarPU:                                Enabled
  Number of MPI processes:                       1
  Number of threads per process:                 4
  Number of GPUs:                                0
  MPI communication support:              Disabled
  Distribution level:                     2D( 128)
  Blocking size (min/max):             1024 / 2048

  Matrix type:  General
  Arithmetic:   Float
  Format:       CSC
  N:            449578
  nnz:          8006332

+-------------------------------------------------+
  Ordering step :
    Ordering method is: Metis
    Time to compute ordering:              6.4544 
+-------------------------------------------------+
  Symbolic factorization step:
    Symbol factorization using: Fax Direct
    Number of nonzeroes in L structure:   34936336
    Fill-in of L:                         4.363588
    Time to compute symbol matrix:        0.1380 
+-------------------------------------------------+
  Reordering step:
    Split level:                                 0
    Stoping criteria:                           -1
    Time for reordering:                  2.2381 
+-------------------------------------------------+
  Analyse step:
    Number of non-zeroes in blocked L:    69872672
    Fill-in:                              8.727176
    Number of operations in full-rank LU   :    12.74 GFlops
    Prediction:
      Model:                             AMD 6180  MKL
      Time to factorize:                  1.1509 
    Time for analyze:                     0.1360 
+-------------------------------------------------+
  Factorization step:
    Factorization used: LU
    Time to initialize internal csc:      1.8651 
    Time to initialize coeftab:           0.1180 
    Time to factorize:                    0.8540  (14.92 GFlop/s)
    Number of operations:                      12.74 GFlops
    Number of static pivots:                     0
    Time to solve:                        0.1990 
    - iteration 1 :
         total iteration time                   0.225 
         error                                  5.6329e-10
    - iteration 2 :
         total iteration time                   0.232 
         error                                  7.8823e-13
    Time for refinement:                  0.5650 
________________________________________

CSC Conversion Time: 0.081667
Init Time: 9.034500
Factorize Time: 2.842319
Solve Time: 0.816066
Clean up Time: 0.000000
---------------------------------
Sum: 12.774558
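
The derived figures in the case G log can be cross-checked from the raw numbers it prints. The following is a minimal sketch, assuming that "Fill-in" means nonzeroes in L divided by nnz of the input matrix, that the "12.74 GFlops" operation count is a total rather than a rate, and that all times are in seconds. The variable names are mine, not from ccx or PaStiX.

```python
# Values copied from the case G log above.
nnz_a = 8_006_332           # nnz of the input matrix
nnz_l = 34_936_336          # "Number of nonzeroes in L structure"
nnz_blocked_l = 69_872_672  # "Number of non-zeroes in blocked L"
gflop = 12.74               # "Number of operations" (assumed to be a total)
t_factorize = 0.8540        # PaStiX "Time to factorize" (assumed seconds)

# Assumed definitions: fill-in = nnz(L) / nnz(A), rate = operations / time.
fill_in_l = nnz_l / nnz_a
fill_in_blocked = nnz_blocked_l / nnz_a
rate = gflop / t_factorize

# Timing summary printed after the PaStiX log (assumed seconds).
stages = {
    "CSC Conversion": 0.081667,
    "Init": 9.034500,
    "Factorize": 2.842319,
    "Solve": 0.816066,
    "Clean up": 0.000000,
}
total = sum(stages.values())

print(f"Fill-in of L:         {fill_in_l:.6f}")
print(f"Fill-in (blocked L):  {fill_in_blocked:.6f}")
print(f"Factorization rate:   {rate:.2f} GFlop/s")
print(f"Sum of stage times:   {total:.6f}")
```

Under these assumptions the recomputed fill-in values and the factorization rate agree with the log to its printed precision. The stage times add up to a few millionths of a second less than the reported Sum, which is consistent with each stage having been rounded before printing.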

3rav commented 4 years ago

Finally, the Scotch-based version works.

Replacing Scotch 6.0.9 with the older 6.0.8 fixed it. I had been using the newer release because it compiled without errors, whereas the older one needed to be patched before it would build.
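
For anyone retracing cases D and E: the ordering errors there name the compile-time flag that was missing, so they can be picked out of a ccx/PaStiX log automatically. Below is a small sketch; the function name `missing_ordering_flags` is hypothetical, and the pattern is built only from the two messages quoted in this issue, so other PaStiX versions may word them differently.

```python
import re

# Built from the two messages seen in cases D and E (PaStiX 6.0.1).
# Assumption: the wording is stable enough to match on.
_PATTERN = re.compile(
    r"Ordering with (\w+) requires.*?-D(PASTIX_ORDERING_[A-Z]+)"
)


def missing_ordering_flags(log_text):
    """Map each rejected ordering method to the flag PaStiX asked for."""
    return {name: flag for name, flag in _PATTERN.findall(log_text)}


# Messages copied from cases D and E above.
sample = (
    "pastix_subtask_order: Ordering with Scotch requires to enable "
    "-DPASTIX_ORDERING_SCOTCH option\n"
    "pastix_subtask_order: Ordering with Metis requires "
    "-DPASTIX_ORDERING_METIS flag at compile time\n"
)

for name, flag in missing_ordering_flags(sample).items():
    print(f"{name}: rebuild PaStiX with {flag} enabled")
```

A log without such messages yields an empty dictionary, which does not by itself mean the ordering step succeeded (case F crashed without printing any error).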