phpisciuneri / tg

A two-dimensional Taylor-Green vortex with chemical reaction for assessing load balancing tools

Run with network contention of 1 #2

Closed phpisciuneri closed 6 years ago

phpisciuneri commented 7 years ago

I screwed up loading the modules, and the job never got off the ground.

Lmod has detected the following error: The following module(s) are unknown:
"boost@1.58.0%intel@15.0.3-ie5nth5"

   Please check the spelling or version number. Also try "module spider ..."

Lmod has detected the following error: The following module(s) are unknown:
"hwloc@1.11.0%intel@15.0.3-egkbymq"

   Please check the spelling or version number. Also try "module spider ..."

Lmod has detected the following error: The following module(s) are unknown:
"openmpi@1.8.6%intel@15.0.3-ukxwbpm"

   Please check the spelling or version number. Also try "module spider ..."

Lmod has detected the following error: The following module(s) are unknown:
"trilinos@12.0.1%intel@15.0.3-736pgl4"

   Please check the spelling or version number. Also try "module spider ..."
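
A failure like this could be caught by scanning the job log for the Lmod message before trusting the run. The following is only a minimal sketch: the function name `unknown_modules` and the regular expression are assumptions based on the message format pasted above, not part of the tg run scripts.

```python
import re

# Matches the Lmod message format shown above: the "unknown" header,
# followed by one or more quoted module names on the following line(s).
UNKNOWN_MODULE = re.compile(
    r'The following module\(s\) are unknown:\s*((?:"[^"]+"\s*)+)'
)


def unknown_modules(log_text):
    """Return the module names Lmod reported as unknown, in log order."""
    names = []
    for match in UNKNOWN_MODULE.finditer(log_text):
        # Each match may carry several quoted names; pull out every one.
        names.extend(re.findall(r'"([^"]+)"', match.group(1)))
    return names


# Sample text copied from the first error above.
sample = '''Lmod has detected the following error: The following module(s) are unknown:
"boost@1.58.0%intel@15.0.3-ie5nth5"

   Please check the spelling or version number. Also try "module spider ..."
'''

print(unknown_modules(sample))
```

An empty list would mean no unknown-module errors were found in the log text; it says nothing about other ways a module load can fail.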

Fixed and resubmitted to the queue just now, and now I am behind a series of multi-day jobs:

php8@login0a:~/run/mpi/tg/paragon/s8$ squeue -l
Wed Dec 21 15:52:57 2016
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
             14669   haswell       tg     php8  PENDING       0:00   2:00:00     15 (Resources)
             14670   haswell       tg     php8  PENDING       0:00   2:00:00     15 (Resources)
             14610   haswell 0.003_0.    abd43  PENDING       0:00 3-00:00:00      8 (Resources)
             14611   haswell 0.003_0.    abd43  PENDING       0:00 3-00:00:00      8 (QOSGrpCpuLimit)
             14612   haswell 0.003_0.    abd43  PENDING       0:00 3-00:00:00      8 (QOSGrpCpuLimit)
             14613   haswell 0.003_0.    abd43  PENDING       0:00 3-00:00:00      8 (QOSGrpCpuLimit)
             14668   haswell mn_4_111   kas389  PENDING       0:00 1-10:00:00      2 (Resources)
             14608   haswell 0.002_0.    abd43  RUNNING    3:52:37 3-00:00:00      8 n[199-206]
             14609   haswell 0.002_0.    abd43  RUNNING    3:52:37 3-00:00:00      8 n[207-213,219]
             14667   haswell mn_4_111   kas389  RUNNING    2:47:05 1-20:00:00      2 n[226-227]
             14666   haswell mn_4_111   kas389  RUNNING    2:47:35 1-10:00:00      2 n[224-225]
             14661   haswell vasp-cin    abb58  RUNNING    8:27:06 6-00:00:00      5 n[214-218]
             14665   haswell cell_opt    abb58  RUNNING    3:05:24 6-00:00:00      4 n[220-223]
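
To summarize a listing like the one above without reading it by eye, the text can be split into one record per job. This is a rough sketch that assumes the whitespace-separated column layout shown here and job names without spaces; `parse_squeue_long` is a hypothetical helper, not a Slurm API.

```python
def parse_squeue_long(text):
    """Parse `squeue -l` text into a list of dicts, one per job row."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "JOBID":
            header = fields  # column names, e.g. STATE, TIME_LIMI
            continue
        # Skip the timestamp line before the header and any row whose
        # field count does not match (e.g. a job name containing spaces).
        if header is None or len(fields) != len(header):
            continue
        rows.append(dict(zip(header, fields)))
    return rows


# Three rows copied from the listing above.
sample = """Wed Dec 21 15:52:57 2016
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
             14669   haswell       tg     php8  PENDING       0:00   2:00:00     15 (Resources)
             14608   haswell 0.002_0.    abd43  RUNNING    3:52:37 3-00:00:00      8 n[199-206]
             14661   haswell vasp-cin    abb58  RUNNING    8:27:06 6-00:00:00      5 n[214-218]
"""

jobs = parse_squeue_long(sample)
for job in jobs:
    if job["STATE"] == "RUNNING":
        # Time limit and node count of each job currently holding nodes.
        print(job["JOBID"], job["USER"], job["TIME_LIMI"], job["NODES"])
```

The running jobs' time limits only bound how long they may hold their nodes; when a pending job actually starts depends on the scheduler's policy.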

phpisciuneri commented 6 years ago

Done, and the results are in the paper. See https://github.com/phpisciuneri/dynamic-partitioning-paper