firemodels / fds

Fire Dynamics Simulator
https://pages.nist.gov/fds-smv/

Efficiency calculation for FDS+EVAC #1031

Closed gforney closed 9 years ago

gforney commented 9 years ago
Please complete the following lines...

Application Version: 5.4.3 FDS_MPI
SVN Revision Number: 5210
Compile Date: 12/3/2009
Operating System: Windows XPSP3 2.33Core2Quad32-bit 4GbRam 1GbVRam

Describe details of the issue below:
I'm running an FDS+EVAC MPI project on a quad-core machine (Core2 
Quad). The scenario is fairly complex, and I'm using the 
"-channel ssm -localonly" option for MPI, which keeps all four 
cores working at 100%. I also have 4 GB of RAM, of which I'm using only 
3.57 GB. My Smokeview 3D Smoke Slices setting uses my GPU 
(1 GB NVIDIA 9800 video card). 

When I go into the filename.out file to do the efficiency calculation 
per the "efficiency of the parallel calculation" instructions of the 
user manual, I notice something extremely striking. In a certain time 
step, the three fire meshes each use about 10.50-11.15 s CPU/step, while 
the evac meshes all use exactly the same 650 s CPU/step, a heck of a 
lot more than the fire meshes. I also notice that the Total CPU for the 
fire meshes is about 55-58 min, while for the evac meshes it is exactly 
33 min for all of them, a heck of a lot less than the fire meshes. 
What gives!?! 

Now if I do the efficiency calculation only using the fire meshes, I 
get my usual 99.9% efficiency. However, if I include the evac meshes 
it drops to nearly zero... 

Is there something wrong, or should I just ignore the evac mesh stuff 
in the output file for efficiency purposes?

btw - The attached file completes its run at 20 minutes of simulation...

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-09 13:31:44


gforney commented 9 years ago
What's with the guy hanging from the ceiling? And the other guy whose head is in the
floor?

Original issue reported on code.google.com by mcgratta on 2010-04-09 14:03:04


gforney commented 9 years ago
That never happens when I run it on my computer... but that is wild! Exactly how 
would one do that on purpose? Hmmmmm, let me think: the positions of these two 
upside-down guys are where I have two sensors... probably the version of 
Smokeview you use is somehow using agent avatars as the sensor avatars - 
that's funny :-D

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-09 14:27:06

gforney commented 9 years ago
one sensor is measuring w-velocity and the other is measuring CO2 - I'm still 
laughing 'cos that just looks so awesome, heh, heh...

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-09 14:30:12

gforney commented 9 years ago
I shouldn't be laughing this hard - I'm still recovering from radical lung surgery 
(but it is JUST soooo funny). 

Anyway, the version of Smokeview I'm using is 5.4.8 (5220 32-bit) dated 12/3/2009,
and I've edited my SVO file only by adding the sidewall sprinklers...

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-09 14:36:04

gforney commented 9 years ago
Lucky for me that I don't take care of either Evac or Smokeview. I'll pass this on 
to Glenn and Timo.

Glenn, FYI, I'm using SVN 5998 (April 5, 2010), 32 bit Windows version. FDS 5.5.0,
SVN 6004, linux, 32 bit, MPI version. The files are at 
~mcgratta/VERIFICATION/TEST_CASES/PorterdaleMill

Original issue reported on code.google.com by mcgratta on 2010-04-09 14:47:06

gforney commented 9 years ago
I'll take a look at it.  Is there anything different about the sensor name where the
upside-down person is located?  Did you say you made some edits to the objects.svo
file?  If so, could you upload it?

Original issue reported on code.google.com by gforney on 2010-04-10 22:00:14

gforney commented 9 years ago
Glenn and Kevin,

All I did was copy the info for the upright sprinkler in the svo file, rename it
sidewall and change some of the colors for it - that's all folks! Without the 
sidewall definition the sprinkler resorts to just a default sprinkler. But this 
should have nothing to do with the sensor/upside-down agent anomaly... which is 
TOTALLY funny but no worries, as it doesn't happen with the older version of 
Smokeview I'm using and that I am satisfied with.

Anyway, look in the fds file for the two sensors - one is the only temp and the 
other is the only w-velocity. I think they're like 'sensor6' and 'sensor7' - no 
weird name or anything...

Or, better yet, we could just delete the two sensors in the fds file and get back to
the main question, which seems to be getting sidetracked, and that is the parallel 
efficiency calculation aberrations in the out file... but here is the svo file just
to make things interesting ;-)

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-11 02:32:36


gforney commented 9 years ago
The three sensors that are drawn as upside down people do not have a PROP_ID assigned
on the &DEVC line in your FDS input file.  The rest of your &DEVC lines do have
PROP_IDs assigned.  I use the same code to draw all smokeview objects (objects
defined in objects.svo) whether they be sprinklers, people, tree etc.  When a &DEVC
device does not have a PROP defined, I assign one - things must be getting mixed up.

The workaround for now (you will need this when you use the new smokeview) is to set
up a property for these three &DEVC lines.  Not sure why it works with the earlier
smokeview but not this one. 

I'll look at this some more.

Original issue reported on code.google.com by gforney on 2010-04-11 13:15:17

gforney commented 9 years ago
I found the problem.  I was using the same variable for both people drawing and
sensor drawing (when a PROP was not defined).  I need to check to make sure the same
error does not occur when I draw particles as smokeview objects.

Original issue reported on code.google.com by gforney on 2010-04-11 14:21:39

gforney commented 9 years ago
kevin, try your test case with the smokeview I just posted (revision 6041).  Should
work now.

Original issue reported on code.google.com by gforney on 2010-04-11 14:40:08

gforney commented 9 years ago
Glad y'all found the funny bug, heh, heh...

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-11 14:52:15

gforney commented 9 years ago
Meanwhile, back to parallel efficiency - I'm only getting about 23 seconds of 
simulation a day with my computer setup and this big simulation, much less than the
previous version of the simulation - the main difference is I went from two exhaust
fans with two makeup air holes to eleven exhaust fans and ten makeup air holes. I 
guess the extra 'flows' really eat up the processing time, eh?

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-11 15:43:01

gforney commented 9 years ago
Hi, I'm back at the office, I skied (cross country) more than 600 km.

The timings: Yes, there seem to be problems with "CPU/step"
when you have evacuation meshes present. You can see the time
spent in the evacuation calculation (after the initialization)
by checking the "EVAC" row of the "CPU Time Usage, Mesh X"
output at the bottom of the CHID.out file.
The "EVA1" etc. are sub-times spent in the evac.f90
routines, and these do not add up to "EVAC". The sum of the "VELO"
and "PRES" times (of the evacuation meshes) gives roughly the
initialization time of the evacuation meshes (the calculation
of the guiding evacuation flow fields). The time "EVAC" for a
mesh is the time the program uses to move the agents around.

Note that all the evacuation meshes are run as a single process
if you are doing an MPI calculation. Add up the "EVAC" times of
the main evacuation meshes (these are only printed for main
evacuation meshes; the additional flow fields do not have
any agents) to get the CPU time spent in the evacuation 
movement part. Add the "PRES" and "VELO" times of all evacuation
meshes to get the initialization time of the evacuation calculation.
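
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mesh names and timer values are made up, and the dictionary layout is a hypothetical hand transcription of the "CPU Time Usage, Mesh X" blocks, not a format that FDS writes.

```python
# Hypothetical timer values (seconds) per evacuation mesh, transcribed by hand
# from the "CPU Time Usage, Mesh X" blocks. Names and numbers are invented.
evac_mesh_timers = {
    "MainEvacGrid": {"VELO": 12.0, "PRES": 8.0, "EVAC": 1500.0},
    "ExitFlowField1": {"VELO": 10.0, "PRES": 7.0},  # extra flow field: no EVAC row
    "ExitFlowField2": {"VELO": 11.0, "PRES": 6.0},  # extra flow field: no EVAC row
}


def evac_movement_time(timers_by_mesh):
    """Sum EVAC over the meshes that report it (the main evacuation meshes)."""
    return sum(t["EVAC"] for t in timers_by_mesh.values() if "EVAC" in t)


def evac_init_time(timers_by_mesh):
    """Sum PRES and VELO over all evacuation meshes (roughly the init time)."""
    return sum(t.get("PRES", 0.0) + t.get("VELO", 0.0)
               for t in timers_by_mesh.values())


movement = evac_movement_time(evac_mesh_timers)
initialization = evac_init_time(evac_mesh_timers)
print(f"movement: {movement:.1f} s, initialization: {initialization:.1f} s")
```

Since all evacuation meshes run as a single process, both sums belong to that one process.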

I should try to get the timings (CPU/step) correct for the CHID.out
output.

TimoK

Original issue reported on code.google.com by tkorhon1 on 2010-04-12 10:00:22

gforney commented 9 years ago
Glenn -- the humans appear to be properly located now.

Original issue reported on code.google.com by mcgratta on 2010-04-12 12:20:01

gforney commented 9 years ago
TimoK: "Yes, there seems to be problems with "CPU/step" when you have evacuation 
meshes present... ...I should try to get the timings (CPU/step) correct for the 
CHID.out output."

Thanks, Timo ~ and wow, skiing 600km - I'm impressed!

BatGirl

PS - so for now, as far as doing the "efficiency of the parallel calculation" per the
user manual (since there is no end of the 'filename.out' file yet to do as you 
suggest above, as the simulation is still running...), it is safe to say I can ignore
the CPU/step for the EVAC meshes (since their Total CPU time is immensely less than
the Total CPU time for the fire meshes) in doing this calculation?

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-12 13:05:36

gforney commented 9 years ago
Kevin: "Glenn -- the humans appear to be properly located now."

Oh, and they were so funny to look at... party pooper!

Original issue reported on code.google.com by BlackstoneEngineering on 2010-04-12 13:06:47

gforney commented 9 years ago

> it is safe to say I can ignore the CPU/step for the EVAC meshes 

Well, yes, but sometimes it is not so simple. It seems that my
test case (1 fire mesh + some evacuation meshes) does not output
the correct CPU/step or Total CPU. The same is true for a case where
there are just evacuation meshes. So, for timings you should 
just run the fire meshes.

The first idea is to check the ICYC logic in the Fortran source
code. I have ICYC < 0 for the initialization of the evacuation
meshes and the fire calculation always has ICYC >= 0, so there
might be something funny going on with this. Or else something else is
broken.

Timo

PS. 600 km in 14 days is just about 43 km per day.

Original issue reported on code.google.com by tkorhon1 on 2010-04-12 14:13:40

gforney commented 9 years ago
I noticed a bug in the "CPU/step" timings for the fire meshes
as well. The CPU/step values were not calculated correctly
when there is more than one mesh, even in a plain
fire calculation. So the fire calculation timings were
also wrong: CPU/step and Total CPU were wrong, but the
final timings printed at the bottom of the .out file
(CPU Time Usage, Mesh xxx) were correct. 

SVN Revision No. 6062 has the bug fix:
T_SUM is now summed inside the NM loop.
Before, it had its own NM-loop ahead of the
T_PER_STEP(NM) NM-loop (two NM-loops). Now
there is just one NM-loop. This way I did
not have to change T_SUM (scalar) into a T_SUM(NM)
array.

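DO NM=1,NMESHES
   IF (PROCESS(NM)/=MYID) CYCLE   ! skip meshes assigned to other processes
   T_SUM = 0._EB                  ! reset the scalar sum for each mesh
   SUM_LOOP: DO I=2,N_TIMERS_DIM
      T_SUM = T_SUM + TUSED(I,NM)
   ENDDO SUM_LOOP
   NECYC          = MAX(1,NTCYC(NM)-NCYC(NM))           ! steps since the last update, at least 1
   T_PER_STEP(NM) = (T_SUM-T_ACCUM(NM))/REAL(NECYC,EB)  ! CPU time per step over that interval
   T_ACCUM(NM)    = T_SUM                               ! store the running total for next time
   NCYC(NM)       = NTCYC(NM)                           ! store the step count for next time
ENDDO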
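
For readers who do not work in Fortran, the per-mesh arithmetic of the loop above can be mirrored in Python. This is a minimal sketch with made-up numbers, not FDS code; the function name `cpu_per_step` and the dictionary layout are hypothetical.

```python
def cpu_per_step(tused, t_accum, ntcyc, ncyc):
    """Return (CPU seconds per step since the last report, new running total).

    tused   : timer values for one mesh; index 0 is skipped, mirroring the
              Fortran loop that starts at I=2
    t_accum : running total stored at the previous report
    ntcyc   : current step count
    ncyc    : step count stored at the previous report
    """
    t_sum = sum(tused[1:])        # computed fresh for every mesh
    steps = max(1, ntcyc - ncyc)  # at least 1, so no division by zero
    return (t_sum - t_accum) / steps, t_sum


# Made-up numbers for two meshes (not taken from any real run)
meshes = [
    {"tused": [0.0, 30.0, 20.0], "t_accum": 40.0, "ntcyc": 110, "ncyc": 100},
    {"tused": [0.0, 12.0, 8.0], "t_accum": 15.0, "ntcyc": 110, "ncyc": 100},
]
per_step = [cpu_per_step(**m)[0] for m in meshes]
print(per_step)
```

Because `t_sum` is recomputed inside the call for each mesh, no mesh can pick up another mesh's total, which is the point of moving the sum into the single NM-loop.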

I'm planning to correct the Evac timings also; at the moment the
CPU/step and Total CPU are too large for evacuation 
meshes. The T_SUM loop sums up all the sub-timers (see
the sub-timers listed at the bottom of the .out file).
For evacuation meshes, EVAC includes some of the 
timers EVA1, EVA2, and EVA3. So my plan is to add
a logical array, say L_ACCUM(1:N_TIMERS_DIM), which can be
used with the Fortran SUM function to mask the sum:
T_SUM=SUM(TUSED(:,NM), MASK=L_ACCUM). Well, I might
need one L_ACCUM for fire meshes and one for evacuation
meshes, or something like that.
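
The masked-sum idea can be illustrated outside Fortran as well. The Python sketch below assumes a hypothetical timer layout and invented values; it only shows why masking out nested sub-timers avoids counting their time twice, and it is not the FDS implementation.

```python
def masked_sum(tused, l_accum):
    """Add only the timers whose mask entry is True (like SUM with MASK=)."""
    return sum(t for t, keep in zip(tused, l_accum) if keep)


# Hypothetical layout and values: EVA1..EVA3 are assumed to be nested in EVAC.
timer_names = ["VELO", "PRES", "EVAC", "EVA1", "EVA2", "EVA3"]
tused = [2.0, 3.0, 100.0, 60.0, 30.0, 10.0]
l_accum = [True, True, True, False, False, False]

naive_total = sum(tused)                     # counts the nested time twice
masked_total = masked_sum(tused, l_accum)    # counts each second once
print(naive_total, masked_total)
```

A separate mask per mesh type, as suggested above, would simply be a second list passed to the same function.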

TimoK

Original issue reported on code.google.com by tkorhon1 on 2010-04-14 09:34:23

gforney commented 9 years ago
The version "SVN Revision No. : 6067" now has the
CPU/step timings corrected for the evacuation
meshes as well. I also slightly changed the names of the
evacuation mesh timing labels (listed below).

Kevin, FYI: The (plain) fire mesh calculation timings
were also broken, see comment 18 above.

EVAC: time spent in evacuation (some global evacuation
      initialization times are not included)
 FOR: time spent in the force loop for this mesh
 P2P: time spent inside the force loop calculating
      agent-agent interactions
 MOV: time spent in the move loop for this mesh

Note: the times FOR, P2P, MOV are included in EVAC.

TimoK

Original issue reported on code.google.com by tkorhon1 on 2010-04-14 13:51:08

gforney commented 9 years ago
Timo, thanks. This will be useful as we start timing FDS 5 and 6 calculations.

Original issue reported on code.google.com by mcgratta on 2010-04-14 14:35:05

gforney commented 9 years ago
I managed to do the evacuation timings without
introducing any new arrays (see comment 18 and
the plan to introduce L_ACCUM or something like that).
The dimensions N_TIMERS_DIM, N_TIMERS_FDS, and
N_TIMERS_EVAC already existed, and I had the evac
timers in a good order: the first one was EVAC and
then came the rest (FOR, P2P, MOV). So I just sum
up to the timer EVAC to get the correct sum
(i.e., N=2,N_TIMERS_EVAC-3). This could be a 
nice tactic for other timings as well: put
all "extra timers" at the end, so that they are
not added to the total time (they are
already included in some other timer).

(Why add this comment? To document the
changes properly.)
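
The ordering trick can be sketched in Python as a prefix sum that stops before the trailing sub-timers. The timer names before EVAC, the placeholder first slot, and all values are assumptions made for illustration; only the EVAC, FOR, P2P, MOV order comes from the comment above.

```python
# Hypothetical layout: independent timers first, nested sub-timers last.
# "SLOT1" stands in for the first timer, which the Fortran loop skips (N=2,...).
timer_names = ["SLOT1", "VELO", "PRES", "EVAC", "FOR", "P2P", "MOV"]
tused = [1.0, 2.0, 3.0, 100.0, 55.0, 30.0, 40.0]  # invented values
N_EXTRA = 3  # FOR, P2P, MOV are already included in EVAC


def total_time(tused, n_extra):
    """Sum from the second timer up to, but not including, the extra timers."""
    return sum(tused[1:len(tused) - n_extra])


total = total_time(tused, N_EXTRA)
print(total)
```

Compared with a mask array, this needs no extra storage, but it only works as long as every nested timer is kept at the end of the list.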

TimoK

Original issue reported on code.google.com by tkorhon1 on 2010-04-15 10:37:36

gforney commented 9 years ago
This is an old fixed issue, so now it is closed.

TimoK

Original issue reported on code.google.com by tkorhon1 on 2011-03-24 15:17:03