Open lu-kas opened 2 months ago
I'll discuss this with Glenn. The change was made to speed up Smokeview data loading. There might be a faster way to get you your information.
Currently, the `.bnd` files have the time when the bounds were last changed. Would it help if the `.bnd` file had the latest time?
Basically, I need the (simulation) times at which data was written out, without parsing the data file itself. This could be explicit (as it was in the `.bnd` files) or implicit, i.e. the first and last output time plus the time interval, if it is constant.
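To illustrate the implicit variant, here is a minimal sketch of how a reader could rebuild the output times from the first time, the last time, and a constant interval. The function name `implicit_output_times` and its parameters are hypothetical and not part of FDS or any existing tool.

```python
def implicit_output_times(t_first, t_last, dt):
    """Rebuild output times from first time, last time, and a constant interval.

    Hypothetical helper for illustration only; not an FDS or Smokeview API.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Round the step count so floating-point noise does not drop the last time.
    n_steps = round((t_last - t_first) / dt)
    return [t_first + i * dt for i in range(n_steps + 1)]


times = implicit_output_times(0.0, 2.0, 0.5)
```

This only works if the interval really is constant; otherwise the explicit list is needed.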
How about a `.times` file that lists the explicit output times for the output types SLCF, BNDF, etc.? It could be either one file per type or a single file with line entries like `BNDF 12.2\n` or `SLCF 13.4\n`, so the time info is simply appended whenever data gets written out. I can then do the sorting myself.
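On the reader side, the single-file variant could be handled roughly like this. It is a sketch that assumes the proposed two-column layout (output type, then time); the `.times` format does not exist in FDS, and `parse_times` is a hypothetical name.

```python
from collections import defaultdict


def parse_times(lines):
    """Group output times by output type from lines such as 'BNDF 12.2'.

    Assumes the proposed (not yet existing) two-column .times layout.
    """
    times = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        kind, value = parts
        times[kind].append(float(value))
    # Entries arrive in write order; sorting per type happens on the reader side.
    return {kind: sorted(values) for kind, values in times.items()}


sample = ["BNDF 12.2\n", "SLCF 13.4\n", "SLCF 0.0\n", "\n"]
parsed = parse_times(sample)
```

With a real file, `parse_times(open(path))` would work the same way, since a file object iterates over its lines.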
Or, as an alternative, revive the previous `.bnd` file structure, i.e. with all boundary information, and make the first line the info that Glenn needs. In the current implementation, you have to read, check, and replace the content of the `.bnd` file at every data dump anyway. Glenn has to open the files anyway but could limit the read to the first line. With an appropriate comment line, the file structure (i.e. the first line is special) would still be human-readable.
Let me give you some background. In the past few years, we have been running calculations on computers with thousands of cores, and we've seen the number of output files grow enormously. Smokeview then struggles to read all the data and process the images in a reasonable amount of time. So we have been trying to streamline Smokeview and reduce the number of files. But there is also another problem. We have a new computer cluster at NIST with 36 nodes and 64 cores per node. It works well, but there is still one nagging issue that we have not resolved: sometimes small jobs that get split across nodes hang. This only happens during periods of high Infiniband/network traffic. It might be an MPI issue, it might be an FDS issue, it might just be our new machine, or it might have something to do with I/O. We just do not know, but my preference is to reduce I/O and the number of times FDS has to interact with the OS. We're trying to cut down on the small bits of I/O like file writes and INQUIREs. In essence, we want to let FDS just do its calculation with the least amount of interference.
So my question to you is how critical is this? Many people, including ourselves, have suggested all sorts of ways to interact with FDS during its run. But any interaction has a price, and this price keeps going up as the jobs grow larger. Fortran is great for intensive floating point applications, but it is not designed for efficient interaction with the end user.
Yes, I fully understand this issue. I have spent a lot of time with parallel I/O ;-)
However, one of my proposals above (the `.times` file) would require just a single file written by a single MPI rank. The file would not have to be reopened every time; it could be kept open during the whole simulation, perhaps with a flush after each write. Thus, the overhead would be constant w.r.t. the number of MPI ranks.
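The write pattern I have in mind can be sketched as follows. This is Python for illustration only (FDS itself is Fortran), the class `TimesLog` and the file name `case.times` are hypothetical, and the MPI part is not modeled: in a real run only one rank would create the log.

```python
import os
import tempfile


class TimesLog:
    """Append-only log kept open for the whole run (hypothetical helper)."""

    def __init__(self, path):
        # Opened once; never reopened during the run.
        self._fh = open(path, "a", encoding="ascii")

    def record(self, kind, time):
        self._fh.write(f"{kind} {time}\n")
        # Flush so readers can see the entry while the run continues.
        self._fh.flush()

    def close(self):
        self._fh.close()


path = os.path.join(tempfile.mkdtemp(), "case.times")
log = TimesLog(path)
log.record("SLCF", 0.0)
log.record("BNDF", 12.2)

# A reader can pick up the entries before the writer closes the file.
with open(path, encoding="ascii") as fh:
    seen_before_close = fh.read()
log.close()
```

The cost per data dump is one short write plus a flush, independent of how many ranks take part in the simulation.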
Anyway, let's pause here and let me do some testing / benchmarking on our side.
Thanks.
Dear all,
in the past, we have used the BND files to gather the output times at which, e.g., SLCF and BNDF data were written out. However, since FDS version 6.9.0, the output is limited to only the global (i.e. over all times) information, see
https://github.com/firemodels/fds/blob/889da6ae08d08dae680f7c0d8de66a3ad1c65375/Source/dump.f90#L6028C3-L6049C35
Would it be possible to have the information about the available times stored somewhere, best again in the BND files?
Side note: we know that in general this information is available in the data files themselves. However, we do lazy loading, meaning we do not read in all data up front, only what is needed. With the BND files, we did not have to touch the data files to get the output time information.
Thanks, Lukas