Test case for thin obstructions and enclosed volumes

SusanKilian commented 1 year ago

As mentioned in today's Teams meeting, I have made some progress regarding the treatment of thin obstructions and enclosed volumes in UScaRC. It would be great if you could take a look at the attached 16-mesh test case (please change the PRES line according to your tests).

It contains a number of thin obstructions, some of which cross or abut mesh boundaries, including at the outflow boundary. Some obstructions disappear during the run, others appear (sometimes this affects only individual faces of these obstructions). Some obstructions are only one cell in size, others have hot surfaces, ...

It looks like UScaRC copes with the enclosed volumes (i.e. by filtering out the mean values) and the thin obstructions. In particular, this also applies to the inseparable UScaRC variant. The corresponding OUT file for the inseparable case is also attached and shows small velocity and pressure errors.

I would be very interested to know whether you think this test case is substantial enough, or how we could extend/complicate it further if necessary.

Many thanks, Susan

uscarc16_inseparable_0673 uscarc16_inseparable_out.txt uscarc16_inseparable_fds.txt

rmcdermo commented 1 year ago

Thanks, Susan. At a glance, I would say you should add mesh refinement interfaces and non-rectangular domain components. Also, I would add mesh interfaces that span more than one other mesh.

SusanKilian commented 1 year ago

Hi, Randy, I will try to incorporate these suggestions, thanks.

mcgratta commented 1 year ago

Susan, could you post the file called uscarc16_inseparable_cpu.csv? I would like to see the relative timing of the pressure routine.

Also, the out file reports

Time Step    1388   March 24, 2023  16:11:57
       Step Size:    0.730E-03 s, Total Time:       2.00 s
       Pressure Iterations: 1
       Maximum Velocity Error:  0.36E-14 on Mesh 9 at (1,16,3)
       Maximum Pressure Error:  0.10E-08 on Mesh 15 at (16,11,7)
       ScaRC: Iterations   177, Residual  0.81E-08, Rate   0.83E+00

Can I interpret this to mean that your velocity error at the boundary and the solids is 0.36E-14 m/s? That is very, very tight. How much of a speedup do you get if you set the tolerance to, say, 1E-6?
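For anyone who wants to pull these numbers out of the .out file automatically, here is a minimal sketch. It assumes the line layout shown in the excerpt above; the regular expression and the `parse_errors` helper are hypothetical and not part of FDS.

```python
import re

# Sample block copied from the .out excerpt above.
OUT_BLOCK = """\
Time Step    1388   March 24, 2023  16:11:57
       Step Size:    0.730E-03 s, Total Time:       2.00 s
       Pressure Iterations: 1
       Maximum Velocity Error:  0.36E-14 on Mesh 9 at (1,16,3)
       Maximum Pressure Error:  0.10E-08 on Mesh 15 at (16,11,7)
       ScaRC: Iterations   177, Residual  0.81E-08, Rate   0.83E+00
"""

# Hypothetical pattern; it only assumes the layout of the two error lines above.
ERROR_LINE = re.compile(
    r"Maximum (Velocity|Pressure) Error:\s+(\S+) on Mesh\s+(\d+)"
    r" at \((\d+),(\d+),(\d+)\)"
)


def parse_errors(text):
    """Return {'Velocity': (value, mesh, (i, j, k)), 'Pressure': (...)}."""
    result = {}
    for kind, value, mesh, i, j, k in ERROR_LINE.findall(text):
        result[kind] = (float(value), int(mesh), (int(i), int(j), int(k)))
    return result


errors = parse_errors(OUT_BLOCK)
for kind, (value, mesh, cell) in errors.items():
    print(f"{kind} error {value:.2e} on mesh {mesh} at cell {cell}")
```

If a file contains several time steps, this sketch keeps only the last match per error type; collecting a list per time step would be a small change.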

SusanKilian commented 1 year ago

Kevin, attached you will find the corresponding ...cpu.csv and also the ...devc.csv, in which I recorded the maximum pressure error and the velocity error, confirming these tight values. I'll rerun the case with a tolerance of 1E-6.

uscarc16_inseparable_cpu.csv uscarc16_inseparable_devc.csv

SusanKilian commented 1 year ago

And here are the corresponding files for a tolerance of 1E-6: uscarc16_inseparable_tol-6_devc.csv uscarc16_inseparable_tol-6_cpu.csv uscarc16_inseparable_tol-6_fds.txt uscarc16_inseparable_tol-6_out.txt

mcgratta commented 1 year ago

Susan -- it appears that in both simulations the pressure solver takes 63% of the total CPU time. I would expect the looser-tolerance case to take less CPU time.

SusanKilian commented 1 year ago

Kevin,

I totally agree with your objection and have wondered about it myself. In the meantime, I have repeatedly run new series of test calculations for the tolerances 1E-4, 1E-5, 1E-6, 1E-7 and 1E-8 (in each case on an otherwise 'empty' AMD ThreadRipper with 24 cores). These timings have been confirmed again and again, with only small variations.

Please see here the latest results for the different tolerances: the times for PRES and T_USED from ...cpu.csv and the resulting ratio. The csv-files are attached again.

Tolerance	PRES (s)	T_USED (s)	Ratio
1E-4	~88.7	150.7	~58.9%
1E-5	~91.8	158.9	~57.8%
1E-6	~94.7	163.1	~58.1%
1E-7	~104.9	172.8	~60.7%
1E-8	~108.6	178.8	~60.7%

I haven't quite understood this behavior yet and would also have expected a greater difference. I am currently carrying out the same calculations on the Lunarc cluster in Lund and will report the resulting timings, too.
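For reference, the Ratio column in the ThreadRipper table above is simply PRES divided by T_USED. The following minimal sketch recomputes it from the tabulated values and also reports how much the PRES time grows between the loosest and the tightest tolerance. The numbers are copied from the table; the variable and function names are illustrative only.

```python
# (PRES, T_USED) in seconds, copied from the ThreadRipper table above.
timings = {
    "1E-4": (88.7, 150.7),
    "1E-5": (91.8, 158.9),
    "1E-6": (94.7, 163.1),
    "1E-7": (104.9, 172.8),
    "1E-8": (108.6, 178.8),
}


def pres_share(pres, t_used):
    """Share of the total run time spent in the pressure solver, in percent."""
    return 100.0 * pres / t_used


shares = {tol: round(pres_share(p, t), 1) for tol, (p, t) in timings.items()}

# Relative growth of the pressure-solver time from tolerance 1E-4 to 1E-8.
pres_growth = timings["1E-8"][0] / timings["1E-4"][0]

for tol, share in shares.items():
    print(f"{tol}: PRES share {share}%")
print(f"PRES time at 1E-8 relative to 1E-4: {pres_growth:.2f}x")
```

Looking at the growth factor next to the shares makes it easier to see that the absolute PRES time does increase with tighter tolerances, even though the share of the total stays within a few percentage points.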

uscarc16_inseparable_tol-8_cpu.csv uscarc16_inseparable_tol-7_cpu.csv uscarc16_inseparable_tol-6_cpu.csv uscarc16_inseparable_tol-5_cpu.csv uscarc16_inseparable_tol-4_cpu.csv uscarc16_inseparable_tol-8_devc.csv uscarc16_inseparable_tol-7_devc.csv uscarc16_inseparable_tol-6_devc.csv uscarc16_inseparable_tol-5_devc.csv uscarc16_inseparable_tol-4_devc.csv

SusanKilian commented 1 year ago

Here are the ratios on the Lunarc cluster:

Tolerance	PRES (s)	T_USED (s)	Ratio
1E-4	~148.8	321.4	~46.3%
1E-5	~153.6	331.0	~46.4%
1E-6	~155.8	327.2	~47.6%
1E-7	~168.4	341.2	~49.4%
1E-8	~173.1	340.4	~50.9%

(what a difference in the total computing times compared to the ThreadRipper!)

Surprisingly, the PRES/T_USED ratio is somewhat better here. Nevertheless, the differences between the individual tolerances are smaller than I expected. And no matter how often I repeat the calculations, there is no strictly linear relationship. This is just one measurement out of many similar repetitions.

I am trying to explore this further with more detailed time measurements within UScaRC. I will also measure more precisely the share of the matrix rebuild, which of course must be done for every pressure solution in the inseparable version. Further optimisations may be possible here, for example by rebuilding the matrix only where the density actually changes.

mcgratta commented 1 year ago

Do the CPU times change significantly between the separable and inseparable versions?

SusanKilian commented 1 year ago

I'll check that ... along with time measurements for the matrix rebuilds

SusanKilian commented 1 year ago

For the inseparable cases the costs for the repeated matrix rebuilds amount to about 7% of the total solver time. As expected, the most time-consuming operations are the matrix-vector products (~40%) and scalar products (~32%).
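To illustrate why matrix-vector products and scalar products dominate a CG-type solver, here is a minimal sketch of a textbook, unpreconditioned conjugate gradient iteration on a small 1D Poisson matrix. It is not the UScaRC implementation; the test matrix, the stopping rule and all names are assumptions chosen for illustration. Each iteration performs one matrix-vector product and two scalar products, and the loop counts them.

```python
def matvec(x):
    """Apply the 1D Poisson stencil (-1, 2, -1) with zero Dirichlet ends."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def dot(a, b):
    """Scalar product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cg(b, tol, max_iter=1000):
    """Unpreconditioned CG; stops when ||r||_2 <= tol * ||b||_2."""
    x = [0.0] * len(b)
    r = list(b)          # residual for the zero initial guess
    p = list(r)          # first search direction
    rr = dot(r, r)
    stop = tol * tol * rr
    counts = {"iterations": 0, "matvec": 0, "dot": 1}
    while rr > stop and counts["iterations"] < max_iter:
        ap = matvec(p)                      # one matrix-vector product
        alpha = rr / dot(p, ap)             # first scalar product
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)                  # second scalar product
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        counts["iterations"] += 1
        counts["matvec"] += 1
        counts["dot"] += 2
    return x, counts


b = [1.0] * 64
x, counts = cg(b, tol=1e-8)
print(counts)
```

In a parallel multi-mesh setting each scalar product typically also implies a global reduction across processes, which may be part of why these two operations account for most of the solver time, though that is a general observation and not a measurement from this case.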

Here is a comparison of the timings for the inseparable and the separable version for the different tolerances:

Inseparable UScaRC

Tol	PRES(s)	T_USED(s)	Ratio
1E-4	91.08	162.20	56.2%
1E-5	93.71	156.40	59.9%
1E-6	97.82	159.40	61.4%
1E-7	109.40	182.10	60.1%
1E-8	107.00	167.40	63.9%

For the tolerance of 1E-4 the errors look like this: [image]

Separable UScaRC

Tol	PRES(s)	T_USED(s)	Ratio
1E-4	494.40	629.40	78.6%
1E-5	539.60	700.10	77.1%
1E-6	571.30	705.80	80.9%
1E-7	618.80	753.90	82.1%
1E-8	706.10	836.30	84.4%

For the tolerance of 1E-4 the errors look like this: [image]
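To make the two tables above easier to compare, here is a minimal sketch that computes, per tolerance, how many times longer the separable variant spends in PRES than the inseparable one. The timings are copied from the tables; the names are illustrative only.

```python
# PRES times in seconds, copied from the two tables above.
tolerances = ["1E-4", "1E-5", "1E-6", "1E-7", "1E-8"]
pres_inseparable = [91.08, 93.71, 97.82, 109.40, 107.00]
pres_separable = [494.40, 539.60, 571.30, 618.80, 706.10]

# Factor by which the separable PRES time exceeds the inseparable one.
pres_factor = {
    tol: sep / insep
    for tol, insep, sep in zip(tolerances, pres_inseparable, pres_separable)
}

for tol, factor in pres_factor.items():
    print(f"{tol}: separable PRES time is {factor:.2f}x the inseparable one")
```

The same pattern applies to the T_USED columns if the total run times are of interest.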

SusanKilian commented 1 year ago

Here is an even more complicated test case. It now also contains a pipe (made of thin obsts) that snakes through the whole domain at right angles (it should be one continuous volume zone). There are also other obstructions spanning multiple meshes and conglomerations of thin and solid obstructions of different sizes, some of which collide at mesh boundaries. In various initially closed combinations, individual components disappear at some point, resulting in a later flow through the remaining obst structures (see, for example, the maroon-coloured rust-shaped combination as well as the indigo-coloured block in front of the outflow).

The 'Pressure Zone Information' indicates 24 zones with volumes in the range of 1.25E-4 to 1.15E-2.

ComplexZones_Geometry

I have run this case with:

separable as well as inseparable UScaRC (for both: termination criteria for the CG method of 1E-2, 1E-4, 1E-6, 1E-8)
FFT as well as ULMAT (for both: default, VELOCITY_TOLERANCE=1E-2, 1E-3, 1E-4 and PRESSURE_TOLERANCE=1E-2)

The FFT runs through but, as expected, requires more pressure iterations as the tolerances decrease (e.g. a mean of 730 for vel_tol=1E-4 and 4850 for pres_tol=1E-2).

ULMAT seems to have problems with this case (see next message for details).

The following picture shows a comparison of computation times, velocity errors and pressure errors for UScaRC-inseparable, UScaRC-separable and FFT.

ComplexZones_Comparison

By construction, both UScaRC variants show velocity errors in the range of rounding error. The pressure error for the inseparable UScaRC corresponds to the termination criterion for the CG method. As far as CPU times are concerned, it is notable that the inseparable UScaRC needs exactly 1 pressure iteration per pressure solution.

I'll attach the version for FFT-Default. fft_def.txt

SusanKilian commented 1 year ago

And here are my observations for ULMAT:

ULMAT-Default breaks. If I set vel_tol to 1E-2, then it needs a lot of pressure iterations (mostly hitting the maximum of 5000).

Setting MINIMUM_ZONE_VOLUME greater than 0.002 gives an 'improperly set-up' error. For the values 0.002 and 0.001 it starts, but finally breaks again (0.002 earlier than 0.001).

The termination at T=1.0 made me suspect that it might have something to do with the device activation via 'clock_3'. And indeed, if the two lateral blocks of the 'Maroon_Rust' obstruction do not disappear, then it runs beyond T=1.0, but later breaks at T=1.5 (see lines 280 to 283 in the attached geometry file, where I already omitted this activation).

This again made me suspect a problem with device activation via 'clock_5'. If I set this time from 1.5 to something greater than T_END=2.0, then it finally runs through if NO settings for MINIMUM_ZONE_VOLUME are used.

ulmat_test.txt

mcgratta commented 1 year ago

Nice test case. We'll take a look at the ULMAT case.

mcgratta commented 1 year ago

I added the test case to the verification suite. It is called random_obstructions_fft.fds in Pressure_Solver. At the moment, I am just testing that the FFT solver can reach its desired velocity tolerance. I have not had time to explore more, but at least we have this good test case.

firemodels / fds

Test case for thin obstructions and enclosed volumes #11593