Closed: nR3D closed this 6 months ago
Well done. I will go through your modifications.
@junwei-jiang Could you also share some comments?
Is the main slowdown from interval_computing_time_step = 768.747579194?
I'm curious about the precision difference. I mean, how much precision is lost when changing from double to float, and whether this is acceptable. I think the computational cost of double may be too high.
@Xiangyu-Hu Is the main slowdown from interval_computing_time_step = 768.747579194?
Since timestep computation and configuration updating are executed concurrently, that slowdown is probably determined by both. In general, floating-point numbers are used by all steps (aside from file writing, where single vs. double precision is not relevant), so I would expect a similar slowdown for all methods. I am now rerunning the same computation with timestep and configuration computations separated, so we can better distinguish the slowdown across intervals. I will post the results as soon as they are ready.
@junwei-jiang I'm curious about the precision difference. I mean, how much precision is lost when changing from double to float, and whether this is acceptable. I think the computational cost of double may be too high.
Regression tests still pass with floats, so double precision is not strictly needed. But it is worth keeping the option to enable it whenever higher precision is required.
By default, device and host precision are set to be the same, so SPHINXSYS_USE_FLOAT
may be toggled to use single precision for both.
I would not worry about using float, since the leading term determining accuracy is the numerical discretization algorithm rather than the choice of double or single precision.
Double:
Total wall time for computation: 2269.544049414 seconds.
interval_computing_time_step = 46.734129965
interval_computing_fluid_pressure_relaxation = 653.920405645
interval_updating_configuration = 1458.222392889
interval_writing_files = 108.284276538

Float:
Total wall time for computation: 1376.905903631 seconds.
interval_computing_time_step = 40.042162188
interval_computing_fluid_pressure_relaxation = 580.866289297
interval_updating_configuration = 610.196198424
interval_writing_files = 144.182650995
Benchmarks still use 300'000 particles, but this time the server I am benchmarking on is partially busy, so runtimes are slower than before; I therefore ran both cases again for a proper comparison. Overall, configuration updating is where execution slows down the most. I could run some profiling to find hotspots within the configuration kernels. In particular, floating-point precision matters inside the smoothing-kernel calculations for neighbors, where various SYCL math and geometric functions are called; I suspect those are the reason for the slowdown.
Changes
- DeviceReal can now be set to double

Benchmarks
With 300'000 particles, using double precision makes the simulation 86.4% slower than with single precision.