Exawind / amr-wind

AMReX-based structured wind solver
https://exawind.github.io/amr-wind

Excessive netCDF sampling warning #1104

Closed rybchuk closed 1 week ago

rybchuk commented 2 weeks ago

Bug description

I am running a simulation on Kestrel with 345 sampling instances. During initialization, I receive the warning `WARNING: Sampling: netcdf output will negatively impact performance` once per instance, which sounds reasonable to me.

However, the same warning is also printed at every timestep for each sampler, which means I need to scroll through 345 warnings between consecutive timesteps. My most recent simulations on Kestrel are running substantially slower than they were a few months ago. I can't say with confidence that these print statements are the cause, but I suspect they could be having an impact.

Steps to reproduce

I ran a simulation with a sampling instance that uses netCDF.

Steps to reproduce the behavior:

  1. Compiler used
    • [ ] oneapi (Intel)
  2. Operating system
    • [ ] Linux
  3. Hardware:
    • [ ] CPU
  4. Machine details:
    • Kestrel

Expected behavior

The warning should occur during initialization but not during every timestep.

AMR-Wind information

==============================================================================
                AMR-Wind (https://github.com/exawind/amr-wind)

  AMR-Wind version :: v2.1.0-23-g4d84bbdc-DIRTY
  AMR-Wind Git SHA :: 4d84bbdcc418c53482fdc4ecf23b4cd01a9f700c-DIRTY
  AMReX version    :: 24.05-20-g5d02c6480a0d

  Exec. time       :: Mon Jun 17 06:48:43 2024
  Build time       :: Jun 10 2024 10:19:01
  C++ compiler     :: IntelLLVM 2023.2.0

  MPI              :: ON    (Num. ranks = 3072)
  GPU              :: OFF
  OpenMP           :: OFF

  Enabled third-party libraries:
    NetCDF    4.9.2

           This software is released under the BSD 3-clause license.
 See https://github.com/Exawind/amr-wind/blob/development/LICENSE for details.
------------------------------------------------------------------------------

marchdf commented 2 weeks ago

I doubt print statements of this sort make a difference to performance. You can comment out the warning and check. If you find this too verbose and others think so too, feel free to open a PR that moves the warning to initialization. I am happy to help with this if you would like.

rybchuk commented 1 week ago

Yeah, fortunately/unfortunately, I removed the print statements and I'm still seeing similar performance. Once in a while, the WallClockTime of a single timestep goes up by a factor of 10 for some reason. I suspect this is a hardware issue.

I'm focused on a workaround for this 10x behavior, and the print statements aren't diminishing the performance, so I'll close out this issue. If others also request that the warning be moved to the initialization part of AMR-Wind, we can re-open this issue.
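For tracking down intermittent slow steps like the ones described above, one option is to flag every timestep whose wall-clock time is far above the median. The sketch below assumes the per-step times have already been extracted from the log into a vector. The function name `find_slow_steps` and the `factor` threshold are hypothetical and not part of AMR-Wind.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the indices of steps whose wall-clock time exceeds
// `factor` times the median step time. Hypothetical helper for log analysis.
std::vector<std::size_t> find_slow_steps(
    const std::vector<double>& step_times, double factor)
{
    std::vector<std::size_t> slow;
    if (step_times.empty()) {
        return slow;
    }

    // Work on a copy so the caller's ordering (step index) is preserved.
    std::vector<double> sorted(step_times);
    auto mid = sorted.begin() + static_cast<std::ptrdiff_t>(sorted.size() / 2);
    std::nth_element(sorted.begin(), mid, sorted.end());
    const double median = *mid; // upper median when the size is even

    for (std::size_t i = 0; i < step_times.size(); ++i) {
        if (step_times[i] > factor * median) {
            slow.push_back(i);
        }
    }
    return slow;
}
```

Comparing against the median rather than the mean keeps a few very slow steps from inflating the baseline they are measured against.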