CH-Earth / summa

Structure for Unifying Multiple Modeling Alternatives:
http://www.ral.ucar.edu/projects/summa
GNU General Public License v3.0

compute cpu time for each HRU and time step #465

Closed martynpclark closed 2 years ago

martynpclark commented 3 years ago

This computes the CPU time spent on each HRU and time step. While the elapsed wall clock time is desired by users, the precision may be as low as 0.01 seconds. In many cases, it will be preferable to look at metrics like the number of flux calls. Nevertheless, the elapsed wall clock time is provided to meet user requests.
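
As a rough illustration of the precision point, the sketch below queries the tick rate of the wall clock and times a placeholder workload with both `system_clock` and `cpu_time`. It is not taken from the SUMMA code: the program name, the workload loop, and all variable names are hypothetical, and the reported resolution depends on compiler and platform.

```fortran
program timer_resolution
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64) :: count0, count1, count_rate
  real(real64)   :: cpu0, cpu1, x
  integer        :: i

  ! Ask the runtime how many clock ticks it reports per second.
  ! One tick (1/count_rate seconds) is the best wall-clock resolution available.
  call system_clock(count_rate=count_rate)
  if (count_rate <= 0) stop 'no system clock available'
  write(*,*) 'system_clock resolution [s]: ', 1.0_real64 / real(count_rate, real64)

  call system_clock(count0)
  call cpu_time(cpu0)

  ! Placeholder workload standing in for one HRU and time step (hypothetical).
  x = 0.0_real64
  do i = 1, 1000000
    x = x + sqrt(real(i, real64))
  end do

  call cpu_time(cpu1)
  call system_clock(count1)

  ! Printing the checksum keeps the compiler from discarding the workload.
  write(*,*) 'workload checksum:           ', x
  write(*,*) 'elapsed wall clock [s]:      ', real(count1 - count0, real64) / real(count_rate, real64)
  write(*,*) 'elapsed CPU time [s]:        ', cpu1 - cpu0
end program timer_resolution
```

If the measured interval is close to the reported resolution, the timing for a single HRU and time step is mostly noise, which is why counts such as the number of flux calls can be the more reliable metric.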

arbennett commented 3 years ago

@martynpclark this looks good and I think is a nice feature to add, but I have one request - could you add a quick entry to the docs/whats-new.md list under a new header Develop?

martynpclark commented 2 years ago

Updated whats-new file

andywood commented 2 years ago

Can this be included as a user option, i.e. lumped in with the 'verbose' option? For a 50 yr 3-hourly run of a CONUS model at intermediate scale (~100k units) this leads to a new set of 13.5 billion time calculations. I don't know the cost of this, but I'd guess that for most applications this level of info is not of interest and could be shut off. That said, it might lead to 13.5 billion if statements, which also have a cost. If the cost is trivial, i.e. <0.5% of the overall run time, I'd say go ahead. I have no idea.
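
A minimal sketch of what such a user option could look like, assuming a run-time logical flag guards the timing calls. The flag name `doTiming`, the command-line switch, the array sizes, and the placeholder workload are all hypothetical and are not part of SUMMA.

```fortran
program optional_timing
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: nHRU = 4, nSteps = 8   ! tiny illustrative sizes
  logical            :: doTiming               ! hypothetical user option
  real(real64)       :: t0, t1, hruTime(nHRU), state(nHRU)
  integer            :: iHRU, iStep, j
  character(len=16)  :: arg

  ! Timing is enabled only when the first command-line argument is "timing".
  call get_command_argument(1, arg)
  doTiming = (trim(arg) == 'timing')

  hruTime = 0.0_real64
  state   = 0.0_real64
  do iStep = 1, nSteps
    do iHRU = 1, nHRU
      if (doTiming) call cpu_time(t0)
      ! Placeholder for the model physics of one HRU and time step.
      do j = 1, 100000
        state(iHRU) = state(iHRU) + sqrt(real(j, real64))
      end do
      if (doTiming) then
        call cpu_time(t1)
        hruTime(iHRU) = hruTime(iHRU) + (t1 - t0)   ! accumulate CPU time per HRU
      end if
    end do
  end do

  write(*,*) 'state checksum:       ', sum(state)
  if (doTiming) write(*,*) 'CPU time per HRU [s]: ', hruTime
end program optional_timing
```

Run without arguments, the program skips every `cpu_time` call and pays only for the two `if` tests per HRU and time step.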

wknoben commented 2 years ago

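According to this Stack Overflow question (https://stackoverflow.com/questions/22122494/is-there-a-cpu-time-performance-penalty-in-fortran), execution time for cpu_time() is not negligible for very large numbers of computations.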

Some quick (online Fortran compiler) testing shows that:

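program cputimer
  ! Estimate the average cost of a single cpu_time() call
  implicit none
  integer :: i
  integer, parameter :: n = 1000000   ! number of timed calls
  real(8) :: t1, t2, t
  call cpu_time(t1)
  do i = 1, n
    call cpu_time(t)
  end do
  call cpu_time(t2)
  write(*,*) (t2-t1)/n   ! average seconds per loop iteration
end program cputimer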

has an average execution time per iteration of 1.099E-6 s. If this translates directly from this online tool to regular compiled Fortran code, we're looking at an extra 50 * 365.25 * 8 * 100000 * 1.099E-6 ~= 4.5 hr of total time added to these CONUS runs. On a per-HRU basis this is not excessive though, at approx. 0.16 s per HRU.

I also looked at the time cost of an IF-statement like so:

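program iftimer
  ! Estimate the average cost of an IF test that skips the cpu_time() call
  implicit none
  integer :: i
  integer, parameter :: n = 1000000   ! number of loop iterations
  ! calcTime is a compile-time constant, so an optimizing compiler
  ! may remove the branch (and the call inside it) entirely
  logical, parameter :: calcTime = .FALSE.
  real(8) :: t1, t2, t
  call cpu_time(t1)
  do i = 1, n
    IF (calcTime) THEN
      call cpu_time(t)
    END IF
  end do
  call cpu_time(t2)
  write(*,*) (t2-t1)/n   ! average seconds per loop iteration
end program iftimer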

This gives an average execution time per loop iteration of 2.032E-9 s, meaning approximately 30 s added to the CONUS run (approx. 0.0003 s per HRU).
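
The two estimates can be reproduced with the short program below. The run dimensions and per-iteration costs are the figures quoted in this thread; the program and variable names are only illustrative.

```fortran
program overhead_estimate
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  ! Run dimensions quoted in the thread: 50 years, 3-hourly steps, 100k HRUs.
  real(real64), parameter :: nYears      = 50.0_real64
  real(real64), parameter :: daysPerYear = 365.25_real64
  real(real64), parameter :: stepsPerDay = 8.0_real64
  real(real64), parameter :: nHRU        = 100000.0_real64
  ! Measured cost per loop iteration from the two test programs [s].
  real(real64), parameter :: tCall = 1.099e-6_real64   ! cpu_time() call
  real(real64), parameter :: tIf   = 2.032e-9_real64   ! skipped IF branch
  real(real64) :: stepsPerHRU, nCalls

  stepsPerHRU = nYears * daysPerYear * stepsPerDay   ! time steps per HRU
  nCalls      = stepsPerHRU * nHRU                   ! HRU-time-step pairs

  write(*,'(a,es12.4)') 'HRU-time-step pairs:        ', nCalls
  write(*,'(a,f12.2)')  'cpu_time total [hr]:        ', nCalls * tCall / 3600.0_real64
  write(*,'(a,f12.4)')  'cpu_time per HRU [s]:       ', stepsPerHRU * tCall
  write(*,'(a,f12.2)')  'IF statement total [s]:     ', nCalls * tIf
  write(*,'(a,es12.4)') 'IF statement per HRU [s]:   ', stepsPerHRU * tIf
end program overhead_estimate
```

With these inputs the number of HRU-time-step pairs comes out at about 14.6 billion, and the totals are consistent with the roughly 4.5 hr and 30 s quoted above.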

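On a per-HRU basis either approach is pretty much negligible, but for very large domains these calls may start to add up. As far as I can see there are two conflicting principles:

  • On the one hand, only executing computations when required makes sense;
  • On the other hand, the code may be kept simpler if we don't excessively use if statements and flags.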

I'm not sure what to suggest but figured I'd share the analysis at any rate to put some (estimated) numbers to this problem.

andywood commented 2 years ago

Great analysis, Wouter -- thanks for digging into this. Given that we mostly run our models in big split-domain (or some kind of parallelization) modes, I don't think it's a big enough hit to object to adding it, and without the if() statement. For instance, for a 20-node run of a CONUS domain it might end up adding 20-30s, maybe 2 min at much higher resolution. Happy with what others think.
