geoschem / gcpy

Python toolkit for GEOS-Chem. Contains basic plotting scripts, plus the suite of GEOS-Chem benchmarking utilities.
https://gcpy.readthedocs.io

Feature request: Add a script to scrape timing info from benchmark simulation log files #312

Closed (yantosca closed this issue 3 months ago)

yantosca commented 5 months ago

Name and Institution (Required)

Name: Bob Yantosca
Institution: Harvard + GCST

New GCPy feature or discussion

Currently, we have to copy GEOS-Chem Classic and GCHP timing information from log files into a spreadsheet. It would be great if we could have a script to scrape this information and put it into a table.

The script could accept either a single log file or a list of log files and build a new table from them. It could also take an existing table file as an optional input argument and append the new timings to it.
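To make the proposed interface concrete, here is a minimal sketch of the command-line handling and the append step. It is only an illustration: the names `build_parser`, `append_rows`, and the `--table` option are hypothetical, and CSV is assumed as the table format (the request does not specify one).

```python
import argparse
import csv
import os


def build_parser():
    """Hypothetical CLI: one or more log files, plus an optional table to append to."""
    parser = argparse.ArgumentParser(
        description="Scrape timing info from benchmark log files (sketch)."
    )
    parser.add_argument("logs", nargs="+", help="one or more log files to scrape")
    parser.add_argument(
        "--table",
        default=None,
        help="existing table (CSV assumed) to append to; omit to start a new table",
    )
    return parser


def append_rows(table_path, rows, fieldnames):
    """Append rows to a CSV table, writing the header only if the file is new or empty."""
    is_new = not os.path.exists(table_path) or os.path.getsize(table_path) == 0
    with open(table_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)
```

With this layout, each log contributes one row, so running the script again with the same `--table` path adds rows without repeating the header.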

GCHP log file info looks like this:

[Screenshot of GCHP log file timing output; image not reproduced here]

GEOS-Chem Classic timers info looks like this:

===============================================================================
G E O S - C H E M   T I M E R S

  Timer name                       DD-hh:mm:ss.SSS     Total Seconds
-------------------------------------------------------------------------------
  GEOS-Chem                     :  00-06:17:48.512         22668.512
  HEMCO                         :  00-00:34:44.061          2084.061
  All chemistry                 :  00-02:38:10.404          9490.405
  => Gas-phase chem             :  00-01:32:16.579          5536.579
  => Photolysis                 :  00-00:12:25.899           745.899
  => Aerosol chem               :  00-00:49:34.497          2974.498
  => Linearized chem            :  00-00:00:29.576            29.577
  Transport                     :  00-00:27:46.755          1666.755
  Convection                    :  00-00:41:14.616          2474.616
  Boundary layer mixing         :  00-00:50:08.453          3008.453
  Dry deposition                :  00-00:00:51.259            51.259
  Wet deposition                :  00-00:16:32.521           992.521
  Diagnostics                   :  00-00:38:41.675          2321.675
  Unit conversions              :  00-00:31:20.192          1880.193
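A rough sketch of how the timers block above could be scraped with a regular expression. It assumes the layout shown here (timer name, a colon, a `DD-hh:mm:ss.SSS` clock value, then total seconds); `parse_classic_timers` is a hypothetical name, not an existing GCPy function.

```python
import re

# Matches lines such as
#   "  => Gas-phase chem             :  00-01:32:16.579          5536.579"
TIMER_LINE = re.compile(
    r"^\s*(?:=>\s*)?(?P<name>.+?)\s*:\s+"
    r"(?P<clock>\d+-\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<seconds>\d+(?:\.\d+)?)\s*$"
)


def parse_classic_timers(log_text):
    """Return {timer name: total seconds} from GEOS-Chem Classic log text."""
    timers = {}
    in_block = False
    for line in log_text.splitlines():
        # The banner is letter-spaced, so compare with all whitespace removed.
        if "".join(line.split()) == "GEOS-CHEMTIMERS":
            in_block = True
            continue
        if not in_block:
            continue
        match = TIMER_LINE.match(line)
        if match:
            # The "=>" prefix of sub-timers is dropped from the name.
            timers[match.group("name")] = float(match.group("seconds"))
    return timers
```

The column header and the dashed separator lines do not match the pattern, so they are skipped without special handling.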

Looking for volunteers!

yantosca commented 5 months ago

For GEOS-Chem Classic, timing information is also saved to a JSON file, so it would be easy to parse that with Python.
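As a sketch of the JSON route: the layout of that file is not shown in this thread, so the structure assumed here (each timer name mapping to an object with a numeric `"seconds"` field, possibly nested under a top-level key) is a guess, and `timers_from_json` is a hypothetical helper.

```python
import json


def timers_from_json(text, seconds_key="seconds"):
    """Collect {timer name: seconds} from JSON text.

    ASSUMPTION: each timer is an object holding a numeric `seconds_key` entry.
    Any dict that does not look like a timer is searched recursively.
    """
    timers = {}

    def walk(node):
        if not isinstance(node, dict):
            return
        for key, value in node.items():
            if isinstance(value, dict) and isinstance(
                value.get(seconds_key), (int, float)
            ):
                timers[key] = float(value[seconds_key])
            else:
                walk(value)

    walk(json.loads(text))
    return timers
```

If the real file uses a different key name, only the `seconds_key` argument would need to change.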

lizziel commented 4 months ago

The GCHP log file also breaks the timing for each gridded component down further. The section that includes GEOS-Chem looks like this (the example is from a transport tracers simulation, which is why the chemistry time is so low):

[Screenshot of the GCHP log timing section that includes GEOS-Chem; image not reproduced here]

Min, mean, and max are included to show the range of times across CPUs. Inclusive means the time includes any subroutines called, while exclusive means it does not (i.e., the timer is stopped for the sub-processes).
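A small illustration of these terms, using made-up numbers rather than values from any real log. It assumes the child timers are nested directly inside the parent, so that exclusive time is the inclusive time minus the children's inclusive times; the function names are hypothetical.

```python
from statistics import mean


def summarize_across_cpus(per_cpu_seconds):
    """Min, mean, and max of one timer's values across CPUs."""
    return {
        "min": min(per_cpu_seconds),
        "mean": mean(per_cpu_seconds),
        "max": max(per_cpu_seconds),
    }


def exclusive_seconds(inclusive, child_inclusive):
    """Time spent in a routine itself, with the time of directly called children removed."""
    return inclusive - sum(child_inclusive)


# Illustrative values only: a component that takes 100 s in total
# and calls two sub-processes that take 30 s and 50 s.
own_time = exclusive_seconds(100.0, [30.0, 50.0])
spread = summarize_across_cpus([10.0, 12.0, 14.0])
```

A large gap between min and max for the same timer suggests that the work is unevenly distributed across CPUs.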

yantosca commented 4 months ago

See PR #319 for sample output. I will also try to scrape the timings of the GCHP component.

yantosca commented 3 months ago

This has now been completed, so we can close this issue.