NCAR / DART

Data Assimilation Research Testbed
https://dart.ucar.edu/
Apache License 2.0

Update CLM-DART shell scripts for Derecho #610

Closed braczka closed 6 months ago

braczka commented 6 months ago

Use case

The CLM-DART shell scripts need to be updated with paths and options compatible with Derecho, which supplants the decommissioned Cheyenne.

Is your feature request related to a problem?

Not really, although users have reported that CLM-DART on Derecho is sluggish when reading the CAM6 reanalysis. We need to determine whether this is related to a particular Derecho-compatible CESM tag, to campaign storage, or to some other as-yet-unidentified issue.

Describe your preferred solution

Implement the CLM-DART tutorial test case on Derecho using CTSM repository tag release-cesm2.2.03, the closest to the previously used tag (release-cesm2.2.0). Preliminary work suggests the two tags are close enough that the existing SourceMods work with both. This would also require an update to the CLM documentation.

braczka commented 6 months ago

I have a working version of the CLM-DART shell scripting for Derecho. I checked out a new Derecho-compatible CTSM tag (release-cesm2.2.03) that is very close to the CESM tag I was using on Cheyenne (release-cesm2.2.0). The tags are close enough that I didn't have to modify the DART SourceMods at all. A preliminary comparison with the CLM-DART tutorial suggests nearly identical behavior between the Cheyenne and Derecho versions.

I encountered sluggish behavior at the start of the CLM runs, during the reading of the CAM6 reanalysis. The slowdown seems to have coincided with the migration of the CAM6 reanalysis to the RDA campaign directory, and it occurs whether running on Derecho or Cheyenne. With the "datm.streams.txt.CPLHISTForcingcomplete" stream files it took about 25-30 minutes of wall time just to initialize the CAM files (all years 2011-2019). With the "datm.streams.txt.CPLHISTForcingsingle_year" stream file (which initializes only one year, 2011) things sped up considerably: the full tutorial ran in 14 minutes on Cheyenne and 10 minutes on Derecho. The 'complete' and 'single_year' datm streamlist files were created to address runtime issues when initializing multi-year versus single-year runs. In recent memory, however, the runtime difference between the complete and single_year streamlist files was not this distinct. The slowdown occurs only during the first time step; subsequent forecast steps are unaffected.

Will issue a PR soon -- currently running some tests with the CAM reanalysis files on a scratch directory to better diagnose the sluggish runtime behavior.
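A rough way to compare read times between two storage locations, as a minimal sketch only: `time_read` is a hypothetical helper, not part of the CLM-DART scripts, and it measures a plain sequential read rather than the access pattern CLM actually uses. File system caching can skew repeated measurements.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DART): report the wall-clock seconds
# needed to read every regular file under a directory once.
time_read() {
    local dir="$1"
    local start end
    start=$(date +%s)
    # Read each file and discard the bytes; only the elapsed time matters.
    find "$dir" -type f -exec cat {} + > /dev/null
    end=$(date +%s)
    echo $(( end - start ))
}

# Example usage (placeholder paths, substitute your own copies of the files):
#   time_read /path/to/campaign/copy
#   time_read /path/to/scratch/copy
```

The function prints whole seconds, which is coarse but adequate when the difference of interest is on the order of minutes.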

braczka commented 6 months ago

Can confirm that reading the CAM reanalysis files from the /glade/campaign/collections/ directory adds 20 minutes of wall time during the initialization of a CLM-DART simulation, compared with reading the same CAM files from /glade/derecho/scratch. The delay occurs during the file reading step of the CLM simulation (subroutine shr_dmodel_readstrm). Will add an option to the CLM shell scripting to use either the single_year or the complete datm streamlist files to help reduce wall time.
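A minimal sketch of what such an option could look like, assuming the two stream file names exactly as quoted earlier in this thread. The function name `stream_template_name` and the mode argument are hypothetical; the option as actually implemented in the CLM-DART scripts may be named and structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map a user-chosen mode to a datm stream file name.
# The file names follow the spelling used in this thread; verify them
# against the files shipped with DART before relying on them.
stream_template_name() {
    local mode="$1"
    case "$mode" in
        single_year|complete)
            # single_year initializes one year; complete initializes all years.
            echo "datm.streams.txt.CPLHISTForcing${mode}"
            ;;
        *)
            echo "ERROR: unknown stream mode '${mode}' (expected single_year or complete)" >&2
            return 1
            ;;
    esac
}

# Example usage:
#   stream_file=$(stream_template_name single_year) || exit 1
```

Rejecting unknown values up front avoids discovering a misspelled option only after the long initialization step has started.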

hkershaw-brown commented 6 months ago

Fixed in #611.