Closed cseppan closed 5 months ago
Discussion of the I/O API M3UTILIO module and how to convert existing code: https://cmascenter.org/ioapi/documentation/all_versions/html/M3UTILIO.html
SMOKE v5.0 compiled successfully with Carlie's modified tmpbeis4.f. I have not yet checked how fast SMOKE runs with the update.
Summary of Carlie's updates to tmpbeis4.f:
1. Separate ALLOCATE statements combined into a single statement. The original block
      IF( PX_VERSION ) THEN
          ALLOCATE( SOILM( NCOLS, NROWS ), STAT=IOS )
          CALL CHECKMEM( IOS, 'SOILM', PROGNAME )
          ALLOCATE( SOILT( NCOLS, NROWS ), STAT=IOS )
          CALL CHECKMEM( IOS, 'SOILT', PROGNAME )
          ALLOCATE( SOILT2( NCOLS, NROWS ), STAT=IOS )
          CALL CHECKMEM( IOS, 'SOILT2', PROGNAME )
          ALLOCATE( ISLTYP( NCOLS, NROWS ), STAT=IOS )
          CALL CHECKMEM( IOS, 'ISLTYP', PROGNAME )
      END IF
was modified to (lines 524 - 528 in updated tmpbeis4.f):
      IF (PX_VERSION) THEN   ! line 480
          ....
          ALLOCATE( SOILM( NCOLS, NROWS ),
     &              SOILT( NCOLS, NROWS ),
     &              SOILT2( NCOLS, NROWS ),
     &              ISLTYP( NCOLS, NROWS ), STAT=IOS )
          CALL CHECKMEM( IOS, 'SOILM...ISLTYP', PROGNAME )
2. Loop order of the grid loop nests. The original loop nest
      DO I = 1, NCOLS
          DO J = 1, NROWS
C............................. If switch equal to 0 use winter normalized emissions
              IF( SWITCH_FILE ) THEN
                  IF( SWITCH( I,J ) == 0 ) THEN
                      SEMIS( I, J, 1:NSEF ) =
     &                    AVGEMIS( I, J, 1:NSEF , NWINTER )
                  .........
was modified to (~ line 1048 in updated tmpbeis4.f; note that the I and J loops were swapped so that I, the first array index, is now the innermost loop, because Fortran stores arrays in column-major order):
      DO J = 1, NROWS
          DO I = 1, NCOLS
              IF( SWITCH( I,J ) == 0 ) THEN
                  SEMIS( I, J, 1:NSEF ) =
     &                AVGEMIS( I, J, 1:NSEF , NWINTER )
              .........
3. Check for failure when reading environment variables (e.g., with ENVINT). For example, the following check was added after reading the environment variable 'OUTZONE' (line 247 in original tmpbeis4.f):
      TZONE = ENVINT( 'OUTZONE', 'Output time zone', 0, IOS )
      IF ( IOS .GT. 0 ) THEN
          CALL M3EXIT( PROGNAME,0,0, 'Bad env vble "OUTZONE"', 2 )
      END IF
4. Introduction of a `USE M3UTILIO` statement in place of INCLUDE statements for the I/O API include files (e.g., PARMS3.EXT, FDESC3.EXT, IODECL3.EXT), which simplifies downstream variable declarations and cross-module dependencies.
5. Carlie also added a code block for unit conversion from mole/hr to mole/s (~ lines 1001 - 1010 in updated tmpbeis4.f). This could be a mistake, since this unit conversion is already handled in a later section of tmpbeis4. Furthermore, it is more efficient to just write `MLFAC = MLFAC * HR2SEC` as a whole-array operation than to update `MLFAC` inside a double loop.
C............ Convert to moles/second if necessary
      IF ( UNITTYPE .EQ. 2 ) THEN
          DO L = 1, MSPCS
              DO K = 1, NSEF
                  MLFAC( L, K ) = HR2SEC * MLFAC( L, K )
              END DO
          END DO
      END IF
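The whole-array form suggested in item 5 can be sketched as below. This is a minimal standalone illustration in free-form Fortran, not code from tmpbeis4.f: the module and routine names are hypothetical, and the value of `HR2SEC` (1/3600) is an assumption about how that constant is defined.

```fortran
module mlfac_scaling_demo
    implicit none
    ! Assumed value: converts a per-hour rate to a per-second rate.
    real, parameter :: HR2SEC = 1.0 / 3600.0
contains

    ! Scale every element with an explicit double loop,
    ! using the same loop order as the block in item 5.
    subroutine scale_with_loops( mlfac )
        real, intent(inout) :: mlfac( :, : )
        integer :: l, k
        do l = 1, size( mlfac, 1 )
            do k = 1, size( mlfac, 2 )
                mlfac( l, k ) = HR2SEC * mlfac( l, k )
            end do
        end do
    end subroutine scale_with_loops

    ! Scale every element with a single whole-array assignment.
    subroutine scale_whole_array( mlfac )
        real, intent(inout) :: mlfac( :, : )
        mlfac = mlfac * HR2SEC
    end subroutine scale_whole_array

end module mlfac_scaling_demo
```

Both routines give the same result for the same input. Calling either one twice on the same array scales it by `HR2SEC` squared, which is the kind of double conversion to watch for when the conversion is also done elsewhere.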
Testing of tmpbeis4 with and without the update, surprisingly, did not show an improvement in execution time. Note that the test was conducted on a SMOKE training package over the LISTOS domain (25 rows x 25 columns). An observable improvement in execution time could be expected for a larger domain.
Using the m3diff tool to compare the emis_mole* output files initially showed significantly lower emissions in the output files from the updated tmpbeis4. This was later found to be caused by the double unit conversion in the updated tmpbeis4 (item 5 in comment above). After this double unit conversion was removed, the differences between the outputs are < 0.1%, which is within the acceptable range.
Huy, can you consider running this on the full 12US2 or 12US1 domain instead of the 25x25?
(In reply to Huy Tran's comment of Thu, Jan 11, 2024, 9:20 AM: https://github.com/CEMPD/SMOKE/issues/84#issuecomment-1887278159)
I'm working on setting up a test case based on emission platform 2020ha2 for the 12US1 domain. I'm currently having an issue with the variable SOILT2 missing from the input met file METCRO2D.
Performance Test with 2020ha2_cb6_20k emission model platform
Results:
Scenario | Try | Total run time (min:s) | Individual-day run time
--- | --- | --- | ---
FULL | 1st | 6:18.69 | Jul-01: 7 s; Jul-15: 5 s; Jul-31: 5 s
FULL | 2nd | 6:13.15 | Jul-01: 5 s; Jul-15: 5 s; Jul-31: 5 s
SIMP | 1st | 6:15.54 | Jul-01: 5 s; Jul-15: 6 s; Jul-31: 5 s
SIMP | 2nd | 6:18.04 | Jul-01: 5 s; Jul-15: 6 s; Jul-31: 5 s
ORIG | 1st | 8:01.58 | Jul-01: 10 s; Jul-15: 9 s; Jul-31: 7 s
ORIG | 2nd | 7:30.14 | Jul-01: 7 s; Jul-15: 8 s; Jul-31: 8 s
There are no significant differences in run time between FULL and SIMP, meaning the run-time benefit came mainly from the loop rearrangement. Relative to ORIG, the loop rearrangement cuts the individual-day run time by about 35% (from roughly 8 s to roughly 5 s per day on average) and the total run time by about 20%.
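The effect of loop order on memory access can be illustrated with a small standalone sketch. Fortran stores arrays in column-major order, so elements that differ only in the first index are adjacent in memory, and a loop nest whose innermost loop runs over the first index walks memory contiguously. The module below is hypothetical (the names and the fill formula are illustrative, not taken from tmpbeis4.f); it fills the same 2-D array with both loop orders so that the two can be timed, for example with `system_clock`, on a grid of realistic size.

```fortran
module loop_order_demo
    implicit none
contains

    ! Inner loop over the second index: for a contiguous array,
    ! consecutive iterations are size(grid,1) elements apart (strided).
    subroutine fill_second_index_inner( grid )
        real, intent(out) :: grid( :, : )
        integer :: i, j
        do i = 1, size( grid, 1 )
            do j = 1, size( grid, 2 )
                grid( i, j ) = real( i ) + 0.5 * real( j )
            end do
        end do
    end subroutine fill_second_index_inner

    ! Inner loop over the first index: consecutive iterations
    ! touch adjacent elements (contiguous access).
    subroutine fill_first_index_inner( grid )
        real, intent(out) :: grid( :, : )
        integer :: i, j
        do j = 1, size( grid, 2 )
            do i = 1, size( grid, 1 )
                grid( i, j ) = real( i ) + 0.5 * real( j )
            end do
        end do
    end subroutine fill_first_index_inner

end module loop_order_demo
```

Both routines produce identical arrays; only the traversal order differs. Whether a timing difference is measurable depends on the grid size and on compiler optimization, since a compiler may interchange simple loops like these on its own.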
Additional information: modern compilers can transform code for more efficient memory access when an optimization flag such as -O3 is enabled. The -O3 flag was enabled for the SMOKE compilation.
Based on October 29, 2023 email from Carlie Coats
Draft code (has not been compiled or tested). Switch to using M3UTILIO module in Tmpbeis4. Change grid-and-species loop nests. Check for failure after environment variable calls (e.g. ENVINT).
"tmpbeis4.0.f" is the unchanged reference version
"tmpbeis4.1.f" is the minimal changes-for-M3UTILIO version
"tmpbeis4.2.f" has loop-nest orders changed for efficiency
"tmpbeis4.f" is the further revision sent
tmpbeis.zip
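For readers without the I/O API at hand, the environment-variable check listed among the changes (test the status after each call and stop on a bad value) can be sketched with the standard Fortran `get_environment_variable` intrinsic. This is a stand-in, not `ENVINT` itself: the module name, function name, and status codes below are assumptions made for illustration.

```fortran
module env_check_demo
    implicit none
contains

    ! Read an integer from environment variable NAME.
    ! Status codes used in this sketch (assumed, not the I/O API's):
    !    0 = value read successfully
    !   -1 = variable unset, empty, or not retrievable (DEFVAL returned)
    !    1 = value present but not a valid integer (DEFVAL returned)
    function env_int_or_default( name, defval, status ) result( val )
        character(len=*), intent(in)  :: name
        integer,          intent(in)  :: defval
        integer,          intent(out) :: status
        integer            :: val
        character(len=256) :: buf
        integer            :: length, getstat, ios

        val = defval
        call get_environment_variable( name, buf, length, getstat )
        if ( getstat /= 0 .or. length == 0 ) then
            status = -1
            return
        end if

        ! List-directed read from the retrieved string.
        read( buf( 1:length ), *, iostat = ios ) val
        if ( ios /= 0 ) then
            val    = defval
            status = 1
        else
            status = 0
        end if
    end function env_int_or_default

end module env_check_demo
```

A caller would test `status > 0` after each call and stop the run with a clear message, in the same way the updated tmpbeis4.f calls `M3EXIT` when `IOS .GT. 0`.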