CEMPD / SMOKE

Create emissions inputs for multiple air quality modeling systems with unmatched speed and flexibility
https://www.cmascenter.org/smoke/

SMOKE Array Out of Bound #85

Closed hnqtran closed 8 months ago

hnqtran commented 9 months ago

Several subroutines in SMOKE have array out-of-bounds errors that surface when SMOKE is compiled with the gfortran compiler. The errors are also detected when SMOKE is compiled with the Intel compiler and the "-check bounds" flag is activated in the Makeinclude file (the flag is not enabled in the standard distributed package).
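To illustrate what the bounds-checking flags catch, here is a minimal standalone sketch. It is not SMOKE code: the program name, array name, and sizes are hypothetical, and it is written in free-form Fortran rather than the fixed form used in the SMOKE sources. The flags shown in the comments are the gfortran and ifort runtime bounds-checking options.

```fortran
! Standalone illustration, not SMOKE code; all names here are hypothetical.
! Build with runtime bounds checking enabled, for example:
!   gfortran -fcheck=bounds oob_demo.f90
!   ifort -check bounds oob_demo.f90
program oob_demo
    implicit none
    character(len=16), allocatable :: names(:)
    integer :: j, last

    allocate( names(3) )
    names = 'SPECIES'

    last = 5                 ! deliberately larger than size(names)
    do j = 1, last
        ! With bounds checking on, the run aborts when j reaches 4.
        ! Without it, the access is invalid and may silently return garbage.
        print *, j, len_trim( names(j) )
    end do
end program oob_demo
```

Building the same file without the checking flag is a quick way to see why such errors can go unnoticed in a default build.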

List of subroutines that have the issue (will be updated as more are found):

Details on the issue in each of the above subroutines and their mitigations are discussed in the sections below.

hnqtran commented 9 months ago

Details on issue with src/emqa/bldrepidx.f (SMKREPORT module)

An array out-of-bounds error occurs for the variable OUTDNAM, which is first allocated with

          ALLOCATE( OUTDNAM(MXOUTDAT, NREPORT ), STAT=IOS )

where the 1st dimension, MXOUTDAT, is the number of species whose names are written to the SMOKE report table header. The value of MXOUTDAT depends on the report configuration (REPCONFIG) file.

In the following code block, OUTDNAM is deallocated and reallocated when a condition is met:

          IF( RPT_%BYSPC ) THEN   ! BY SPCCODE poll+ add associated species....
              NDATA = 1   ! add default firt pol name
              DO V = 1, NSVARS
                  IF( INDNAM( 1,N ) == POLNAM( V ) ) NDATA = NDATA + 1
              END DO
    C         NSPCPOL = NDATA
              ALLRPT( N )%NUMDATA = NDATA
    C         DEALLOCATE( OUTDNAM, SPCPOL )
              DEALLOCATE( OUTDNAM )
              ALLOCATE( OUTDNAM( NDATA, NREPORT ), STAT=IOS )
              CALL CHECKMEM( IOS, 'OUTDNAM', PROGNAME )
    C         ALLOCATE( SPCPOL( NDATA ), STAT=IOS )
    C         CALL CHECKMEM( IOS, 'SPCPOL', PROGNAME )

Taking a REPCONFIG with the following content as an example:

    /NEWFILE/ REPORT1

    /CREATE REPORT/
        BY STATE NAME
        SPECIATION MASS
        NUMBER E15.8
        UNITS ALL tons/yr
    /END/

    /NEWFILE/ REPORT2

    /CREATE REPORT/
        BY STATE NAME
        BY SCC10 NAME
        SPECIATION MASS
        NUMBER E13.5
        UNITS ALL tons/yr
    /END/

    /NEWFILE/ REPORT3

    /CREATE REPORT/
        BY STATE NAME
        BY SCC10 NAME
        BY SPCCODE PM25
        SPECIATION MASS
        NUMBER E13.5
        UNITS ALL tons/yr
    /END/

For REPORT1 and REPORT2, SPECIATION MASS is requested and MXOUTDAT = 77 (77 chemical species). When SMKREPORT proceeds to REPORT3, the RPT_%BYSPC condition is met and OUTDNAM is reallocated with ALLOCATE( OUTDNAM( NDATA, NREPORT ), STAT=IOS ), where NDATA = 19 in this example.

In the subsequent subroutine WRREPHDR (src/emqa/wrrephdr.f), an out-of-bounds access causes a segmentation fault when OUTDNAM is read beyond its allocated 1st dimension:

          DO J = STIDX, EDIDX
              IF( RPT_%RPTMODE .EQ. 3 ) THEN
                  L2 = LEN_TRIM( HEADERS( IHDRDATA ) )
                  W1 = MAX( NLEFT, W1, L2 )
              ELSE
                  L2 = LEN_TRIM( OUTDNAM( J,RCNT ) )
                  W1 = MAX( NLEFT, W1, L2, LN )
              END IF

WRREPHDR uses its own logic to determine STIDX = 1 and EDIDX = 77 in this example, so once J exceeds 19 the access to OUTDNAM is out of bounds.
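The mismatch between the shrunken allocation and the loop range can be reproduced in isolation. The following is a standalone free-form sketch, not SMOKE code: the names mirror the issue for readability, the sizes 77 and 19 come from the example, and the three-report count and the name length are assumptions. The MIN clamp in the loop is there only to keep the demo valid; it is not the fix adopted in this issue.

```fortran
! Standalone sketch, not SMOKE code. Names mirror the issue for readability;
! nreport = 3 and len=16 are assumptions made for this illustration.
program shrink_demo
    implicit none
    integer, parameter :: mxoutdat = 77, ndata = 19, nreport = 3
    character(len=16), allocatable :: outdnam(:,:)
    integer :: j, stidx, edidx, w1

    ! Initial allocation sized for the widest report (77 species)
    allocate( outdnam(mxoutdat, nreport) )
    outdnam = ' '

    ! The BY SPCCODE report shrinks the 1st dimension for ALL reports
    deallocate( outdnam )
    allocate( outdnam(ndata, nreport) )
    outdnam = 'PM25'

    ! A consumer that still assumes the original extent
    stidx = 1
    edidx = mxoutdat
    w1 = 0
    ! Clamp to the allocated extent so the demo itself stays in bounds;
    ! dropping the MIN reproduces the out-of-bounds read for j > 19.
    do j = stidx, min( edidx, size( outdnam, 1 ) )
        w1 = max( w1, len_trim( outdnam(j, 1) ) )
    end do

    print *, 'allocated extent:', size( outdnam, 1 ), ' requested:', edidx
    print *, 'max name width  :', w1
end program shrink_demo
```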

If SMOKE was compiled with gfortran, or with ifort and '-check bounds', a segmentation fault occurred and no SMOKE report file was created. If SMOKE was compiled with ifort without bounds checking, no error occurred and SMOKE report files were created, but with binary junk in the report table header. (See the attached example report file rep_rwc_2018gg_18j_inv_county.txt; note that the table header contains multiple invalid binary values and shows only aerosol species, whereas gaseous species are also expected.)

The OUTDNAM issue affects only the header of the SMOKE report table; the table values are handled by a different subroutine and are therefore not affected.

Mitigation approach: allocate OUTDNAM large enough up front and remove the lines where OUTDNAM is deallocated and reallocated:

          ALLOCATE( OUTDNAM( max(MXOUTDAT,NSVARS+1), NREPORT ), STAT=IOS )

(See the attached example report file rep_rwc_2018gg_18j_test_inv_county.txt.)
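The bound NSVARS+1 follows from the counting loop in bldrepidx.f: NDATA starts at 1 and is incremented at most once per V in 1..NSVARS, so it can never exceed NSVARS+1. A standalone free-form sketch of the allocate-once idea is shown below. It is not the committed change: the value of nsvars, the report count, and the name length are hypothetical values chosen for illustration.

```fortran
! Standalone sketch of the allocate-once mitigation, not the committed code.
! nsvars = 40, nreport = 3 and len=16 are hypothetical illustration values.
program alloc_once_demo
    implicit none
    integer, parameter :: mxoutdat = 77, nsvars = 40, nreport = 3
    character(len=16), allocatable :: outdnam(:,:)
    integer :: ios, v, ndata

    ! Size the 1st dimension for the largest extent any report can need
    allocate( outdnam( max( mxoutdat, nsvars + 1 ), nreport ), stat=ios )
    if ( ios /= 0 ) error stop 'allocation of outdnam failed'
    outdnam = ' '

    ! Worst case for the BY SPCCODE count: every species matches
    ndata = 1
    do v = 1, nsvars
        ndata = ndata + 1
    end do

    ! No reallocation is needed, so both extents remain addressable
    if ( ndata > size( outdnam, 1 ) ) error stop 'ndata exceeds allocation'
    if ( mxoutdat > size( outdnam, 1 ) ) error stop 'mxoutdat exceeds allocation'
    print *, 'extent:', size( outdnam, 1 ), ' worst-case ndata:', ndata
end program alloc_once_demo
```

The trade-off is a slightly larger array in exchange for an extent that stays valid for every report in the REPCONFIG file.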

Fixed in commit https://github.com/CEMPD/SMOKE/commit/5bb27fd67a3f464cd77b9cf9854d6496be801615

hnqtran commented 9 months ago

Details on issues with src/smkinven/wrpdemis.f (more details to be added)

forrtl: severe (408): fort: (3): Subscript #1 of the array UCASNKEP has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
smkinven           00000000004DD153  wrpdemis_                 284  wrpdemis.f
smkinven           000000000041EB22  genpdout_                 302  genpdout.f
smkinven           00000000004AD513  MAIN__                    502  smkinven.f
smkinven           00000000004072DD  Unknown               Unknown  Unknown
libc-2.28.so       0000152D846CDD85  __libc_start_main     Unknown  Unknown
smkinven           00000000004071FE  Unknown               Unknown  Unknown
31.397u 0.353s 0:32.17 98.6%    0+0k 464+104328io 4pf+0w

Code line where issue occurs

This issue seems to apply only to ptegu when running the daily emission processing script (after the one-time processing script has been executed).

hnqtran commented 9 months ago

Details on issues with src/emqa/rdssup.f

forrtl: severe (408): fort: (2): Subscript #1 of the array GSPROID has value 22 which is greater than the upper bound of 21

Image              PC                Routine            Line        Source
smkreport          0000000000490953  rdssup_                   219  rdssup.f
smkreport          00000000004555F9  rdrepin_                  708  rdrepin.f
smkreport          0000000000471F7E  MAIN__                    201  smkreport.f
smkreport          000000000040731D  Unknown               Unknown  Unknown
libc-2.28.so       000014D7F8E16D85  __libc_start_main     Unknown  Unknown
smkreport          000000000040723E  Unknown               Unknown  Unknown

Code line where issue occurs

Fixed in commit https://github.com/CEMPD/SMOKE/commit/5bb27fd67a3f464cd77b9cf9854d6496be801615

hnqtran commented 8 months ago

Three issues in subroutine WRPDEMIS (src/smkinven/wrpdemis.f) caused array out-of-bounds errors. Two were fixed by rearranging variable values when processing CEM data. The third was related to the treatment of a special data variable when processing CEM data: the variable CODEA records the special data code 9006, which represents FLOWPOS (the position of the flow rate in the CEM data format). CODEA carries indexes into the inventory pollutant names (EANAM), so when the value 9006 was used as an index into EANAM, an out-of-bounds error occurred.
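The third issue comes down to using a special-data code as if it were a pollutant index. A minimal standalone sketch of one way to guard such a lookup follows. It is not the actual change made in WRPDEMIS, which is not shown in this thread: the array contents and the constant name flowpos_code are hypothetical, and only the value 9006 and the names CODEA and EANAM come from the description above.

```fortran
! Standalone sketch, not the WRPDEMIS fix. Array contents and the name
! flowpos_code are hypothetical; only the value 9006 is from the issue.
program codea_guard_demo
    implicit none
    integer, parameter :: flowpos_code = 9006
    character(len=16) :: eanam(3)
    integer :: codea(4)
    integer :: i, idx

    eanam = [ character(len=16) :: 'NOX', 'SO2', 'CO' ]
    codea = [ 1, flowpos_code, 3, 2 ]

    do i = 1, size( codea )
        idx = codea(i)
        if ( idx >= 1 .and. idx <= size( eanam ) ) then
            ! Ordinary entry: idx is a valid position in the pollutant list
            print *, 'pollutant: ', trim( eanam(idx) )
        else if ( idx == flowpos_code ) then
            ! Special data code: must not be used as an index into eanam
            print *, 'special data (FLOWPOS), skipping name lookup'
        else
            print *, 'unexpected code, skipping: ', idx
        end if
    end do
end program codea_guard_demo
```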

Code lines 271 - 316 are currently not well constructed, with many reassignments of the V and POLNAM variables when processing CEMS data.