Closed sdeastham closed 4 years ago
Thanks for writing. I think this is because the dryrun code in hcoio_read_std_mod.F90 is a little simple-minded in that it lists the file for the given date as missing.
For example, on the AWS cloud, we have these years of offline biogenics available:
$ s3ls s3://gcgrid/HEMCO/OFFLINE_BIOVOC/v2019-10/0.5x0.625/
PRE 2014/
PRE 2015/
PRE 2016/
PRE 2017/
So there are offline BIOVOC files from 2014-2017 available. But the HEMCO code where we look for missing data files is in this permalink:
As you can see, at line 408, we exit the routine without going deeper into the HEMCO code to figure out what the closest date would be.
Perhaps this could be a feature request for the new HEMCO that will go into dev/13.0.0. Right now for some of these edge cases the dry-run might need to be augmented by manual download.
Or perhaps this is an issue that can be solved in the download_data.py script.
Hi @yantosca - I think I am having the same issue. I was wondering why my run kept dying with an Invalid time index error for the WMO_2018 surface VMR files. Eventually I realised that my 2015 run was trying to use the 2008 file we already had instead of telling me that the 2015 one was missing.
It would be good to fix this. With the dry run (which is great!), someone in the group might do a 2015 run, so the download script fetches just the 2015 files that they need. But if someone then tries to run a different year, the dry run won't report any files they need to download.
Obviously some of them (like WMO_2018) cause the run to die (I'm not quite sure why, as the "C" flag is set). But I am now nervous that there might be others where the run doesn't die, and the model is ticking along using the wrong year's data without us being aware. Is there a way we could test for that?
Thanks, Jenny
I am having the same issue with met files. I set up a dry run to get MERRA2 files from 20190701 to 20200501. Running download_data.py downloaded met data through 20190828 and stopped (I presume because wget failed for some reason). All my subsequent attempts to generate a dryrun log file with the correct remaining files failed.
Hi all. I replicated Prasad's dry run on the AWS cloud. I am finding file paths such as:
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A1.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3cld.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3dyn.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3mstC.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3mstE.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010102.I3.4x5.nc4
which are obviously not correct. I am looking into whether any modifications since 12.7.0 (when the dry-run was introduced) could be causing this. Stay tuned.
This may be obvious, but I thought I would post it. I modified the input.geos and HEMCO_Config.rc files to point to a non-existent data directory and reran the dry run. In this case the correct dryrun log file was generated. So it is the existence of files (even if they are not the correct ones) that seems to be creating problems in the dry run.
So based on my previous comment, I came up with a crude fix that seems to work. I create my dry run logfile by modifying my input.geos and HEMCO_Config.rc files to point to a non-existent root data directory (instead of /work/psk9/Data/ExtData, I specify it as /work/psk9x/Data/ExtData) and generate the log file with the correct listing of files. Then I edit input.geos and HEMCO_Config.rc to specify the correct root data directory and run download_data.py - this only downloads missing files, though it is a bit slow because it checks for all the files and prints the message saying it is not retrieving files that already exist.
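The workaround above can be sketched in Python. This is a minimal illustration, not part of download_data.py: the function name `missing_after_remap` and its arguments are hypothetical, and the only thing taken from the thread is the `HEMCO: REQUIRED FILE NOT FOUND` tag that appears in the dry-run log excerpts. It takes a log generated against the bogus root, maps each path back to the real root, and keeps only the files that are genuinely absent.

```python
import os

# Tag as it appears in the dry-run log excerpts quoted in this thread.
NOT_FOUND_TAG = "HEMCO: REQUIRED FILE NOT FOUND"


def missing_after_remap(log_lines, bogus_root, real_root, exists=os.path.exists):
    """Return real-root paths from the log that do not exist on disk.

    Hypothetical helper: `bogus_root` is the non-existent directory used
    to force a complete file listing; `real_root` is the actual data root.
    """
    missing = []
    seen = set()
    for line in log_lines:
        if NOT_FOUND_TAG not in line:
            continue
        path = line.split(NOT_FOUND_TAG, 1)[1].strip()
        # Swap the bogus prefix for the real one.
        if path.startswith(bogus_root):
            path = real_root + path[len(bogus_root):]
        if path in seen:
            continue  # the log can list the same file more than once
        seen.add(path)
        if not exists(path):
            missing.append(path)
    return missing
```

Called as `missing_after_remap(open("log.dryrun"), "/work/psk9x/Data/ExtData", "/work/psk9/Data/ExtData")`, it would return only the files still to be fetched, which avoids the slow per-file "already exists" messages. The `exists` argument is there so the check can be replaced in tests.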
Hi all, I think I have found the problem. I am not sure if this is a side-effect of recent modifications to HEMCO, or if it always was this way. I put in some debug print to subroutine SrcFile_Parse in hcoio_mod.F90:
!=================================================================
! SrcFile_Parse
!=================================================================
! Initialize to input string
srcFile = Lct%Dct%Dta%ncFile
IF ( INDEX( Lct%Dct%Dta%ncFile, '$METDIR' ) > 0 ) THEN
print*, '@@@ in sfp 0: ', TRIM(srcFile)
ENDIF
! verbose mode
IF ( HCO_IsVerb(HcoState%Config%Err,3) ) THEN
WRITE(MSG,*) 'Parsing source file and replacing tokens'
CALL HCO_MSG(HcoState%Config%Err,MSG)
ENDIF
! Get preferred dates (to be passed to parser)
CALL HCO_GetPrefTimeAttr ( HcoState, Lct, &
prefYr, prefMt, prefDy, prefHr, prefMn, RC )
IF ( RC /= HCO_SUCCESS ) RETURN
! Make sure dates are not negative
IF ( prefYr <= 0 ) THEN
CALL HcoClock_Get( HcoState%Clock, cYYYY = prefYr, RC = RC )
IF ( RC /= HCO_SUCCESS ) RETURN
ENDIF
IF ( prefMt <= 0 ) THEN
CALL HcoClock_Get( HcoState%Clock, cMM = prefMt, RC = RC )
IF ( RC /= HCO_SUCCESS ) RETURN
ENDIF
IF ( prefDy <= 0 ) THEN
CALL HcoClock_Get( HcoState%Clock, cDD = prefDy, RC = RC )
IF ( RC /= HCO_SUCCESS ) RETURN
ENDIF
IF ( prefHr < 0 ) THEN
CALL HcoClock_Get( HcoState%Clock, cH = prefHr, RC = RC )
IF ( RC /= HCO_SUCCESS ) RETURN
ENDIF
! Eventually replace default preferred year with specified one
IF ( PRESENT(Year) ) prefYr = Year
! Call the parser
CALL HCO_CharParse ( HcoState%Config, srcFile, prefYr, prefMt, prefDy, prefHr, prefMn, RC )
IF ( RC /= HCO_SUCCESS ) RETURN
srcFileOrig = TRIM(srcFile)
IF ( INDEX( Lct%Dct%Dta%ncFile, '$METDIR' ) > 0 ) THEN
print*, '@@@ in sfp 1: ', TRIM(srcFile)
ENDIF
! Check if file exists
INQUIRE( FILE=TRIM(srcFile), EXIST=HasFile )
IF ( INDEX( Lct%Dct%Dta%ncFile, '$METDIR' ) > 0 ) THEN
print*, '@@@ in sfp 2: ', Hasfile
ENDIF
! If the direction flag is on, force HasFile to be false.
IF ( PRESENT(Direction) ) THEN
IF ( Direction /= 0 ) HasFile = .FALSE.
ENDIF
! If file does not exist, check if we can adjust prefYr, prefMt, etc.
IF ( .NOT. HasFile .AND. Lct%Dct%DctType /= HCO_CFLAG_EXACT ) THEN
! Check if any token exist
HasYr = ( INDEX(TRIM(Lct%Dct%Dta%ncFile),'YYYY') > 0 )
HasMt = ( INDEX(TRIM(Lct%Dct%Dta%ncFile),'MM' ) > 0 )
HasDy = ( INDEX(TRIM(Lct%Dct%Dta%ncFile),'DD' ) > 0 )
HasHr = ( INDEX(TRIM(Lct%Dct%Dta%ncFile),'HH' ) > 0 )
! Search for file
IF ( HasYr .OR. HasMt .OR. HasDy .OR. HasHr ) THEN
! Date increments
INC = -1
IF ( PRESENT(Direction) ) THEN
INC = Direction
ENDIF
! Initialize counters
CNT = 0
! Type is the update type (see below)
TYP = 0
! Mirror preferred variables
origYr = prefYr
origMt = prefMt
origDy = prefDy
origHr = prefHr
! Do until file is found or counter exceeds threshold
DO WHILE ( .NOT. HasFile )
! Increase counter
CNT = CNT + 1
IF ( CNT > MAXIT ) EXIT
! Increase update type if needed:
nextTyp = .FALSE.
! Type 0: Initialization
IF ( TYP == 0 ) THEN
nextTyp = .TRUE.
! Type 1: update hour only
ELSEIF ( TYP == 1 .AND. TYPCNT > 24 ) THEN
nextTyp = .TRUE.
! Type 2: update day only
ELSEIF ( TYP == 2 .AND. TYPCNT > 31 ) THEN
nextTyp = .TRUE.
! Type 3: update month only
ELSEIF ( TYP == 3 .AND. TYPCNT > 12 ) THEN
nextTyp = .TRUE.
! Type 4: update year only
ELSEIF ( TYP == 4 .AND. TYPCNT > 300 ) THEN
nextTyp = .TRUE.
! Type 5: update hour and day
ELSEIF ( TYP == 5 .AND. TYPCNT > 744 ) THEN
nextTyp = .TRUE.
! Type 6: update day and month
ELSEIF ( TYP == 6 .AND. TYPCNT > 372 ) THEN
nextTyp = .TRUE.
! Type 7: update month and year
ELSEIF ( TYP == 7 .AND. TYPCNT > 3600 ) THEN
EXIT
ENDIF
! Get next type
IF ( nextTyp ) THEN
NEWTYP = -1
IF ( hasHr .AND. TYP < 1 ) THEN
NEWTYP = 1
ELSEIF ( hasDy .AND. TYP < 2 ) THEN
NEWTYP = 2
ELSEIF ( hasMt .AND. TYP < 3 ) THEN
NEWTYP = 3
ELSEIF ( hasYr .AND. TYP < 4 ) THEN
NEWTYP = 4
ELSEIF ( hasDy .AND. TYP < 2 ) THEN
NEWTYP = 5
ELSEIF ( hasDy .AND. TYP < 2 ) THEN
NEWTYP = 6
ELSEIF ( hasDy .AND. TYP < 2 ) THEN
NEWTYP = 7
ENDIF
! Exit if no other type found
IF ( NEWTYP < 0 ) EXIT
! This is the new type, reset type counter
TYP = NEWTYP
TYPCNT = 0
! Make sure we reset all values
prefYr = origYr
prefMt = origMt
prefDy = origDy
prefHr = origHr
ENDIF
! Update preferred datetimes
SELECT CASE ( TYP )
! Adjust hour only
CASE ( 1 )
prefHr = prefHr + INC
! Adjust day only
CASE ( 2 )
prefDy = prefDy + INC
! Adjust month only
CASE ( 3 )
prefMt = prefMt + INC
! Adjust year only
CASE ( 4 )
prefYr = prefYr + INC
! Adjust hour and day
CASE ( 5 )
prefHr = prefHr + INC
IF ( MOD(TYPCNT,24) == 0 ) prefDy = prefDy + INC
! Adjust day and month
CASE ( 6 )
prefDy = prefDy + INC
IF ( MOD(TYPCNT,31) == 0 ) prefMt = prefMt + INC
! Adjust month and year
CASE ( 7 )
prefMt = prefMt + INC
IF ( MOD(TYPCNT,12) == 0 ) prefYr = prefYr + INC
CASE DEFAULT
EXIT
END SELECT
! Check if we need to adjust a year/month/day/hour
IF ( prefHr < 0 ) THEN
prefHr = 23
prefDy = prefDy - 1
ENDIF
IF ( prefHr > 23 ) THEN
prefHr = 0
prefDy = prefDy + 1
ENDIF
IF ( prefDy < 1 ) THEN
prefDy = 31
prefMt = prefMt - 1
ENDIF
IF ( prefDy > 31 ) THEN
prefDy = 1
prefMt = prefMt + 1
ENDIF
IF ( prefMt < 1 ) THEN
prefMt = 12
prefYr = prefYr - 1
ENDIF
IF ( prefMt > 12 ) THEN
prefMt = 1
prefYr = prefYr + 1
ENDIF
! Make sure day does not exceed max. number of days in this month
prefDy = MIN( prefDy, Get_LastDayOfMonth( prefMt, prefYr ) )
! Mirror original file
srcFile = Lct%Dct%Dta%ncFile
! Call the parser with adjusted values
CALL HCO_CharParse ( HcoState%Config, srcFile, prefYr, prefMt, prefDy, prefHr, prefMn, RC )
IF ( RC /= HCO_SUCCESS ) RETURN
! Check if this file exists
INQUIRE( FILE=TRIM(srcFile), EXIST=HasFile )
IF ( INDEX( Lct%Dct%Dta%ncFile, '$METDIR' ) > 0 ) THEN
print*, '@@@ in sfp 2a: ', TRIM(srcFile)
print*, '@@@ in sfp 2b: ', Hasfile
ENDIF
! Update counter
TYPCNT = TYPCNT + 1
ENDDO
ENDIF
ENDIF
! Additional check for data with a given range: make sure that the selected
! field is not outside of the given range
IF ( HasFile .AND. ( Lct%Dct%Dta%CycleFlag == HCO_CFLAG_RANGE ) ) THEN
HasFile = TIDX_IsInRange ( Lct, prefYr, prefMt, prefDy, prefHr )
ENDIF
! Restore original source file name and date to avoid confusion in log file
IF ( .not. HasFile ) THEN
srcFile = Trim(srcFileOrig)
ENDIF
! Return variable
FOUND = HasFile
! Return w/ success
RC = HCO_SUCCESS
IF ( INDEX( Lct%Dct%Dta%ncFile, '$METDIR' ) > 0 ) THEN
print*, '@@@ in sfp 3: ', Hasfile
ENDIF
and then I did a dry run for 2019/07/01 to 2019/08/01. All good, as Prasad says. Then I did another dry-run for 2019/08/01 to 2019/09/01 with the debug printout enabled. Here is a snippet of what I found:
@@@ in sfp 0: $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC
@@@ in sfp 1: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190802.A1.4x5.nc4
@@@ in sfp 2: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190801.A1.4x5.nc4
@@@ in sfp 2b: T
@@@ in sfp 3: T
...
@@@ in sfp 0: $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC
@@@ in sfp 1: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190803.A1.4x5.nc4
@@@ in sfp 2: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190802.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190801.A1.4x5.nc4
@@@ in sfp 2b: T
@@@ in sfp 3: T
... and further down ...
@@@ in sfp 0: $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC
@@@ in sfp 1: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190815.A1.4x5.nc4
@@@ in sfp 2: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190814.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190813.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190812.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190811.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190810.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190809.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190808.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190807.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190806.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190805.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190804.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190803.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190802.A1.4x5.nc4
@@@ in sfp 2b: F
@@@ in sfp 2a: /home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/08/MERRA2.20190801.A1.4x5.nc4
@@@ in sfp 2b: T
@@@ in sfp 3: T
So the routine SrcFile_Parse seems to keep wanting to go back in time if it can't find a file.
I think we need to put a shunt for the dry-run so that we don't enter that time stepping loop. That should fix it.
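To make the behaviour in the debug output concrete, here is a simplified Python model of the day-only search and of the proposed shunt. It is a sketch under stated assumptions, not the HEMCO implementation: `resolve_date` and `dry_run_shunt` are hypothetical names, only the "update day only" case is modelled, and real calendar arithmetic replaces the day/month clamping used in the Fortran.

```python
from datetime import date, timedelta


def resolve_date(preferred, available, dry_run_shunt=False, max_steps=31):
    """Return (date_used, found) for a simplified day-only search.

    `available` is the set of dates for which a file exists on disk.
    """
    if preferred in available:
        return preferred, True
    if dry_run_shunt:
        # Proposed shunt: report the preferred date as missing
        # instead of searching backwards in time.
        return preferred, False
    candidate = preferred
    for _ in range(max_steps):
        candidate -= timedelta(days=1)  # corresponds to INC = -1
        if candidate in available:
            return candidate, True
    return preferred, False


# July 2019 plus 2019-08-01 are on disk, as in the second dry run above.
on_disk = {date(2019, 7, d) for d in range(1, 32)} | {date(2019, 8, 1)}
print(resolve_date(date(2019, 8, 15), on_disk))
print(resolve_date(date(2019, 8, 15), on_disk, dry_run_shunt=True))
```

Without the shunt, the request for 2019-08-15 silently resolves to 2019-08-01, so nothing is reported as missing. With the shunt, the original date comes back as not found, which is what a dry run needs in order to list the file for download.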
This should now be fixed by commit https://github.com/geoschem/geos-chem/commit/f6586419c929dcee69790dcdc7499087ef90e076, which for now has been posted in the bugfix/dryrun branch, which is off of the master branch (12.8.2). You can pull this update to your repository. We will add this to 12.9.0. I will also make a pull request for this into the HEMCO repository (which is standalone).
Here are the unique log files from 2019/07/01 (the first month, from a clean ExtData),
and 2019/08/01 (the second month, with files for 2019/07/01 in ExtData):
So I think this fixes it.
This has now been merged into our 12.9.0 development branch. I will close out this issue for now.
Until 12.9.0 is released, you can still take this fix from the bugfix/dryrun branch.
Hi Bob, this fix looks like it is still not working - see attached dryrun.log file:
1) Many of the files listed after TrashBurn_v2_generic.01x01.nc are in fact already downloaded on my system, though the dryrun log file says they were not found. For example:
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/HEMCO/TrashEmis/v2015-03/TrashBurn_v2_generic.01x01.nc
/work/psk9/Data/ExtData/GEOS_4x5/MERRA2/2019/07% ls -l /work/psk9/Data/ExtData/HEMCO/TrashEmis/v2015-03/TrashBurn_v2_generic.01x01.nc
-rw-rw-r--. 1 psk9 root 89227465 Mar 29 2018 /work/psk9/Data/ExtData/HEMCO/TrashEmis/v2015-03/TrashBurn_v2_generic.01x01.nc
2) The file lists seem to mess up when crossing the year boundary - my start date is 20190701 and end date is 20200501. So for example, the A3cld files seem to be listed correctly till Dec 31, but incorrectly after that.
Just updating my previous comment: the problem in item #2 seems to be related to the 2020 met files, not to crossing the year boundary. The path names of all the 2020 MERRA2 files, except for the I3 files, are incorrect in the dryrun output file. The I3 files are listed twice for each day, once correctly and once incorrectly. For example, here is a portion of the dryrun output:
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A1.4x5.nc4
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3cld.4x5.nc4
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3dyn.4x5.nc4
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3mstC.4x5.nc4
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010101.A3mstE.4x5.nc4
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.I3.4x5.nc4
HEMCO: REQUIRED FILE NOT FOUND /work/psk9/Data/ExtData/GEOS_4x5/MERRA2/0001/01/MERRA2.00010102.I3.4x5.nc4
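A quick way to spot entries like the 0001/01 ones in a long dry-run log is to pull the date out of each MERRA2 file name and flag implausible years. The sketch below is only an illustration: `implausible_met_paths` is a hypothetical helper, the file-name pattern is inferred from the paths quoted in this thread, and the default year bounds are assumptions that should be set to whatever range your met data actually covers.

```python
import re

# Matches names such as MERRA2.20200101.I3.4x5.nc4 (pattern inferred
# from the paths quoted in this thread).
MET_FILE = re.compile(r"MERRA2\.(\d{4})(\d{2})(\d{2})\.")


def implausible_met_paths(lines, first_year=1980, last_year=2020):
    """Return paths whose file-name year lies outside [first_year, last_year].

    The default bounds are assumptions, not values read from HEMCO.
    """
    flagged = []
    for line in lines:
        match = MET_FILE.search(line)
        if match is None:
            continue
        year = int(match.group(1))
        if not (first_year <= year <= last_year):
            # Assumes the path is the last whitespace-separated token.
            flagged.append(line.split()[-1])
    return flagged
```

Run over the excerpt above, this would flag the six `0001/01` entries and leave the correct `2020/01` I3 entry alone.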
Hi Prasad, thanks for looking into this again. I believe I have found the error. The paths containing "0001/01" in the dryrun output appear because the year range for the met fields in HEMCO_Config.rc ends in 2019. In other words, change the time info for all of the met fields in HEMCO_Config.rc from, e.g.:
* ALBEDO $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC ALBEDO 1980-2019/1-12/1-31/*/+30minute RFY xy 1 * - 1 1
* CLDTOT $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC CLDTOT 1980-2019/1-12/1-31/*/+30minute RFY xy 1 * - 1 1
* EFLUX $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC EFLUX 1980-2019/1-12/1-31/*/+30minute RFY xy 1 * - 1 1
... etc ...
to
* ALBEDO $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC ALBEDO 1980-2020/1-12/1-31/*/+30minute RFY xy 1 * - 1 1
* CLDTOT $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC CLDTOT 1980-2020/1-12/1-31/*/+30minute RFY xy 1 * - 1 1
* EFLUX $METDIR/$YYYY/$MM/$MET.$YYYY$MM$DD.A1.$RES.$NC EFLUX 1980-2020/1-12/1-31/*/+30minute RFY xy 1 * - 1 1
...etc...
Once you do that, you get clean dry-run output with files that do not have 0001/01 in their paths. This output is from a TransportTracers dryrun from 20191231 to 20200102:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!! LIST OF (UNIQUE) FILES REQUIRED FOR THE SIMULATION
!!! Start Date : 20191231 000000
!!! End Date : 20200102 000000
!!! Simulation : TransportTracers
!!! Meteorology : MERRA2
!!! Grid Resolution : 4.0x5.0
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
./GEOSChem.Restart.20191231_0000z.nc4 --> /home/ubuntu/ExtData/GEOSCHEM_RESTARTS/v2018-11/initial_GEOSChem_rst.4x5_TransportTracers.nc
./HEMCO_Config.rc
./HEMCO_Diagn.rc
./HISTORY.rc
./input.geos
/home/ubuntu/ExtData/CHEM_INPUTS/Olson_Land_Map_201203/Olson_2001_Drydep_Inputs.nc
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/12/MERRA2.20191231.A1.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/12/MERRA2.20191231.A3cld.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/12/MERRA2.20191231.A3dyn.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/12/MERRA2.20191231.A3mstC.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/12/MERRA2.20191231.A3mstE.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2019/12/MERRA2.20191231.I3.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.A1.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.A3cld.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.A3dyn.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.A3mstC.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.A3mstE.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200101.I3.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200102.A1.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200102.A3cld.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200102.A3dyn.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200102.A3mstC.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200102.A3mstE.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200102.I3.4x5.nc4
/home/ubuntu/ExtData/GEOS_4x5/MERRA2/2020/01/MERRA2.20200103.I3.4x5.nc4
/home/ubuntu/ExtData/HEMCO/CEDS/v2018-08/2014/CO-em-anthro_CMIP_CEDS_2014.nc
/home/ubuntu/ExtData/HEMCO/OLSON_MAP/v2019-02/Olson_2001_Land_Type_Masks.025x025.generic.nc
/home/ubuntu/ExtData/HEMCO/SF6/v2019-01/EDGAR_v42_SF6_IPCC_2.generic.01x01.nc
/home/ubuntu/ExtData/HEMCO/TIMEZONES/v2015-02/timezones_voronoi_1x1.nc
/home/ubuntu/ExtData/HEMCO/Yuan_XLAI/v2019-03/Yuan_proc_MODIS_XLAI.025x025.2016.nc
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!! LIST OF (UNIQUE) FILES REQUIRED FOR THE SIMULATION
!!! Start Date : 20191231 000000
!!! End Date : 20200102 000000
!!! Simulation : TransportTracers
!!! Meteorology : MERRA2
!!! Grid Resolution : 4.0x5.0
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
We will update the HEMCO_Config.rc files in the unit tester so that 12.9.0 run directories have these.
Also I am going to take a quick look to see why an error doesn't happen if you exceed the year range in HEMCO_Config.rc. If it is a real simulation and not a dry-run then an error is thrown.
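Until such a check exists in the model, the year range can be verified outside of it. The following Python sketch is an illustration only: `year_range` and `years_not_covered` are hypothetical helpers, and they assume that the year part of the time attribute is the text before the first `/`, as in the `1980-2019/1-12/1-31/*/+30minute` entries quoted above. Entries without an explicit numeric year range (for example `*`) are treated as unconstrained.

```python
def year_range(time_attr):
    """Return (first_year, last_year) from a time attribute, or None.

    None means the year field is not an explicit numeric year or range.
    """
    year_field = time_attr.split("/")[0]
    parts = year_field.split("-")
    if len(parts) > 2 or not all(p.isdigit() for p in parts):
        return None
    return int(parts[0]), int(parts[-1])


def years_not_covered(time_attr, start_year, end_year):
    """List simulation years that fall outside the attribute's year range."""
    rng = year_range(time_attr)
    if rng is None:
        return []
    first, last = rng
    return [y for y in range(start_year, end_year + 1)
            if not (first <= y <= last)]


print(years_not_covered("1980-2019/1-12/1-31/*/+30minute", 2019, 2020))
print(years_not_covered("1980-2020/1-12/1-31/*/+30minute", 2019, 2020))
```

A non-empty result for any met-field entry indicates that the simulation window extends past the configured years, which is the situation that produced the 0001/01 paths.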
Any update on item #1 in my question above, i.e., why is the dryrun listing files as not found when they are in fact there on the system? Thanks!
Describe the bug
When running the GEOS-Chem dry run, it will usually correctly identify if (for example) OFFLINE_BIOVOC files are missing. However, if there are files present from before the target date (e.g. if I have downloaded files for Y2007 but not Y2008 or Y2009), then the dry run will reuse the earlier files rather than flagging that the target files are missing. If the cycling flag is changed from "C" to "EF" or "E", this still does not resolve the error - no new BIOVOC files are flagged.
To Reproduce
Include the steps that must be done in order to reproduce the observed behavior:
Preparation
LOCAL_BIOVOC
LOCAL_BIOVOC
Run commands
input.geos to run from 2008-01-01 to 2008-02-01
./geos --dryrun > log.dryrun
log.dryrun
You should see that the old OFFLINE_BIOVOC files are being used, rather than the missing target files being flagged.
Expected behavior
The missing files should be flagged in the log.
Required information
Please include the following:
Input and log files to attach
log.dryrun.2008.txt The attached log file shows the run output - it correctly opens the 2008-07-01 file for BIOVOC (which is present on the system), but then starts looping over earlier years. For a clear example see lines 898-901; HEMCO is reading in data for 2008-07-27, but the lines for the four offline emissions read:
In this case, I'm using directories which are softlinked to all the correct ones on the Harvard Cannon cluster, but my own directories for OFFLINE_BIOVOC (as the MERRA-2 biogenic VOC emissions aren't all available at Harvard). The key point is that the BIOVOC file is still 20080701, whereas the others are all (correctly) 2008-07-27.
Additional context
This is a slightly difficult issue to resolve because sometimes we specifically want to loop earlier data. Perhaps for the dryrun option we should be expecting that fields with the R field actually have requirements satisfied?
Edit: tagging @msulprizio and @jimmielin as I think you'll both have insights on this particular issue!