USEPA / CMAQ

Code for U.S. EPA’s Community Multiscale Air Quality Model (CMAQ) which helps in conducting air quality model simulations
https://epa.gov/CMAQ
MIT License

DMS and Chlorophyll Notebook issue with Chlorophyll #184

Open barronh opened 1 year ago

barronh commented 1 year ago

Description

When running the Jupyter Notebook in PYTOOLS/dmschlo/CMAQ_DMS_ChlorA.ipynb, the notebook fails when finding climatology files.

Scope and Impact

The notebook fails, so the DMS/CHLO variables cannot be created.

Solution

The error is caused by a reorganization of the files on the server at the OBPG DAAC.

  1. The directory structure has changed from Julian day of year (%j) to month-day (%m%d).
  2. The naming structure also changed:
    • old> A%Y%j%Y%j.L3m_MC_CHL_chlor_a_9km.nc
    • new> AQUA_MODIS.%Y%m%d_%Y%m%d.L3m.MC.CHL.chlor_a.9km.nc
  3. The climatology files previously used 2003-08-01 as the start of climatology data, but now they are starting with 2002-07-01.
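The naming change in item 2 can be sketched with `strftime`-style format fields. This is only an illustration: it assumes the two dates in each pattern are the start and end of the climatology period, and the end dates used in the example calls are hypothetical, not taken from the server.

```python
from datetime import date

# Patterns copied from the list above; the {start}/{end} field names are
# assumptions about what the two dates in each pattern represent.
OLD_TEMPLATE = 'A{start:%Y%j}{end:%Y%j}.L3m_MC_CHL_chlor_a_9km.nc'
NEW_TEMPLATE = 'AQUA_MODIS.{start:%Y%m%d}_{end:%Y%m%d}.L3m.MC.CHL.chlor_a.9km.nc'


def old_name(start, end):
    """Build a file name using the old day-of-year convention."""
    return OLD_TEMPLATE.format(start=start, end=end)


def new_name(start, end):
    """Build a file name using the new month-day convention."""
    return NEW_TEMPLATE.format(start=start, end=end)


# Hypothetical end dates, for illustration only.
print(old_name(date(2003, 8, 1), date(2019, 8, 31)))
print(new_name(date(2002, 7, 1), date(2022, 7, 31)))
```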

This requires two changes to the notebook. Both are in the cell that starts with “if getlatestchlo”.

First, change dates in the for loop

<    for prefix in ['2003/0801', '2003/0901', '2003/1001', '2003/1101', '2003/1201', '2004/0101', '2004/0201', '2004/0301', '2004/0401', '2004/0501', '2004/0601', '2004/0701']:
>    for prefix in ['2002/0701', '2002/0801', '2002/0901', '2002/1001', '2002/1101', '2002/1201', '2003/0101', '2003/0201', '2003/0301', '2003/0401', '2003/0501', '2003/0601']:
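As an alternative to hard-coding the twelve prefixes, the list could be generated from the starting year and month. This is a minimal sketch, not what the notebook does; `climatology_prefixes` is a hypothetical helper name.

```python
from datetime import date


def climatology_prefixes(start_year=2002, start_month=7, count=12):
    """Return 'YYYY/MMDD' prefixes for `count` consecutive months.

    Each prefix points at the first day of a month, starting from
    start_year/start_month and rolling over into the next year.
    """
    prefixes = []
    for offset in range(count):
        idx = start_month - 1 + offset          # zero-based month index
        first_day = date(start_year + idx // 12, idx % 12 + 1, 1)
        prefixes.append(first_day.strftime('%Y/%m%d'))
    return prefixes


print(climatology_prefixes())
```

If the server moves the climatology start again, only the two default arguments would need to change.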

Second, change the regular expression that finds the files

<        mostrecent = sorted(re.compile('(?<=>).+L3m_MC_CHL_chlor_a_9km.nc(?=</)').findall(htmltxt))[-1]
>        mostrecent = sorted(re.compile('(?<=>).+.L3m.MC.CHL.chlor_a.9km.nc(?=</)').findall(htmltxt))[-1]
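Note that the dots in the new pattern are unescaped, so each one matches any character. The pattern still finds the files, but a stricter variant is possible. The sketch below is an assumption-laden alternative, not the notebook's code: it escapes the dots, uses `[^<>]+` instead of `.+` so a match cannot span several HTML tags, and wraps the lookup in a hypothetical helper `most_recent_file`.

```python
import re

# Stricter variant of the pattern above: literal dots are escaped and the
# leading part of the name may not contain '<' or '>'.
CHLO_LINK = re.compile(r'(?<=>)[^<>]+\.L3m\.MC\.CHL\.chlor_a\.9km\.nc(?=</)')


def most_recent_file(htmltxt):
    """Return the lexically last matching file name in a directory listing."""
    matches = sorted(CHLO_LINK.findall(htmltxt))
    if not matches:
        raise ValueError('no chlor_a climatology file found in listing')
    return matches[-1]
```

Raising an explicit error when nothing matches makes a future server reorganization easier to diagnose than the `IndexError` from indexing an empty list with `[-1]`.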

I have successfully run for a new domain with the latest climatology files.

Additional context

A PR will be forthcoming that also changes the documentation.

This kind of breakage is inherent in making the download part of the process.

  1. We could move download out of the notebook, but the problem of reorganization will continue -- just outside the notebook.
  2. We could avoid this by using the CMR (NASA's Common Metadata Repository) to query dynamically, but the CMR is not always up-to-date.
  3. Open to other proposals.

barronh commented 1 year ago

Open In Colab

FYI: This notebook can be opened on Google Colab easily. Just click the badge above. You'll need to upload your OPEN/SURF file and download your results.

You'll have to make the changes above or it won't find the CHLO files.

barronh commented 1 year ago

P.S. You also need to change the glob pattern and the date-parsing line in a loop later in the notebook.

Change from:

for chloutpath in sorted(glob(f'chlor_a/{dom}/A2*_{dom}.nc')):
    mydate = datetime.strptime(os.path.basename(chloutpath)[1:8], '%Y%j')

Change to:

for chloutpath in sorted(glob(f'chlor_a/{dom}/A*.{dom}.nc')):
    mydate = datetime.strptime(os.path.basename(chloutpath)[11:19], '%Y%m%d')
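The fixed slice `[11:19]` works because `AQUA_MODIS.` is 11 characters long, but it breaks if the prefix ever changes. A sketch of a more tolerant parser follows, assuming file names begin with either `A%Y%j` (old) or `AQUA_MODIS.%Y%m%d_` (new); `parse_chlo_date` is a hypothetical helper, and the domain name and paths in the example calls are made up.

```python
import os
import re
from datetime import datetime


def parse_chlo_date(path):
    """Extract the start date from an old- or new-style chlor_a file name."""
    name = os.path.basename(path)
    # New convention: AQUA_MODIS.YYYYMMDD_YYYYMMDD...
    match = re.match(r'AQUA_MODIS\.(\d{8})_', name)
    if match:
        return datetime.strptime(match.group(1), '%Y%m%d')
    # Old convention: AYYYYDDD... (year plus day of year)
    match = re.match(r'A(\d{7})', name)
    if match:
        return datetime.strptime(match.group(1), '%Y%j')
    raise ValueError(f'unrecognized chlor_a file name: {name}')


# Hypothetical paths for illustration only.
print(parse_chlo_date('chlor_a/12US1/AQUA_MODIS.20020701_20220731.L3m.MC.CHL.chlor_a.9km.12US1.nc'))
print(parse_chlo_date('chlor_a/12US1/A20032132019243.L3m_MC_CHL_chlor_a_9km_12US1.nc'))
```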