barronh / cmaqsatproc

Satellite Processors Designed for CMAQ
GNU General Public License v3.0
6 stars 4 forks source link

Averaging Kernels for OMPROFOZ #2

Closed chogrefe closed 5 months ago

chogrefe commented 4 years ago

Trying to use an averaging kernel from OMPROFOZ L2 files processed with sat2cmaq when calculating tropospheric ozone columns from H-CMAQ using cmaq2sat, I encounter the following error, suggesting that the use of averaging kernels may not be implemented yet for this satellite product:

atmos1:/work/MOD3EVAL/css/CMAQv53_TS/work/satellite/OMPROFOZ> convertcmaq.sh outcmaq/CCTM_CONC_v532_intel18.0_CMAQv53_TS_20090101.O3.nc Traceback (most recent call last): File "scripts/cmaq2sat.py", line 160, in ppm2du(cpath, m2path, m3path, outpath, key=key, akpath=akpath, akkey=akkey) File "scripts/cmaq2sat.py", line 64, in ppm2du akf = pnc.pncopen(akpath, format='ioapi').eval( File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/cmaqfiles/_ioapi.py", line 561, in eval out = PseudoNetCDFFile.eval(self, *args, **kwds) File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/core/_files.py", line 1013, in eval outf.createVariable(key, val.dtype.char, File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/cmaqfiles/_ioapi.py", line 101, in createVariable out = PseudoNetCDFFile.createVariable( File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/core/_files.py", line 2299, in createVariable var = self.variables[name] = PseudoNetCDFMaskedVariable( File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/core/_variables.py", line 306, in new result = result.view(subtype) File "netCDF4/_netCDF4.pyx", line 4360, in netCDF4._netCDF4.Variable.getattr File "netCDF4/_netCDF4.pyx", line 4166, in netCDF4._netCDF4.Variable.getncattr File "netCDF4/_netCDF4.pyx", line 1407, in netCDF4._netCDF4._get_att File "netCDF4/_netCDF4.pyx", line 1887, in netCDF4._netCDF4._ensure_nc_success AttributeError: NetCDF: Attribute not found

Is adding this feature something under consideration for future development?

Related to this question, the current application of averaging kernels to CMAQ data for other products like OMI NO2 seems to rely on estimating CMAQ pressure from surface pressure, VGLVLS, and PTOP obtained from MCIP files. In recent model applications, WRF uses a hybrid vertical coordinate system and edge pressure can no longer be derived from surface pressure and the I/O API VGLVLS and PTOP attributes. hyai and hybi attributes currently are not part of MCIP files obtained from WRF fields using hybrid vertical coordinates. Would future development consider an option to read in MCIP 3D pressure fields when applying averaging kernels in such situations or would this be too much effort for too little gain?

barronh commented 4 years ago

OMPROFOZ should now be capable of outputting required data for averaging kernel processing. Please test.

chogrefe commented 3 years ago

Tested the following after changing "AveragingKernel = {}'.format(akkey)" to "AveragingKernel = ({})[:]'.format(akkey)" in line 54 of cmaq2sat_ppb.py (as as cmaq2sat.py except for ppm conversion) as suggested during our Teams call:

python /work/MOD3EVAL/css/CMAQv53_TS/work/satellite/github_cmaqsatproc/cmaqsatproc_20201105_repo/scripts/cmaq2sat_ppb.py \ /work/MOD3EVAL/css/tmp/COMBINE_CONC3D_v532_intel18.0_CMAQv53_TS_108NHEMI_20090101_O3.nc \ /work/MOD3EVAL/css/tmp/METCRO2D_20090101.nc4 \ /work/MOD3EVAL/css/tmp/COMBINE_CONC3D_v532_intel18.0_CMAQv53_TS_108NHEMI_20090101_PV.nc \ /work/MOD3EVAL/css/CMAQv53_TS/work/satellite/OMPROFOZ/outcmaq/108NHEMI2/daily_files_2009/CCTM_CONC_v532_intel18.0_CMAQv53_TS_108NHEMI_20090101.O3.nc \ O3 \ /work/MOD3EVAL/css/CMAQv53_TS/work/satellite/OMPROFOZ/regrid/108NHEMI2/daily_files/OMI-Aura_L2-OMPROFOZ_2009m0101_v003.nc \ AvgKernel

This failed with the following error message:

/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/cmaqfiles/_ioapi.py:200: UserWarning: WARNING: missing_value not used since it cannot be safely cast to variable data type outf.variables[vk][:] = self.variables[vk][:] none:1: UserWarning: WARNING: missing_value not used since it cannot be safely cast to variable data type Traceback (most recent call last): File "/work/MOD3EVAL/css/CMAQv53_TS/work/satellite/github_cmaqsatproc/cmaqsatproc_20201105_repo/scripts/cmaq2sat_ppb.py", line 153, in ppm2du( File "/work/MOD3EVAL/css/CMAQv53_TS/work/satellite/github_cmaqsatproc/cmaqsatproc_20201105_repo/scripts/cmaq2sat_ppb.py", line 53, in ppm2du akf = pnc.pncopen(akpath, format='ioapi').eval( File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/cmaqfiles/_ioapi.py", line 361, in interpSigma outf = self.applyAlongDimensions(LAY=interpsigma, verbose=verbose) File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/cmaqfiles/_ioapi.py", line 534, in applyAlongDimensions outf = PseudoNetCDFFile.applyAlongDimensions(self, *args, *kwds) File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/core/_files.py", line 1572, in applyAlongDimensions newdl = df(dvar[:]).size File "/work/ROMO/anaconda_envs/basic38/lib/python3.8/site-packages/PseudoNetCDF/cmaqfiles/_ioapi.py", line 335, in interpsigma newdata = (weights data[:, None]).sum(0) ValueError: operands could not be broadcast together with shapes (25,44) (24,1)

barronh commented 3 years ago

Okay. I see the problem... Now, I just have to figure out why it happened. The OMPROFOZ file has 24 layers, so it should have 25 VGLVLS, but it has 26. As a result, the centered VGLVLS have 25 instead of 24 layers.

chogrefe commented 3 years ago

I am not sure if this is related or not, but I'm also running into issues using the new code to process NO2 tropospheric columns for the EQUATES 2009 - 2017 H-CMAQ runs and applying an averaging kernel. Differently from doing this for O3 columns, the code does not crash, but the resulting CMAQ NO2 column densities are significantly lower (1-2 orders of magnitude) compared to the processing with the previous code (and consequently much lower than the OMI tropospheric NO2 columns).

CMAQ processed with old code (i.e. the code prior to the November 2020 updates) for both sat2cmaq to generate the daily L3 files and cmaq2sat to create CMAQ columns:

image

CMAQ processed with new code (i.e. the code using the November 2020 updates) for both sat2cmaq to generate the daily L3 files and cmaq2sat to create CMAQ columns:

image

2010 CMAQ files processed with old code (Jan - April 2010 only):

/work/MOD3EVAL/css/EQUATES/work/satellite/OMNO2d/outcmaq/108NHEMI2/daily_files_2010_old

2010 CMAQ files processed with new code (all days):

/work/MOD3EVAL/css/EQUATES/work/satellite/OMNO2d/outcmaq/108NHEMI2/daily_files_2010

The changes appear to be caused by some changes in cmaq2sat, the daily files from sat2cmaq appear to be the same between the two code versions, at least for VCD, scattering weights, and air mass factor (aside from different names used for these variables in the latest version of the code:

2010 OMI NO2 files processed with old code (Jan - April 2010 only):

/work/MOD3EVAL/css/EQUATES/work/satellite/OMNO2d/regrid/108NHEMI2/daily_files_old

2010 OMI NO2 files processed with new code (all days):

/work/MOD3EVAL/css/EQUATES/work/satellite/OMNO2d/regrid/108NHEMI2/daily_files

Scripts for CMAQ processing (include the ppb to ppm conversion needed for this set of runs since archived CMAQ 3D files are in ppb):

Old code:

/work/MOD3EVAL/css/EQUATES/work/satellite/scripts/run_convertcmaq_daily_108NHEMI2_2010_oldcode_crosscheck.csh

New code: (processes both O3 and NO2):

/work/MOD3EVAL/css/EQUATES/work/satellite/scripts/run_convertcmaq_omprofoz_omno2d_daily_108NHEMI2_2010.csh

chogrefe commented 3 years ago

Well, I have to correct my post above: What I presented as "old code" actually did not have the averaging kernel applied, due to a missing "\" at the end of the python argument list in my "old code" csh script that caused the AKPATH and AKKEY to not be passed to the code. So the "old code" results above were processed without any averaging kernel, and they are much higher than those processed when applying the averaging kernel (old or new code).

When trying to track this down, and before finally finding the problem in the csh script, I started writing out sample values of akf and ak in the cmaq2sat.py script. Here is what they look like for a sample grid cell across the 44 layers:

akf float ScatWgt (TSTEP, LAY, ROW, COL); // shape: (44,) ScatWgt :fill_value = -9e+36 ; ScatWgt :missing_value = -9e+36 ; ScatWgt :var_desc = "ScatteringWeight " ; ScatWgt :long_name = "ScatWgt " ; ScatWgt :units = "NoUnits" ; ScatWgt :expression = "AveragingKernel = ScatWgt[:]/AmfTrop[:]" ; array: [0.22593703866004944 0.22593703866004944 0.18369300663471222 0.10553789138793945 0.015271992422640324 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]

ak: [0.22593703866004944 0.22593703866004944 0.18369300663471222 0.10553789138793945 0.015271992422640324 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]

So, this explains why the column density du calculated when applying this kernel is so much lower than the one calculated without applying the kernel:

    pp = (ppm * dp)
    dus = hPa_to_du * pp
    du = (dus * ak).sum(1, keepdims=True)

What I don't have a good feel for is whether the ak numbers above are realistic - if I interpret it correctly, this means that only the first (or last, this may be top-to-bottom indexing) five layers of CMAQ values are multiplied with a non-zero factor. Is this what we'd expect?

Thanks for helping me to figure this out, and sorry about posting incorrect information about the "old code" calculations last week.

chogrefe commented 3 years ago

Here are some updated plots illustrating that something doesn't seem to work correctly when applying an AK in cmaq2sat.py, comparing seasonal H-CMAQ tropospheric column NO2 processed by cmaq2sat.py with and without AK applied for two years:

2010:

image

2013:

image

A comparison against OMI data processed with sat2cmaq.py shows reasonable agreement with the H-CMAQ column processed without AK applied in terms of overall magnitude, confirming that it's the H-CMAQ column processed with AK applied that seems to be problematic:

2010:

image

2013:

image

barronh commented 3 years ago

Great sleuthing. I’ll have to look further at this. Are the new results using the hybrid wrf coordinate? I should have some time this coming week.

chogrefe commented 3 years ago

Thanks! Yes, all these runs use the new WRF hybrid vertical coordinate.

barronh commented 3 years ago

What values do the new files have in the VGLVLS property?

chogrefe commented 3 years ago

VGLVLS = 1.f, 0.9975f, 0.9946f, 0.9913f, 0.9875f, 0.9831f, 0.9781f, 0.9723f, 0.9657f, 0.958f, 0.9492f, 0.9391f, 0.9275f, 0.9141f, 0.8987f, 0.881f, 0.8607f, 0.8373f, 0.8104f, 0.7795f, 0.7439f, 0.7066f, 0.6693f, 0.632f, 0.5946f, 0.5573f, 0.52f, 0.4827f, 0.4454f, 0.4081f, 0.3708f, 0.3352f, 0.3013f, 0.269f, 0.2383f, 0.2089f, 0.181f, 0.1543f, 0.1289f, 0.1047f, 0.0816f, 0.0596f, 0.0386f, 0.0186f, 0.f

The calculation of the edge pressure and dp seems fine (except for the fact that of course with the hybrid coordinate system, the psurf/ptop/vglvls approximation is worse than before), it's the calculation of the akf array itself that may be problematic. here are sample pedged and dp values from the cmaq2sat.py calculation:

pedges: float PRSFC (TSTEP, LAY, ROW, COL); // shape: (45,) PRSFC :fill_value = -999.0 ; PRSFC :long_name = "PRSFC " ; PRSFC :units = "Pa " ; PRSFC :var_desc = "surface pressure " ; array: [101237.5078125 100996.9140625 100717.828125 100400.2421875 100034.5390625 99611.09375 99129.90625 98571.7265625 97936.5546875 97195.53125 96348.640625 95376.6484375 94260.2890625 92970.703125 91488.6484375 89785.2421875 87831.625 85579.6640625 82990.875 80017.140625 76591.0859375 73001.421875 69411.765625 65822.109375 62222.82421875 58633.16015625 55043.50390625 51453.84375 47864.1875 44274.52734375 40684.8671875 37258.8125 33996.359375 30887.888671875 27933.3984375 25104.015625 22418.98828125 19849.44921875 17405.015625 15076.0673828125 12852.98046875 10735.755859375 8714.767578125 6790.017578125 5000.0]

edges dp: float PRSFC (TSTEP, LAY, ROW, COL); // shape: (44,) PRSFC :fill_value = -999.0 ; PRSFC :long_name = "PRSFC " ; PRSFC :units = "Pa " ; PRSFC :var_desc = "surface pressure " ; array: [2.405937433242798 2.7908594608306885 3.1758594512939453 3.657031297683716 4.234453201293945 4.811874866485596 5.581796646118164 6.351718902587891 7.410234451293945 8.46890640258789 9.719922065734863 11.163593292236328 12.895859718322754 14.820547103881836 17.0340633392334 19.536170959472656 22.519609451293945 25.887889862060547 29.737342834472656 34.26054763793945 35.89664077758789 35.89656066894531 35.89656066894531 35.99285125732422 35.89664077758789 35.89656066894531 35.896602630615234 35.89656066894531 35.896602630615234 35.896602630615234 34.26054763793945 32.62453079223633 31.084707260131836 29.544902801513672 28.2938289642334 26.85027313232422 25.695390701293945 24.4443359375 23.28948211669922 22.23086929321289 21.1722469329834 20.209882736206055 19.247499465942383 17.900175094604492]

barronh commented 3 years ago

I may have it, and I should have noticed right away. The averaging kernel should be calculated as follows: ScatWgt[:]/AmfTrop[:, 0]

AmfTrop is stored in 3D, but only has valid values for the surface. This is a 2D variable, but IOAPI cannot mix dimensions in a file... So, the upper layers are just missing. So by using only layer one (AmfTrop[:, 0]), all layers of ScatWgt are divided by that the surface level from AmfTrop. This is what we wanted to happen.

The definition that you were using was ScatWgt[:]/AmfTrop[:]. This creates masked values at all but the first layer because the denominator is masked at all levels except the first. The reason you had non-zero layer above layer one is that the first OMI layer is then interpolated over CMAQ layers it overlaps.

The larger question of vertical interpolation is separate from this issue.

chogrefe commented 3 years ago

Thank you. Yes, this seems to do it, I've started to reprocess 2010 and the results for the first few days look good in terms of magnitude. I'll go ahead and reprocess all years, but I expect that this solves the problem for OMNO2d. The OMPROFOZ issue discussed in comments 3-4 seems separate so I'm leaving that alone for now.

barronh commented 3 years ago

I am sorry that the OMPROFOZ part of this has stalled. I've gone back and forth on several issues. At this point, the vertical coordinate and unit treatment of the averaging kernel is working. However, the final calculation is not for several reasons.

  1. As described in the user guide[1] and applied in my previous projects, the right way to apply the kernel for ozone is meaningfully different than other gases. The appropriate application is is $VCD = \sum_j{P_j + \sum_i{K_ij * (Y_i - P_i)}}$. This is very different than application than for the optically thin gases, but is easy enough to implement.
  2. The interpolation is more complex. The prior is in dobson units, which means that interpolation to the model grid would require either a mass conserving interpolation or conversion to mixing ratio (then back later). Alternatively, the CMAQ could be put on the satellite pressure grid. (It uses a pixel varying vertical grid that is not sigma or hybrid.)
  3. There may be a unit problem... This may just be a notes issue, but I am documenting it here, so I don't forget. The spec file[2] shows that the Averaging Kernel scaling factor should be 1e-4, but the downloaded files have 1e-5.

Any of these issues can be solved. What I am struggling with is how many exceptions to the rule should be implemented for a single data product. We may just want to apply the averaging kernel for OMPROFOZ in a separate process.

[1] https://avdc.gsfc.nasa.gov/pub/data/satellite/Aura/OMI/V03/L2/OMPROFOZ/OMPROFOZ_readme-v3.pdf [2] https://avdc.gsfc.nasa.gov/pub/data/satellite/Aura/OMI/V03/L2/OMPROFOZ/OMPROFOZ.pdf

barronh commented 2 years ago

I reached out to the developers of OMI PROFOZ. They confirmed that the correct scaling factor is 1e-4 and stated that the files will be corrected in an upcoming update. At the time of that discussion, the update was expected in June 2021.

barronh commented 5 months ago

This comment is relevant to v1. As of v2, it is not an issue.