zxdawn / S5P-WRFChem

Core code for the TROPOMI NO2 retrieval based on WRF-Chem outputs
GNU General Public License v3.0
13 stars 2 forks source link

Small AMFs compared with official AMFs #5

Closed zxdawn closed 3 years ago

zxdawn commented 4 years ago

Describe the bug As shown in #4, retrieved AMFs (AMF CHEM) are smaller than official AMFs (AMF S5P).

Expected behavior Retrieved AMF should be similar to official AMF over clear pixels.

Cause This may be caused by the wrong anthropogenic emissions in WRF-Chem. I will replace MEIC 2016 with new data including PM2.5, NOx, and SO2 data.

Updates (Solution)

There're two main parameters causing this issue:

  1. The bAMFs in the LUT file are normalized by amf_geo
  2. The TROPOMI product previous to version 2 used apparent_scene_pressure instead of cloud_pressure_crb, when crf is larger than 0.9999. However, we used the cloud_pressure_crb directly from the version 2 test data, as the retrieval method has been improved for high clouds.
zxdawn commented 4 years ago

Here's the comparison between simulation (with MEIC 2016) and EMC station observation: image

zxdawn commented 4 years ago

Both AMFClr and AMFCld are smaller than the official AMFs.

Another bug is the tropopause pressures which are mostly between 96 hPa to 97.5 hPa. That's totally wrong!

debug

zxdawn commented 4 years ago

The tropopause index (starting from 0) is right.

And according to this ACP paper, the smaller AMFs are common for urban areas.

It's better to ask for the a priori profiles in the official data to compare with WRF-Chem NO2 profiles.

zxdawn commented 4 years ago

Attach the question email with Jos below:

To make sure the basic algorithm in my script is correct, I used a priori profiles to compute some variables again for one pixel. My AMFClr is 0.668 which is smaller than the official one (0.964), shown in the title of figure.

no2_profile_1

Here're steps I tried to figure out the problem:

1. Integration method is correct

To make sure my integration method is correct, I compared Ghost and vcdGnd (from the surface to the tropopause). vcdGnds are the same while my Ghost value is a little larger because of one more layer included. Anyway, that won't affect the comparison of AMFClr. The difference in AMFClr is caused by smaller scdClr because we have the same vcdGnd.

2. The abnormal sharp decrease in bAMFClr near the ground (panel b)

I used linear interpolation to get the bAMFClr at the model middle pressure levels (full levels). Then, I checked the bAMF in the LUT file by choosing the parameters similar to these of the pixel and two different surface pressures (978 hPa and 1013 hPa).

When the surface pressure (998.5 hPa in this case) is between these two pressure levels, the linear interpolation will lead to a sharp decrease in bAMF. How did you deal with this?

image image

3. Smooth bAMFClr (panel b)

I manually assign the two bAMF near the ground to make the bAMFClr line looks smooth. But, the scdClr and AMFClr (0.738) are still smaller than the official values.

no2_profile_2

4. The interpolation method (panel b)

I set the bAMFClr to 1.02 for all pressure levels and found the AMFClr (0.814) is still smaller ...

no2_profile_3

5. Correct method?

The temperature correction factor and the no2 profile looks well. So, I suppose the only possible issue is bAMFClr, but AMFClr in step 4 indicates that it's not the only issue.

Script

Click to check the script

```python import os import scipy import xarray as xr import proplot as plot import numpy as np from matplotlib.offsetbox import AnchoredText def integPr(no2_profile, p_level): # no2_profile: full level; p_level, half level subcolumn = 10 * R *T0 /(g0*p0) \ * no2_profile*1e6 \ * abs(p_level.diff('dim_0')) * 2.6867e16 # DU to moleclues/cm2 return subcolumn def cal_bamf(lut, albedo, cld_albedo, p_sfc, p_cld, dphi, mu0, mu): '''Calculate the Box-AMFs based on the LUT file Args: - albedo: Surface albedo - cld_albedo: Cloud albedo - p_sfc: Surface pressure - p_cld: Cloud pressure - dphi: Relative azimuth angle - mu: Cosine of viewing zenith angle - mu0: Cosine of solar zenith angle - amf: Box air mass factor ''' new_dim = ['x'] # get vars albedo = xr.DataArray(albedo, dims=new_dim) cld_albedo = xr.DataArray(cld_albedo, dims=new_dim) dphi = xr.DataArray(dphi, dims=new_dim) mu0 = xr.DataArray(mu0, dims=new_dim) mu = xr.DataArray(mu, dims=new_dim) rza = 180 - abs(lut['vza'] - lut['sza']) da = lut['amf'].assign_coords(p=np.log(lut['amf'].p), p_surface=np.log(lut['amf'].p_surface)) bAmfClr_p = da.interp(albedo=albedo, p_surface=np.log(p_sfc), dphi=dphi, mu0=mu0, mu=mu, ) bAmfCld_p = da.interp(albedo=cld_albedo, p_surface=np.log(p_cld), dphi=dphi, mu0=mu0, mu=mu, ) # interpolate to TM5 full pressure levels bAmfClr = bAmfClr_p.interp(p=np.log(p_full_level)) bAmfCld = bAmfCld_p.interp(p=np.log(p_full_level)) return bAmfClr, bAmfCld # input data for one specific pixel itropo = 21 p_tropo = 90.98486 p_cld = 164.55855 p_sfc = 998.5987 albedo = 0.26507807 cld_albedo = 1.0 dphi = 157.53604 mu0 = 0.9399877 mu = 0.9665278 vcdStrato = 3121409916973061.5 # constant R = 287.3 T0 = 273.15 g0 = 9.80665 p0 = 1.01325e5 ts = 220 # temperature of cross-section [K] # read lut lut = xr.open_dataset('./s5p_data/S5P_OPER_LUT_NO2AMF_00000000T000000_99999999T999999_20160527T173500.nc') # half level p_level = np.array([9.98598694e+02, 9.90822266e+02, 9.72398499e+02, 9.44450623e+02, 9.03277039e+02, 8.45252808e+02, 7.85315308e+02, 6.94897278e+02, 6.14334473e+02, 5.32470337e+02, 4.55775391e+02, 3.72291473e+02, 3.14727386e+02, 2.76442657e+02, 2.42013336e+02, 2.21057053e+02, 1.92411560e+02, 1.66872025e+02, 1.44183777e+02, 1.30521729e+02, 1.12055374e+02, 9.58255463e+01, 8.61441727e+01, 7.31186905e+01, 6.15607414e+01, 4.49081726e+01, 3.38174362e+01, 2.26543159e+01, 1.42377014e+01, 8.23967838e+00, 4.27592516e+00, 1.91338563e+00, 6.95205748e-01, 1.86089307e-01, 0.00000000e+00]) no2_profile = np.array([3.94732104e+03, 1.66271289e+03, 1.10038196e+03, 7.64539978e+02, 3.32407166e+02, 1.37028000e+02, 8.97676926e+01, 7.64529037e+01, 6.59814987e+01, 6.35099525e+01, 5.71410294e+01, 7.18070984e+01, 1.17980721e+02, 1.30275177e+02, 1.24757011e+02, 1.13242134e+02, 1.00192368e+02, 8.08833313e+01, 6.17135696e+01, 4.13122444e+01, 3.51919518e+01, 5.98989258e+01, 1.07456207e+02, 2.67693268e+02, 6.29109680e+02, 1.23314221e+03, 2.08980884e+03, 3.54495215e+03, 5.35783594e+03, 5.41150000e+03, 4.14452295e+03, 1.52739197e+03, 1.01487373e+02, 1.51449704e+00]) no2_profile /= 1e12 tk = np.array([305.6537 , 304.41522, 302.5105 , 300.01678, 296.76825, 293.02716, 288.26202, 283.01044, 277.59723, 271.19284, 263.02283, 254.49118, 247.03717, 239.86603, 233.88919, 227.82767, 220.2498 , 212.72021, 206.55151, 200.73172, 196.35721, 196.09137, 198.13571, 203.49054, 209.8208 , 213.96767, 218.09471, 224.22435, 229.3142 , 235.83282, 251.07927, 260.7121 , 250.09435, 229.61119]) # convert to DataArray and calculate full pressure level no2_profile = xr.DataArray(no2_profile) p_level = xr.DataArray(p_level) # half level (upper and lower level) p_full_level = p_level.rolling({'dim_0': 2}).mean()[1:, ...] p_sfc = xr.DataArray(p_sfc) p_cld = xr.DataArray(p_cld) tk = xr.DataArray(tk) # the temperature correction factor, see TROPOMI ATBD file factor = 1 - 0.00316*(tk-ts) + 3.39e-6*(tk-ts)**2 # calculate box-AMF bAmfClr, bAmfCld = cal_bamf(lut, np.array([albedo]), np.array([cld_albedo]), p_sfc, p_cld, np.array([dphi]), np.array([mu0]), np.array([mu])) # assign bAMF manually #bAmfClr.load()[-1] = 1.02 #bAmfClr.load()[-1][1] = 0.87 #bAmfClr.load()[-1][0] = 0.86 # get the scattering weights clearSW = no2_profile * bAmfClr * factor cloudySW = no2_profile * bAmfCld * factor # integration tropo_layer = ((p_level >= p_tropo) & (p_level <= p_sfc))[:-1] ghost_layer = ((p_level >= p_cld ) & (p_level <= p_sfc))[:-1] subcolumn = integPr(no2_profile, p_level) vcdTotal = subcolumn.sum('dim_0') vcdGnd_official = vcdTotal - vcdStrato vcdGnd = subcolumn.where(tropo_layer).sum('dim_0') ghost = subcolumn.where(ghost_layer).sum('dim_0') subcolumn = integPr(clearSW, p_level) scdClr = subcolumn.where(tropo_layer).sum('dim_0') amfClr = scdClr/vcdGnd # ---------- plot functions ------------ f, axs = plot.subplots(nrows=2, ncols=2, share=0) axs.format(abc=True, abcstyle='(a)') # -- subplot1 -- ax = axs[0] x = no2_profile*1e12 y = p_full_level ax.plot(x, y) # plot no2 at all full levels and ghost s_all = ax.scatter(x, y, size=8, label='no$_2$ at full levels') # plot circles in different colors s_tropo = ax.scatter(x.where(tropo_layer), y.where(tropo_layer), size=12, facecolors='none', edgecolor='red5', label='no$_2$ between surface and tropopause') s_ghost = ax.scatter(x.where(ghost_layer), y.where(ghost_layer), size=15, facecolors='none', edgecolor='green5', label='no$_2$ between surface and cloud') # plot the levels l_tropo = ax.axhline(p_tropo, color='red5', label='Tropopause') l_cld = ax.axhline(p_cld, color='green5', label='Cloud') l_sfc = ax.axhline(p_sfc, color='orange5', label='Surface') for half_p in p_level: l_half = ax.axhline(half_p, color='black', linestyle='--', linewidth=0.1, label='Half levels') ax.legend([s_all, s_tropo, s_ghost, l_tropo,l_cld, l_sfc, l_half], loc='l', ncols=1, fontsize=8) # set axis ax.format(xlim=(0, 5000), ylim=(1.02e3, 50), grid=False, ylabel='Pressure (hPa)', xlabel='NO$_2$ (pptv)', ) # add annotation amf_result = 'amfClr Official: 0.9644712 | Xin: {:.3f}'.format(amfClr.values[0]) ghost_result = '\n Ghost Official: 4.291e+15 | Xin: ' + '{:.3e}'.format(ghost.values) vcd_result = '\n vcdGnd Official: {:.3e} | Xin: {:.3e}'.format(vcdGnd_official.values, vcdGnd.values) scd_reault = '\n scdClr official = amfClr(official)*vcdGnd(official) = {:.3e} | Xin = amfClr*vcdGnd = {:.3e}'.format(0.9644712*vcdGnd_official.values, scdClr.values[0]) content = amf_result + ghost_result + vcd_result + scd_reault axs.format(suptitle=content) #at = AnchoredText(content, # loc='upper left', prop=dict(size=8), frameon=True, # bbox_to_anchor=(0, 1.4), # bbox_transform=ax.transAxes # ) #at.patch.set_boxstyle("round,pad=0.,rounding_size=0.2") #ax.add_artist(at) # -- subplot2 -- ax = axs[1] x = bAmfClr.isel(x=0) y = p_full_level ax.plot(x, y) ax.scatter(x, y, size=8) ax.format(xlabel='bAMFClr', ylim=(1.02e3, 50)) # -- subplot3 -- ax = axs[2] x = factor y = p_full_level ax.plot(x, y) ax.scatter(x, y, size=8) ax.format(xlabel='factor', ylim=(1.02e3, 50)) # -- subplot4 -- ax = axs[3] x = clearSW.isel(x=0)*1e12 y = p_full_level ax.plot(x, y) ax.scatter(x, y, size=8) ax.format(xlabel='(1e-12) clearSW = NO$_2$ * bAmfClr * factor', ylim=(1.02e3, 50), xlim=(0, 1800)) f.savefig('no2_profile.png') ```

zxdawn commented 4 years ago

For the smooth bAMFClr, we can replace zero values with nan and extrapolate the data.

It seems the important issue is the distribution of bAMF.

Here's an example of my bAMFClr (0.925 ~ 1.05) and bAMFCld: bAMF

The decreasing trend of bAMFCld looks reasonable if I check the AMF by Panoply software: Screenshot from 2020-10-29 16-10-46

I checked some distributions of AMF in others' papers and their AMFs are larger than 1.

  1. Alba Lorente's paper µ0 = µ = 0.8 (θ0 = θ = 37◦), ϕ = 60◦, surface albedo = 0.05, surface pressure = 1013 hPa: image

  2. Song Liu's paper: They focus on GOME-2 and parameters are: The cloud radiance fraction is 0.47, the cloud optical depth is 6.85, the SZA is 69◦, the VZA is 3◦, and the RAA is 42◦ image

Here's the comparison between the official one and my AMFs: Official Xin
AMFClr 0.7697352766990662 0.7343801936146834
AMFCld 0.1417810022830963 0.06731122269090602
zxdawn commented 3 years ago

Update: I asked for help from Ronald van der A, but he said nothing is wrong with my script.

Looking forward to the response from Henk Eskes after the Christmas holiday.

zxdawn commented 3 years ago

Response from Ronald:

  1. The first thing is the most important: Henk told me that the LUT consists of normalized AMFs, meaning that it contains the AMF/AMFgeo. The geometrical AMFgeo equals 1/mu+1/mu0. Henk is using this to have smoother interpolations and after the interpolations he multiplies it with the AMFgeo again. This explains about a factor 2 difference already.

  2. The interpolation at the surface can be tricky when you have pressure levels below ground level, so it is useful to carefully check this for the AMFclr. This has caused problems for us in the past.

  3. An easy way to validate your bAMF is by plotting both the averaging kernel and the bAMF/(AMFstrat+AMFtrop). These quantities should be exactly equal. Therefore you can see if you have a deviating layer or not.

zxdawn commented 3 years ago

CC my response here:

I checked the official bAMF using averaging kernel for CRF=0 pixel. Then, I open the L2 file by Panoply software directly by choosing the parameters and got the distribution of bAMF in the LUT. LUT_bAmf image

As Henk said, the bAMF in the LUT is normalized, I divide the bAMF derive from the L2 product by AMF_geo, as shown in the first figure. L2_bAmf

However, the values are different! Do you have any idea which is wrong?

I attach the output of the script here:

crf 0.0 surface_albedo 0.20823705196380615 relative_azimuth_angle 18.895615 cosine_solar_zenith_angle 0.9362971 cosine_viewing_zenith_angle 0.9133534 bAMF_normalized [0.5374612 0.5618768 0.81474555 0.8910857 0.9932143 1.1032735 1.2353718 1.3787956 1.5107466 1.6356691 1.7675163 1.8851482 1.9726447 2.0425434 2.0980964 2.1528034 2.2161083 2.2793055 2.3293574 2.3706686 2.3976104 2.4060323 2.3889966 2.346349 2.2956784 2.254053 2.2268665 2.1776745 2.1387117 2.0977757 1.9989262 1.9279331 2.0097158 2.1230757 ]

zxdawn commented 3 years ago

CC my new response here:

Sorry for the misleading ... I replotted the bAMFs.

Here's the explanation of each line:

1) bAMF interpolated from the LUT nc file using six parameters from the L2 product 2) bAMFamf_geo. Because Henk said the bAMF in the LUT is normalized, I multiply it to get the not-normalized value. 3) bAMFamf_geo*T_factor. As the bAMF in the L2 product seems to be temperature corrected, I added it here. 4) bAMF calculated only using the variables in L2 product.

Hope it's clear now ;) A large difference exists in each layer when 3) is compared with 4).

BTW, I compared the normalized bAMFs in the last email because I don't want my interpolation method to cause any error. Actually, it's not related to the interpolation method. The 1) profile is similar to the heatmap (around 1 between 800 hPa and 200 hPa) from the Panoply software output.

So, there may be something wrong with either the LUT data or the L2 product? Otherwise, they should coincide well together.

bAMFs

zxdawn commented 3 years ago

CC my new response here:

Dear Henk,

I realize I made a mistake of surface albedo! I used a variable called surface_albedo instead of surface_albedo_nitrogendioxide_window.

After correcting the issue, I got a similar bAMF profile. However, there are still some deviations that are larger with larger pressure. bAMFs_coorect_albedo

Besides, I compared the interpolation of bAMFs without limiting pressure levels to TM5 pressure levels and the direct selection of bAMFs by the nearest method. bAMFs_interp_nearest

So, the kink is caused by the interpolation. I also tried cubic interpolation instead of linear interpolation, the result is the same. Of course, I can set the 0 values to nan and use extrapolation to get a smooth curve. But, the kink also exists in the amf_total*Ak. Does that mean my interpolation method is correct?

Glad that we are close to the correct reproduction now!

zxdawn commented 3 years ago

CC my new response here:

Dear Henk,

You are right. The kink is abnormal and can lead to a large deviation for the high cloud with large CRF.

I picked one cloudy pixel as an example. The large differences exist near the cloud pressure level. bAMFs_cld_pixel

My interpolation method is interpolating the AMF in LUT to pressure full levels (middle of the base and top pressures at each layer). Besides, the bAMFCld is interpolated by replacing surface_albedo_nitrogendioxide_window and surface_pressure with cloud_albedo_crb and cloud_pressure_crb, respectively.

Is there anything wrong with my method?

zxdawn commented 3 years ago

This problem seems to be caused by the different cloud pressure used in the retrievals.

The official product seems to use apparent_scene_pressure instead of cloud_pressure_crb.

CC my response here:

I'm afraid the TROPOMI product uses the apparent_scene_pressure instead of cloud_pressure_crb. I checked the other two high cloud pixels and they all show the results based on apparent_scene_pressure coincide with the averaging kernel well.

I attach the three exported netCDF files in case you want to check. Basic information (ground_pixel, scanline, crf, apparent_scene_pressure, and cloud_pressure_crb) is shown as the title of each figure.

test 1

bAMFs_CP1

test 2

bAMFs_CP2

test 3

bAMFs_CP3

zxdawn commented 3 years ago

Finally, we found out the cause:

They are both correct.

For the official calculation an exception is built in for cloud fractions higher than (or equal to) 1. These are the saturated pixels, often with lightning, that you are interested in. Since the cloud fraction can not be trusted in that case, the apparent scene pressure is used to have a better estimate.

This is exactly how it is calculated:

       ! Cloud model switch to scene pressure mode in case of cloud fractions > 1.0                                                                                                                 
       do iObs = 1, NObs                                                                                                                                                                            
          if ( no2Tr%cloudFraction(iObs) > 0.9999 ) then                                                                                                                                            
             no2Tr%cloudFraction(iObs) = 1.0                                                                                                                                                        
             no2Tr%cloudRadianceFraction(iObs) = 1.0                                                                                                                                               
             no2Tr%cloudTopPressure(iObs) = no2Tr%scenePressure(iObs)                                                                                                                              
             no2Tr%cloudAlbedo(iObs) = min(1.0,no2Tr%sceneAlbedo(iObs))                                                                                                                            
          end if                                                                                                                                                                                    
       end do                                                                                                                                                                                       

However, in the test version you are using, the cloud parameters are corrected and the cloud fraction is below 1.0, so no corrections are needed and the standard averaging kernel would be correct: so your calculation is correct. If you will check non-saturated pixels you will find a one-on-one correspondence again.

For the next version (v2) of the TROPOMI NO2 product these issues will be solved anyway.