Closed HolgerPollyNet closed 2 years ago
"We realized that in P3.+ some variables are now float instead of double, might this cause the problem and shall we change this back @ZPYin ?"
Here the differences are listed:
v2.0 | v3.1
<variable name="LR_aeronet_1064" shape="method" type="double"> | <variable name="LR_aeronet_1064" shape="method" type="float">
<variable name="LR_aeronet_355" shape="method" type="double"> | <variable name="LR_aeronet_355" shape="method" type="float">
<variable name="LR_aeronet_532" shape="method" type="double"> | <variable name="LR_aeronet_532" shape="method" type="float">
<variable name="RH" shape="height" type="double"> | <variable name="RH" shape="height" type="float">
<variable name="WVMR" shape="height" type="double"> | <variable name="WVMR" shape="height" type="float">
<variable name="aerBsc_RR_1064" shape="height" type="double"> | <variable name="aerBsc_RR_1064" shape="height" type="float">
<variable name="aerBsc_aeronet_1064" shape="height" type="double"> | <variable name="aerBsc_aeronet_1064" shape="height" type="float">
<variable name="aerBsc_aeronet_355" shape="height" type="double"> | <variable name="aerBsc_aeronet_355" shape="height" type="float">
<variable name="aerBsc_aeronet_532" shape="height" type="double"> | <variable name="aerBsc_aeronet_532" shape="height" type="float">
<variable name="aerBsc_klett_1064" shape="height" type="double"> | <variable name="aerBsc_klett_1064" shape="height" type="float">
<variable name="aerBsc_klett_355" shape="height" type="double"> | <variable name="aerBsc_klett_355" shape="height" type="float">
<variable name="aerBsc_klett_532" shape="height" type="double"> | <variable name="aerBsc_klett_532" shape="height" type="float">
<variable name="aerBsc_raman_1064" shape="height" type="double"> | <variable name="aerBsc_raman_1064" shape="height" type="float">
<variable name="aerBsc_raman_355" shape="height" type="double"> | <variable name="aerBsc_raman_355" shape="height" type="float">
<variable name="aerBsc_raman_532" shape="height" type="double"> | <variable name="aerBsc_raman_532" shape="height" type="float">
<variable name="aerExt_RR_1064" shape="height" type="double"> | <variable name="aerExt_RR_1064" shape="height" type="float">
<variable name="aerExt_raman_1064" shape="height" type="double"> | <variable name="aerExt_raman_1064" shape="height" type="float">
<variable name="aerExt_raman_355" shape="height" type="double"> | <variable name="aerExt_raman_355" shape="height" type="float">
<variable name="aerExt_raman_532" shape="height" type="double"> | <variable name="aerExt_raman_532" shape="height" type="float">
<variable name="aerLR_RR_1064" shape="height" type="double"> | <variable name="aerLR_RR_1064" shape="height" type="float">
<variable name="aerLR_raman_1064" shape="height" type="double"> | <variable name="aerLR_raman_1064" shape="height" type="float">
<variable name="aerLR_raman_355" shape="height" type="double"> | <variable name="aerLR_raman_355" shape="height" type="float">
<variable name="aerLR_raman_532" shape="height" type="double"> | <variable name="aerLR_raman_532" shape="height" type="float">
<variable name="altitude" shape="method" type="double"> | <variable name="altitude" shape="method" type="float">
<variable name="end_time" shape="method" type="double"> <variable name="end_time" shape="method" type="double">
<variable name="height" shape="height" type="double"> | <variable name="height" shape="height" type="float">
<variable name="latitude" shape="method" type="double"> | <variable name="latitude" shape="method" type="float">
<variable name="longitude" shape="method" type="double"> | <variable name="longitude" shape="method" type="float">
<variable name="parDepol_klett_355" shape="height" type="double"> | <variable name="parDepol_klett_355" shape="height" type="float">
<variable name="parDepol_klett_532" shape="height" type="double"> | <variable name="parDepol_klett_532" shape="height" type="float">
<variable name="parDepol_raman_355" shape="height" type="double"> | <variable name="parDepol_raman_355" shape="height" type="float">
<variable name="parDepol_raman_532" shape="height" type="double"> | <variable name="parDepol_raman_532" shape="height" type="float">
<variable name="pressure" shape="height" type="double"> | <variable name="pressure" shape="height" type="float">
<variable name="reference_height_1064" shape="reference_height" type="double"> | <variable name="reference_height_1064" shape="reference_height" type="float">
<variable name="reference_height_355" shape="reference_height" type="double"> | <variable name="reference_height_355" shape="reference_height" type="float">
<variable name="reference_height_532" shape="reference_height" type="double"> | <variable name="reference_height_532" shape="reference_height" type="float">
<variable name="shots" shape="method" type="double"> | <variable name="shots" shape="method" type="short">
<variable name="start_time" shape="method" type="double"> <variable name="start_time" shape="method" type="double">
<variable name="temperature" shape="height" type="double"> | <variable name="temperature" shape="height" type="float">
<variable name="uncertainty_parDepol_klett_355" shape="height" type="double"> | <variable name="uncertainty_parDepol_klett_355" shape="height" type="float">
<variable name="uncertainty_parDepol_klett_532" shape="height" type="double"> | <variable name="uncertainty_parDepol_klett_532" shape="height" type="float">
<variable name="uncertainty_parDepol_raman_355" shape="height" type="double"> | <variable name="uncertainty_parDepol_raman_355" shape="height" type="float">
<variable name="uncertainty_parDepol_raman_532" shape="height" type="double"> | <variable name="uncertainty_parDepol_raman_532" shape="height" type="float">
<variable name="volDepol_klett_355" shape="height" type="double"> | <variable name="volDepol_klett_355" shape="height" type="float">
<variable name="volDepol_klett_532" shape="height" type="double"> | <variable name="volDepol_klett_532" shape="height" type="float">
<variable name="volDepol_raman_355" shape="height" type="double"> | <variable name="volDepol_raman_355" shape="height" type="float">
<variable name="volDepol_raman_532" shape="height" type="double"> | <variable name="volDepol_raman_532" shape="height" type="float">
<variable name="zenith_angle" shape="method" type="double"> | <variable name="zenith_angle" shape="method" type="float">
... and indeed, that caused the problem!
I just tested it for the generated profile-file in the picasso-proc-chain (lib/io/pollySaveProfiles.m) by setting the type of every variable to 'double' and reprocessed a level0 file to level1 file. Afterwards I executed the polly2scc script with this new level1-nc-file and no error occured!
Here is what we found out so far: the following error will be printed, when trying to execute polly2scc.py with picasso_v3.1 nc-files:
result = np.where(m, fa, umath.power(fa, fb)).view(basetype)
Traceback (most recent call last):
File "polly2scc.py", line 2319, in <module>
main()
File "polly2scc.py", line 2314, in main
args.range_lim_b, args.range_lim_e, args.camp_info, args.force)
File "polly2scc.py", line 2185, in polly2scc
dims, data, global_attris = p2e_convertor.read_data_file(task)
File "polly2scc.py", line 307, in read_data_file
self.__read_picasso_results(filename, **kwargs)
File "polly2scc.py", line 1220, in __read_picasso_results
temperature=pData['temperature'][refMask355] + 273.16))
File "<__array_function__ internals>", line 6, in nanmean
File "C:\Users\baars\.conda\envs\polly2earlinet_db\lib\site-packages\numpy\lib\nanfunctions.py", line 951, in nanmean
avg = _divide_by_count(tot, cnt, out=out)
File "C:\Users\baars\.conda\envs\polly2earlinet_db\lib\site-packages\numpy\lib\nanfunctions.py", line 213, in _divide_by_count
return np.divide(a, b, out=a, casting='unsafe')
ValueError: output array is read-only
Furthermore I found out, that in the function „beta_pi_rayleigh“, which is implemented from the molecular.rayleigh_scattering, returns an empty masked_array for the picasso_v3.1-files. If I compress this array to the masked range of this array by using 'masked_array.compressed()':
refBscMol355 = np.nanmean(
beta_pi_rayleigh(
355,
pressure=pData['pressure'][refMask355],
temperature=pData['temperature'][refMask355] + 273.16).compressed())
the polly2scc script will generate the earlinet-nc-file, but with an empty 'backscatter_calibration_value'.
Afterwards I checked the function „sigma_rayleigh“ :
def sigma_rayleigh(wavelength, pressure=1013.25, temperature=288.15, C=385., rh=0.):
''' Calculates the Rayleigh-scattering cross section per molecule.
Parameters
----------
wavelength: float
Wavelegnth [nm]
pressure: float
The atmospheric pressure [hPa]
temperature: float
The atmospheric temperature [K]
C: float
CO2 concentration [ppmv].
rh: float
Relative humidity from 0 to 100 [%]
Returns
--------
sigma: float
Rayleigh-scattering cross section [m2]
'''
p_e = rh_to_pressure(rh, temperature)
# Just to be sure that the wavelength is a float
wavelength = np.array(wavelength, dtype=float)
# Calculate properties of standard air
n = n_air(wavelength, pressure, temperature, C, rh)
N = number_density_at_pt(pressure, temperature, rh, ideal=ASSUME_AIR_IDEAL)
# Wavelength of radiation
wl_m = wavelength * 10 ** -9 # from nm to m
# King's correction factor
f_k = kings_factor_atmosphere(wavelength, C=C, p_e=p_e, p_t=pressure) # no units
# first part of the equation
f1 = (24. * np.pi ** 3) / (wl_m ** 4 * N ** 2)
# second part of the equation
f2 = (n ** 2 - 1.) ** 2 / (n ** 2 + 2.) ** 2
# result
sigma = f1 * f2 * f_k
return sigma
I found out at which point the masked_array gets empty. This is when f1 will be calculated, when N gets squared!
N will be calculated by:
N = number_density_at_pt(pressure, temperature, rh, ideal=ASSUME_AIR_IDEAL)
If setting 'ideal' (compressibility of air) to False, the masked_array gets filled, when setting it to True (=?ASSUME_AIR_IDEAL), it will be empty for the v_3.1-nc-files. For the v_2.0-nc-files the polly2scc-script succeeded, independend of setting 'ideal' to True or to False.
@ZPYin ... do you have any clue what caused this problem in the function mentioned above?
Afterwards I checked the function „sigma_rayleigh“ :
def sigma_rayleigh(wavelength, pressure=1013.25, temperature=288.15, C=385., rh=0.): ''' Calculates the Rayleigh-scattering cross section per molecule. Parameters ---------- wavelength: float Wavelegnth [nm] pressure: float The atmospheric pressure [hPa] temperature: float The atmospheric temperature [K] C: float CO2 concentration [ppmv]. rh: float Relative humidity from 0 to 100 [%] Returns -------- sigma: float Rayleigh-scattering cross section [m2] ''' p_e = rh_to_pressure(rh, temperature) # Just to be sure that the wavelength is a float wavelength = np.array(wavelength, dtype=float) # Calculate properties of standard air n = n_air(wavelength, pressure, temperature, C, rh) N = number_density_at_pt(pressure, temperature, rh, ideal=ASSUME_AIR_IDEAL) # Wavelength of radiation wl_m = wavelength * 10 ** -9 # from nm to m # King's correction factor f_k = kings_factor_atmosphere(wavelength, C=C, p_e=p_e, p_t=pressure) # no units # first part of the equation f1 = (24. * np.pi ** 3) / (wl_m ** 4 * N ** 2) # second part of the equation f2 = (n ** 2 - 1.) ** 2 / (n ** 2 + 2.) ** 2 # result sigma = f1 * f2 * f_k return sigma
I found out at which point the masked_array gets empty. This is when f1 will be calculated, when N gets squared! N will be calculated by:
N = number_density_at_pt(pressure, temperature, rh, ideal=ASSUME_AIR_IDEAL)
If setting 'ideal' (compressibility of air) to False, the masked_array gets filled, when setting it to True (=?ASSUME_AIR_IDEAL), it will be empty for the v_3.1-nc-files. For the v_2.0-nc-files the polly2scc-script succeeded, independend of setting 'ideal' to True or to False.
You are right!!! @ulysses78
Aftering switching to float
in Picass v3.x, the variables were saved as numpy.float32
in the python conversion script, which were inherited from the nc file. However, it can cause overflow
in calculating Rayleigh scattering properties. The strange behavior pointed out by @ulysses78 is exactly where overflow
happens.
The value of atmospheric molecule number density (N) is (usually) at the order of magnitude of 25, while numpy.float32
covers the range of [-3.410^(-38), 3.410^38]. It causes error when executing N**2
.
I fixed this bug in the lastest commit. You can check the details inside.
@ZPYin and @ulysses78
Guys, i just wanted to say that you doing a great job. And fixing such nasty bug is really not easy! Many thanks!
The following error occurs:
File "polly2scc.py", line 1220, in read_picasso_results temperature=pData['temperature'][refMask355] + 273.16)) File "<array_function__ internals>", line 6, in nanmean
obviously caused by function "sigma_rayleigh" as @ulysses78 found out.
We realized that in P3.+ some variables are now float instead of double, might this cause the problem and shall we change this back @ZPYin ?