pernak18 closed this issue 6 years ago
there's not a lot i can do with item 4 -- pfile and vmrfile have similar pressures but with different precisions, so it's hard to do any kind of equality check. right now i just check to see that the pressure arrays are the same size
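one possible way around the precision mismatch is a tolerance-based comparison instead of strict equality. the sketch below is only an illustration: `pressures_match` and its `rel_tol` default are hypothetical, not part of the ABSCO code, and the right tolerance depends on how many digits each file actually carries.

```python
import math


def pressures_match(p_levels, vmr_levels, rel_tol=1e-3):
    """Return True if both pressure lists have the same length and agree
    element-wise to within rel_tol (a hypothetical, tunable tolerance)."""
    if len(p_levels) != len(vmr_levels):
        return False
    return all(math.isclose(p, v, rel_tol=rel_tol)
               for p, v in zip(p_levels, vmr_levels))


# made-up values: the second list carries fewer significant digits
print(pressures_match([1013.25, 506.625, 100.0044], [1013.0, 506.6, 100.0]))
```

the size check still comes first, so this would only tighten the existing test rather than replace it.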
for item 1, the only relevant molecules ("double agents", as i call them -- XS and line parameters exist for both) are NO2, SO2, CF4, and HNO3. i've added to the code what i think is required to handle these and ran a couple tests (NO2 for 1550-1600 cm-1 [line params] and 34690-34800 cm-1 [XS]; HNO3 for 200-600 cm-1 [XS] and 1550-1600 cm-1 [lines]). everything is working as i'd expect in these cases
it should be noted that NO2 and SO2 have identical density profiles regardless of whether XS or line parameters are used. that is not the case for HNO3 and CF4. this is handled in the code.
for item 7, i used this quick script:
#!/usr/bin/env python
import pandas as pd

# molecule names allowed by the ABSCO code
allowed = ['H2O', 'CO2', 'O3', 'N2O', 'CO', 'CH4', 'O2',
    'NO', 'SO2', 'NO2', 'NH3', 'HNO3', 'OCS', 'H2CO', 'N2',
    'HCN', 'C2H2', 'HCOOH', 'C2H4', 'CH3OH', 'CCL4', 'CF4',
    'F11', 'F12', 'F22', 'ISOP', 'PAN', 'HDO', 'BRO', 'O2-O2']

inCSV = '/home/pernak18/work/ABSCO/VMR/USS_AIRS_profile.csv'
csvDat = pd.read_csv(inCSV)
csvNames = csvDat.keys().values

for name in allowed:
    # is the molecule a column in the profile CSV?
    print(name, name in csvNames)
    if name not in csvNames:
        # if not, look for separate line-parameter (_HI) and XS (_XS) columns
        hiName = '%s_HI' % name
        xsName = '%s_XS' % name
        print('\t', hiName in csvNames, xsName in csvNames)
and the only guys that "fail" are F22, HDO, BRO, and O2-O2. the latter three i expect because they are special cases that i have to address eventually anyway, and F22 just has an alias (CHCLF2) that i need to use, which i have addressed in the makeABSCO constructor.
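the lookup order described above (alias first, then the plain name, then the _HI/_XS pair) can be sketched as follows. this is not the makeABSCO constructor code: `ALIASES` and `profile_columns` are hypothetical names, and only the F22 -> CHCLF2 mapping comes from this thread.

```python
# hypothetical alias table; only F22 -> CHCLF2 is taken from the thread
ALIASES = {'F22': 'CHCLF2'}


def profile_columns(name, csv_names):
    """Return the profile CSV column(s) that apply to a molecule name."""
    key = ALIASES.get(name, name)
    if key in csv_names:
        return [key]
    # molecules with both line parameters and XS carry two columns
    candidates = ('%s_HI' % key, '%s_XS' % key)
    return [col for col in candidates if col in csv_names]


# made-up subset of column names
columns = ['H2O', 'CHCLF2', 'HNO3_HI', 'HNO3_XS']
for molecule in ('H2O', 'F22', 'HNO3', 'HDO'):
    print(molecule, profile_columns(molecule, columns))
```

an empty list marks a molecule that still needs special handling, as with HDO in this example.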
for 10, we should just have to include some provision in ABSCO_preprocess.py that breaks up the bands if necessary, then just proceed as we do when the bands are smaller than 2000 cm-1.
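a minimal sketch of what that provision could look like is below. `split_band` is a hypothetical helper, not something in ABSCO_preprocess.py, and it assumes adjacent sub-bands simply share an endpoint; whether the real code needs overlap or padding between sub-bands is not settled here.

```python
def split_band(wn1, wn2, max_width=2000.0):
    """Split [wn1, wn2] (cm-1) into consecutive sub-bands no wider than
    max_width. Adjacent sub-bands share an endpoint (an assumption)."""
    if wn2 <= wn1:
        raise ValueError('wn2 must be greater than wn1')
    bands = []
    start = wn1
    while wn2 - start > max_width:
        bands.append((start, start + max_width))
        start += max_width
    bands.append((start, wn2))
    return bands


print(split_band(500.0, 5000.0))
print(split_band(500.0, 600.0))
```

bands that are already 2000 cm-1 or narrower come back unchanged as a single entry, so the same code path can serve both cases.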
for 9, we decided to continue to do one layer (2 levels) at a time in each TAPE5, while providing the entire user profile for each. this has produced successful LBLRTM runs for all H2O pressures and temperatures (and note that with these code changes, we are processing a different number of temperatures per level)
see https://github.com/pernak18/ABSCO/commit/2697366806b7f1dbb568062ddcbce1a4e85299bf
for 6, i did a trial run with CO2 200-600 cm-1 and found that up to 14 GB of RAM were being utilized when degrading by a factor of 4.
this can actually be improved upon -- there is no reason to store the wavenumber array associated with the OD array for each LBL run. for a given band, wavenumbers will always be the same.
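the idea of keeping a single wavenumber grid per band could be sketched like this. `BandOD` and `band_grid` are hypothetical names, not the ABSCO_tables.py data structures, and the grid is assumed to be evenly spaced between the band limits.

```python
from array import array


def band_grid(wn1, wn2, spacing):
    """Evenly spaced wavenumber grid for one band (assumed layout)."""
    n_pts = int(round((wn2 - wn1) / spacing)) + 1
    return array('d', (wn1 + i * spacing for i in range(n_pts)))


class BandOD:
    """One shared wavenumber grid plus one OD array per (P, T) run."""

    def __init__(self, wn1, wn2, spacing):
        self.wavenumber = band_grid(wn1, wn2, spacing)  # stored once
        self.od = {}

    def add_run(self, pressure, temperature, od_values):
        if len(od_values) != len(self.wavenumber):
            raise ValueError('OD array does not match the band grid')
        self.od[(pressure, temperature)] = array('d', od_values)


band = BandOD(500.0, 501.0, 0.25)
band.add_run(1013.0, 296.0, [0.1, 0.2, 0.3, 0.2, 0.1])
print(len(band.wavenumber), len(band.od))
```

with this layout each additional run costs one array instead of two, which roughly halves the per-run storage.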
for item 10, i tested the following configuration:
wn1 = 500 9000 100
wn2 = 600 800 200
res = 1e-4 1e-4 1e-4
degrade = 4 8 2
and everything looked good.
for 13, see:
https://www.unidata.ucar.edu/blogs/developer/entry/chunking_data_why_it_matters
https://www.unidata.ucar.edu/blogs/developer/en/entry/chunking_data_choosing_shapes
https://www.oreilly.com/library/view/python-and-hdf5/9781491944981/ch04.html
https://www.unidata.ucar.edu/blogs/developer/entry/netcdf_compression
http://unidata.github.io/netcdf4-python/#section9
https://stackoverflow.com/questions/38860344/how-to-set-chunk-size-of-netcdf4-in-python
https://stackoverflow.com/questions/46951981/create-and-write-xarray-dataarray-to-netcdf-in-chunks
https://stackoverflow.com/questions/12067876/handling-very-large-netcdf-files-in-python
i'm pretty sure i'm doing the optimal thing right now. the python netCDF4 library does compress by default and also applies a default chunking size. if we want to manually change the chunking sizes, more experimentation is needed. xarray also has not added any efficiency or compressed the output file any smaller. i did this for ozone 500-600 cm-1, 1e-4 cm-1 resolution, and factor of 4 degradation and the file is 624M with compression and chunking. that might not seem like a lot, but remember the bandwidth is only 100 cm-1 and some molecules have another (H2O_VMR) dimension, so this file size can balloon.
for item 8, O2-O2 is already accounted for in the O2 continuum
for item 16, i now stack all of the spectra on top of each other and utilize an Extent_Indices array as James originally suggested. this works out well -- trying to do the band dimension that i was doing when there were bands of unequal size (either inconsistent ranges or resolutions) was a headache.
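the stacking approach can be illustrated with a small sketch. `stack_spectra` is a hypothetical function, and the inclusive [start, end] convention for each extent is an assumption; the actual Extent_Indices layout in the netCDF output may differ.

```python
def stack_spectra(band_spectra):
    """Concatenate per-band spectra into one list and record the
    inclusive (start, end) indices that each band occupies."""
    stacked = []
    extents = []
    for spectrum in band_spectra:
        start = len(stacked)
        stacked.extend(spectrum)
        extents.append((start, len(stacked) - 1))
    return stacked, extents


# two bands of unequal length (made-up values)
stacked, extents = stack_spectra([[1.0, 2.0, 3.0], [4.0, 5.0]])
print(stacked)
print(extents)
```

because each band is recovered by slicing with its extent, bands with different ranges or resolutions no longer need a padded band dimension.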
for item 17, RAM usage is another prompt at the beginning of the code. there may be too many prompts, but it will be easy to remove them if requested. the RAM usage assumes a full run (all pressures, temperatures, bands, and WV VMR) for a given molecule
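a rough version of such an estimate is sketched below. it is not the calculation in the code: `estimate_ram_gb` is hypothetical, it assumes 8-byte floats, it assumes the degraded spacing is res * degrade, and the pressure and temperature counts in the example are made up.

```python
def estimate_ram_gb(bands, n_pressure, n_temperature, n_h2o_vmr=1,
                    bytes_per_value=8):
    """Estimate memory (GB) for the ODs of a full run of one molecule.

    bands: iterable of (wn1, wn2, res, degrade) tuples.
    Assumes the degraded grid spacing is res * degrade.
    """
    n_points = 0
    for wn1, wn2, res, degrade in bands:
        n_points += int(round((wn2 - wn1) / (res * degrade))) + 1
    n_values = n_points * n_pressure * n_temperature * n_h2o_vmr
    return n_values * bytes_per_value / 1e9


# one 200-600 cm-1 band at 1e-4 cm-1, degraded by 4; P and T counts are made up
print(estimate_ram_gb([(200.0, 600.0, 1e-4, 4)], n_pressure=10, n_temperature=10))
```

the number only covers the stored OD values, so any working arrays or a per-run wavenumber array would add to it.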
for item 19, see https://github.com/pernak18/ABSCO/commit/5027f51516989f39f1e1e81968f6931964d2ad6f
Items in ABSCO_tables.py that i have started working on but still need work and need to be addressed before the code delivery:
1. species that have both line parameters and XS files (_HI and _XS in USS_AIRS_profile.csv, which is generated with standard_atm_profiles.py)
2. User profile XS (records 3.7 and 3.7.1)
3. H2O scaling with PWV
4. Ensure consistency between P and profile CSV file
5. Save pressure layer values and write them to the eventual netCDF file (in addition to the level values)
6. with the TES XS ABSCO library (https://lex-gitlab.aer.com/RC/ABSCO_XS), we save the OD files, then do the post-processing (ABSCO calculation), but this is likely not necessary and consumes a lot of hard drive space, so i think we should not be doing this for the general ABSCO tables task. instead, we'll just save the ODs in memory in the processing (and we'll need to warn about RAM requirements)
7. Verify that I guarantee consistency between the molecule names in standard_atm_profiles.py and ABSCO_*.py
8. O2 dimer, BrO, heavy water special cases
9. Include all pressures in same TAPE5 rather than doing a single layer at a time (like we did with the previous XS ABSCO software)
10. Library should break up spectral regions into bands of 2000 cm-1.
11. Work WV VMR into CO2 and N2 processing because of H2O effects on continuum
12. Separate H2O VMR run in calcABSCO() then assemble ABSCO array with both runs
13. Chunking in the output netCDF
14. Verify valid data ranges (for output netCDF)
15. wavelength to wavenumber conversion
16. remove band/range dimension (read_ABSCO_tables.py, makeNC() in ABSCO_tables.py, netCDF templates?)
17. RAM computation and warning
18. continue updating README.md
19. move ABSCO_tables.py main() function into its own driver script (run_lbl_absco.py)
20. add O2 dimension
21. notification if -e2e, -lbl, and -lnfl are not provided