JCSDA-internal / ioda-converters

Various converters for getting obs data in and out of IODA

ioda-converters

The converters can be built and tested using ioda-bundle. In ioda-bundle the build of the converters is disabled by default (for now) so you must enable the build using the BUILD_IODA_CONVERTERS directive. Here is an example:

git clone https://github.com/jcsda-internal/ioda-bundle
cd ioda-bundle
mkdir build
cd build
ecbuild -DBUILD_IODA_CONVERTERS=ON ..
make -j4
ctest

Note, you will need to add the following to your $PYTHONPATH in order to run the converters.

export PYTHONPATH=$PYTHONPATH:/<path_to_ioda-bundle_build>/lib/pyiodaconv
export PYTHONPATH=$PYTHONPATH:/<path_to_ioda-bundle>/iodaconv/src

gsi_ncdiag

These scripts use classes defined in the gsincdiag Python library to convert output from GSI netCDF diag files into IODA observation files and GeoVaLs for UFO. To run GSI and produce the necessary files, see the feature/files_for_jedi branch in the ProdGSI repository.

The following executable scripts are to be used by the user:

For developers, or for those who need to change the names of input/output variables in the scripts, see the README in src/gsi_ncdiag for details.

marine

The marine converters all use the following command-line format, with some converters taking additional optional arguments as noted:

Usage: <converter.py> -i INPUT_FILE(S) -o OUTPUT_FILE -d YYYYMMDDHH

metar

This converter turns surface-observation METAR reports in a simple CSV format into an IODA-ready netCDF4 file. Currently, the CSV-formatted file should contain a header line such as the following: Unix_time,DateString,ICAO,Latitude,Longitude,Elev,Temp,Dewp,Wdir,Wspd,Wgst,Vis,Pcp,Pcp3h,Pcp6h,Pcp24h,QcFlag,WxString,WxCode,Altimeter,Cvg1,Bas1,Cvg2,Bas2,Cvg3,Bas3,Length,Raw

At this time, only the following variables are written to the netCDF4 file: air_temperature, surface_pressure (computed from altimeter setting), specific_humidity (computed from dewpoint temperature and surface_pressure), and eastward_wind/northward_wind (computed from wind speed and direction). All initial PreQC values are set to 2 (unchecked) and obserror is set to 0, since UFO software (under development) will be used to flag, discard, and quality-control these data. Other variables, such as horizontal visibility, are expected to be added as needed in the near future.

Usage: <converter.py> -i INPUT_FILE(S) -o OUTPUT_FILE -d YYYYMMDDHH
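
The derived METAR variables described above can be illustrated with a minimal Python sketch. It is not the converter's own code: the function names are hypothetical, the wind decomposition assumes the usual meteorological convention (direction is where the wind blows from, in degrees clockwise from north), and the specific humidity uses a Bolton-type approximation for vapor pressure with dewpoint in degrees Celsius and pressure in Pa, which may differ from the formula the converter actually uses.

```python
import math


def wind_components(speed, direction_deg):
    """Return (eastward_wind, northward_wind) from wind speed and direction.

    Assumes the meteorological convention: direction is where the wind
    comes FROM, in degrees clockwise from north.
    """
    theta = math.radians(direction_deg)
    eastward = -speed * math.sin(theta)
    northward = -speed * math.cos(theta)
    return eastward, northward


def specific_humidity(dewpoint_c, pressure_pa):
    """Approximate specific humidity (kg/kg) from dewpoint and pressure.

    Assumption: vapor pressure from a Bolton-type formula, with dewpoint
    in degrees Celsius and pressure in Pa.
    """
    vapor_pressure = 611.2 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))
    return 0.622 * vapor_pressure / (pressure_pa - 0.378 * vapor_pressure)


if __name__ == "__main__":
    # A 10 m/s wind from the west (270 degrees) is a purely eastward wind.
    print(wind_components(10.0, 270.0))
    print(specific_humidity(10.0, 100000.0))
```

A westerly wind (270 degrees) yields a positive eastward component and a near-zero northward component, which is a quick sanity check on the sign convention.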

compo

The compo converters include all converter scripts for aerosols and related chemistry variables.

For atmospheric composition, netCDF files are supported with <instrument>_<species>_nc2ioda.py.

Usage: <instrument>_<species>_nc2ioda.py -i input_tropomi_files.nc -o output_ioda_file.nc

For -i, you can specify a list of files with a shell wildcard, and the converter will write them all to one output file. This converter provides all fields needed for assimilation, including the observation value, error, averaging kernel, and a-priori information.

For AOD, viirs_aod2ioda.py converts the native netCDF format for VIIRS AOD550 optical depth observations to IODA netCDF format. Note that it takes only AOD550 explicitly and does not take the 11 AOD channels from VIIRS. The converter is run with the following format:

Usage: <converter.py> -i INPUT_FILE(S) -o OUTPUT_FILE -m nesdis -k maskout -t 0.0

The method option (-m) selects the bias and uncertainty calculation and takes default or nesdis: default sets bias and uncertainty to 0.0, and nesdis uses the NESDIS bias and uncertainty calculation method. The maskout option (-k) takes default or maskout: default keeps all missing values, and maskout does not write out missing values. The thinning option (-t) takes a value between 0.0 and 1.0 that sets how much data is thinned; 0.0 means no thinning.
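
As an illustration of the maskout and thinning options, here is a minimal Python sketch of how such filtering could work. It is an assumption-laden example and not the code in viirs_aod2ioda.py: the function name, the missing-value sentinel, and the choice to drop each observation independently with probability equal to the thinning value are all hypothetical.

```python
import random


def mask_and_thin(values, missing, thin=0.0, maskout=False, seed=None):
    """Return the observations that survive masking and random thinning.

    values  : list of observation values
    missing : sentinel marking a missing value (hypothetical convention)
    thin    : fraction in [0.0, 1.0]; each value is dropped with this probability
    maskout : if True, missing values are not written out
    seed    : optional seed for reproducible thinning
    """
    if not 0.0 <= thin <= 1.0:
        raise ValueError("thin must be within 0.0 and 1.0")
    rng = random.Random(seed)
    kept = []
    for value in values:
        if maskout and value == missing:
            continue  # maskout: skip missing values
        if rng.random() < thin:
            continue  # thinning: drop this observation
        kept.append(value)
    return kept


if __name__ == "__main__":
    obs = [0.12, -999.0, 0.30, 0.25]
    print(mask_and_thin(obs, missing=-999.0, maskout=True))
```

With thin set to 0.0 nothing is dropped by thinning, matching the "0.0 means no thinning" behavior described above.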

land

The land converters include all converter scripts for snowpack, soil, vegetation, and other surface-related land variables.

For OWP snow observations (snow_obs), owp_snow_obs.py converts a daily CSV file to a netCDF file.

usage: owp_snow_obs.py [-h] -i INPUT [-o OUTPUT] [--thin_swe THIN_SWE]
                       [--thin_depth THIN_DEPTH]
                       [--thin_random_seed THIN_RANDOM_SEED] [--err_fn ERR_FN]
optional arguments:
  --thin_swe THIN_SWE   percentage of random thinning for SWE, from 0.0 to 1.0.
                        Zero indicates no thinning is performed. (type: float,
                        default: 0.0)
  --thin_depth THIN_DEPTH
                        percentage of random thinning for snow depth, from 0.0
                        to 1.0. Zero indicates no thinning is performed. (type:
                        float, default: 0.0)
  --thin_random_seed THIN_RANDOM_SEED
                        A random seed for reproducible random thinning. Default
                        is total # seconds from 1970-01-01 to the day of the
                        data provided. (type: int, default: None)
  --err_fn ERR_FN       Name of error function to apply. The options are
                        hardcoded in the module, currently:['dummy_error'].
                        Default (none) uses ObsError column in the input file.
                        (type: str, default: None)
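
The default for --thin_random_seed described in the help text above (total seconds from 1970-01-01 to the day of the data) can be sketched in Python as follows. The function name is hypothetical, and treating the data day as midnight UTC is an assumption; owp_snow_obs.py may compute the value differently.

```python
from datetime import datetime, timezone


def default_thin_seed(yyyymmdd):
    """Seconds from 1970-01-01 to the start of the given day (assumed UTC)."""
    day = datetime.strptime(yyyymmdd, "%Y%m%d").replace(tzinfo=timezone.utc)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return int((day - epoch).total_seconds())


if __name__ == "__main__":
    print(default_thin_seed("20200228"))
```

Deriving the seed from the data date means that repeated runs over the same day thin the same observations, while different days get different random selections.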

For snow cover fraction (scf), IMS grib2 files are supported with ims_scf2ioda.py.

Usage: ims_scf2ioda.py -i input_ims_file.grib2 -o output_ioda_file.nc -m maskout

For -i, specify an input file; the converter writes it to one output file. The maskout option (-m) takes default or maskout: default keeps all missing values, and maskout does not write out missing values.

For the processed imsfv3 snow depth and snow cover fraction, imsfv3 netCDF files are supported with imsfv3_scf2ioda.py.

Usage: imsfv3_scf2ioda.py -i input_imsfv3_file.nc -o output_ioda_file.nc

For -i, specify an input file; the converter writes it to the single output IODA file named with -o.

For snow depth (snod), afwa grib1 files are supported with afwa_snod2ioda.py.

Usage: afwa_snod2ioda.py -i input_afwa_file.grb -o output_ioda_file.nc -m maskout

For -i, specify an input file; the converter writes it to one output file. The maskout option (-m) takes default or maskout: default keeps all missing values, and maskout does not write out missing values.

Note that both ims_scf2ioda.py and afwa_snod2ioda.py depend on the Python pygrib module. To enable testing of these two scripts when a pygrib module is available, add -DUSE_PYGRIB=True to the ecbuild command line.

For snow depth (snod), GHCN csv files are supported with ghcn_snod2ioda.py.

Usage: ghcn_snod2ioda.py -i input_ghcn_file.csv -o output_ioda_file.nc -f ghcn_station.txt -d YYYYMMDD -m maskout

In the test case, YYYYMMDD is set to 20200228. For -i, specify an input file; the converter writes it to one output file. The fix file option (-f) specifies a fixed station list file that includes station ID, latitude, longitude, and elevation. The maskout option (-m) takes default or maskout: default keeps all missing values, and maskout does not write out missing values.

For SMAP surface volumetric soil moisture (ssm), both 9km and NRT h5 files are supported with smap_ssm2ioda.py.

Usage: smap_ssm2ioda.py -i input_smap_file.h5 -o output_ioda_file.nc --maskMissing

For -i, specify an input file; the converter writes it to one output file. --maskMissing means missing values are not written out. Note that the data in a SMAP NRT h5 file has no date and time variables, so smap_ssm2ioda.py takes the datetime from the date and time contained in the filename. The h5 file is read with the netCDF4 module rather than the more commonly used h5py module.
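
Taking the observation datetime from a filename can be sketched in Python as below. This is only an illustration: the function name is hypothetical, the example filename is made up, and the assumption that the name carries a YYYYMMDDTHHMMSS stamp may not match every SMAP product or the parsing used in smap_ssm2ioda.py.

```python
import re
from datetime import datetime

# Assumed stamp layout: eight date digits, a literal "T", six time digits.
_STAMP = re.compile(r"(\d{8})T(\d{6})")


def datetime_from_filename(filename):
    """Extract the first YYYYMMDDTHHMMSS stamp found in a filename."""
    match = _STAMP.search(filename)
    if match is None:
        raise ValueError("no date/time stamp found in " + filename)
    return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")


if __name__ == "__main__":
    # Hypothetical filename, used only to exercise the pattern.
    print(datetime_from_filename("example_smap_nrt_20200228T123000_v1.h5"))
```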

For surface volumetric soil moisture (ssm), SMOS L2 NRT netCDF files are supported with smos_ssm2ioda.py.

Usage: smos_ssm2ioda.py -i input_smos_file.nc -o output_ioda_file.nc -m maskout

For -i, specify an input file; the converter writes it to one output file. The maskout option (-m) takes default or maskout: default keeps all missing values, and maskout does not write out missing values. With maskout, negative soil moisture values are also not written out.

For normalized surface soil moisture (ssm), ASCAT L2 NRT netCDF files are supported with ascat_ssm2ioda.py.

Usage: ascat_ssm2ioda.py -i input_ascat_file.nc -o output_ioda_file.nc -m maskout

For -i, specify an input file; the converter writes it to one output file. The maskout option (-m) takes default or maskout: default keeps all missing values, and maskout does not write out missing values.

GOES

The GOES converter classes generate two IODAv2 data files from a group of raw data files for all 16 channels of GOES-16 or GOES-17 L1b products: one for Reflectance Factor (RF, ABI channels 1-6) and one for Brightness Temperature (BT, ABI channels 7-16). Since GOES-16 and GOES-17 are in geostationary orbit, auxiliary files containing the relevant variables and attributes for latitude, longitude, and various angles are accessed (or created if they do not exist) through the latlon_file_path input argument for each satellite. The converter checks whether the nadir for each satellite has changed and creates a new latlon file if it has.

Usage   goes_converter = GoesConverter(input_file_paths, latlon_file_path, output_file_path_rf, output_file_path_bt, include_rf, resolution)
        goes_converter.convert()

Where   input_file_paths - A list of the absolute paths to all 16 ABI channels from the same hour
        latlon_file_path - The path to an existing GoesLatLon file or if it does not exist the path to write the file
        output_file_path_rf - The path to write the IODAv2 reflectance factor data file
        output_file_path_bt - The path to write the IODAv2 brightness temperature data file
        include_rf - Boolean value indicating whether to create the reflectance factor output data file: False (default)
        resolution - The resolution in km: 2 (default), 4, 8, 16, 32, 64
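
The argument list above can be assembled and checked before calling the converter, as in this minimal Python sketch. The helper name and the file paths are hypothetical, and the module that provides GoesConverter is not stated in this README, so the actual constructor call is shown only as a comment.

```python
ALLOWED_RESOLUTIONS = (2, 4, 8, 16, 32, 64)  # km, as listed above


def goes_converter_args(input_file_paths, latlon_file_path,
                        output_file_path_rf, output_file_path_bt,
                        include_rf=False, resolution=2):
    """Validate and return the positional arguments for GoesConverter."""
    input_file_paths = list(input_file_paths)
    if len(input_file_paths) != 16:
        raise ValueError("expected paths for all 16 ABI channels")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be one of %s" % (ALLOWED_RESOLUTIONS,))
    return (input_file_paths, latlon_file_path, output_file_path_rf,
            output_file_path_bt, include_rf, resolution)


if __name__ == "__main__":
    # Hypothetical paths, one per ABI channel from the same hour.
    channels = ["/data/goes16/channel_%02d.nc" % c for c in range(1, 17)]
    args = goes_converter_args(channels, "/data/goes16/latlon.nc",
                               "/out/goes16_rf.nc", "/out/goes16_bt.nc",
                               include_rf=True, resolution=8)
    # With GoesConverter imported from the module that provides it:
    # goes_converter = GoesConverter(*args)
    # goes_converter.convert()
    print(args[4], args[5])
```

Checking the channel count and resolution up front gives a clear error before any raw files are opened.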