CEMPD / VERDI

This is the repo for the VERDI project, written in java.
GNU General Public License v3.0

Failed to load GOES-16/-17 GLM datasets (in NetCDF format) in VERDI 20230425 builds #321

Open yadongxuEPA opened 1 year ago

yadongxuEPA commented 1 year ago

Describe the bug: While testing the VERDI 2.1.4 20230425 builds on Atmos, I could not add GOES-16/-17 GLM datasets (in NetCDF format) under the "Datasets" pane.

To Reproduce: Steps to reproduce the behavior:

  1. Launch the VERDI GUI.
  2. Click on "+" under the "Datasets" pane.
  3. Open a GLM dataset on Atmos, using either of these two cases:
     3-1. Browse to this directory: /work/MOD3DEV/dkj/GLM/DATA/GOES/GLM16_FGsubset/20210501 and choose "GLM16-Flash_2021050100.nc" by clicking on "Open".
     3-2. Browse to this directory: /work/MOD3DEV/dkj/GLM/DATA/GLM16_FsubsetX/20190101 and choose "GLM16-Flash_2019010100.nc" by clicking on "Open".

Expected behavior: The selected dataset names should be displayed under the "Datasets" pane.

Screenshots: For "GLM16-Flash_2021050100.nc", no error message was displayed, but the dataset was not loaded into VERDI.

test_satellite_data_4

For "GLM16-Flash_2019010100.nc", an error message popped up and no dataset was loaded into VERDI.

test_satellite_data_2

yadongxuEPA commented 1 year ago

I retested with the VERDI 2.1.4 20230517 builds on Atmos and found that VERDI still cannot add and display GOES-16/-17 GLM datasets (in NetCDF format) under the "Datasets" pane. It behaves the same as the previous VERDI 2.1.4 20230425 builds.

Test_Github_321_1

lizadams commented 1 year ago

I am seeing the same behavior using VERDI_2.1.4_mac_20230526.tar.gz. When I try to load the file, VERDI fails to load it but produces no error message or pop-up window.

I am using this GLM Lightning data

GOES-16/-17 GLM datasets: /work/MOD3DEV/dkj/GLM/DATA/GOES/ 

/work/MOD3DEV/dkj/GLM/DATA/GOES/GLM16_FGsubset/20210501

GLM16-Flash_2021050100.nc

GLM16-Flash_2021050101.nc

GLM16-Flash_2021050102.nc

Do we have another method to view this data?

I tried using ncview. The message that ncview gives is as follows:

    Note: the coordinates attribute for variable flash_area is being ignored, since it specifies a variable (flash_time_offset_of_last_event) that has 1 effective dims (an effective dim has a size greater than 1)
    I am not set up to handle cases with coordinate mapping using anything other than 0 or 2 effective dims
    Note: the coordinates attribute for variable flash_energy is being ignored, since it specifies a variable (flash_time_offset_of_last_event) that has 1 effective dims (an effective dim has a size greater than 1)
    I am not set up to handle cases with coordinate mapping using anything other than 0 or 2 effective dims
    Note: the coordinates attribute for variable flash_quality_flag is being ignored, since it specifies a variable (flash_time_offset_of_last_event) that has 1 effective dims (an effective dim has a size greater than 1)
    I am not set up to handle cases with coordinate mapping using anything other than 0 or 2 effective dims
    Note: no Ncview app-defaults file found, using internal defaults

There is a flash_lat variable and a flash_lon variable, but I don't know how to get ncview to recognize them.

I also tried Panoply, and it also produced only a 1-D plot of the flash area (I was unable to produce a tile type of plot).

Using the METCRO3D_20110728* file, I was able to get Panoply to offer to create a Georeferenced Color Contour Plot.

The variable names for longitude and latitude are not standard in the file.

ncdump GLM16-Flash_2021050103.nc | more

    float flash_lat(number_of_flashes) ;
            flash_lat:_FillValue = NaNf ;
            flash_lat:long_name = "GLM L2+ Lightning Detection: flash centroid (mean constituent event latitude weighted by their energies) latitude coordinate" ;
            flash_lat:standard_name = "latitude" ;
            flash_lat:units = "degrees_north" ;
            flash_lat:axis = "Y" ;

    float flash_lon(number_of_flashes) ;
            flash_lon:_FillValue = NaNf ;
            flash_lon:long_name = "GLM L2+ Lightning Detection: flash centroid (mean constituent event latitude weighted by their energies) longitude coordinate" ;
            flash_lon:standard_name = "longitude" ;
            flash_lon:units = "degrees_east" ;
            flash_lon:axis = "X" ;

Panoply can create a georeferenced lat-lon color contour plot for the METCRO3D_20110728_m3wndw.nc file, but not for the GLM16-Flash_2021050103.nc file.
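Because flash_lat and flash_lon are 1-D over number_of_flashes, each flash is a point rather than a grid cell, which may be why the gridded plot types are not offered. One way to inspect the data anyway is to dump the flashes to a CSV of points. The following is a minimal Python sketch, assuming the netCDF4 and numpy packages are installed. The function names, the CSV column names, and the choice of flash_energy as the value are illustrative assumptions and are not part of VERDI, ncview, or Panoply.

```python
import csv
import math


def clean_flashes(lats, lons, values):
    """Pair up per-flash values and drop records that contain NaN."""
    rows = []
    for lat, lon, val in zip(lats, lons, values):
        if any(math.isnan(x) for x in (lat, lon, val)):
            continue  # _FillValue is NaNf in the ncdump output above
        rows.append((float(lat), float(lon), float(val)))
    return rows


def read_glm_flashes(path, value_name="flash_energy"):
    """Read the 1-D flash variables from a GLM file.

    Assumption: the file has flash_lat, flash_lon and `value_name`,
    all dimensioned by number_of_flashes. Requires netCDF4 and numpy.
    """
    import numpy as np
    from netCDF4 import Dataset

    with Dataset(path, "r") as ds:
        def column(name):
            data = ds.variables[name][:].astype("f8")
            # Masked (fill) entries become NaN so clean_flashes can drop them.
            return np.ma.filled(data, np.nan).tolist()

        return clean_flashes(
            column("flash_lat"), column("flash_lon"), column(value_name)
        )


def write_points_csv(rows, out_path, value_name="flash_energy"):
    """Write one line per flash: lat, lon, value."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["lat", "lon", value_name])
        writer.writerows(rows)
```

For example, `write_points_csv(read_glm_flashes("GLM16-Flash_2021050103.nc"), "flashes.csv")` would produce a table that can be opened in any tool that plots scattered points.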

Panopoly_Georeferenced_lat_lon_color_contour_plot_menu_option

ncview plot of the GLM16-Flash_2021050103.nc file:

ncview_GLM_Lightning_Data_Plot

lizadams commented 1 year ago

I found the following reader for level 2 GLM files in the geo2grid documentation. https://www.ssec.wisc.edu/software/geo2grid/readers/glm_l2.html

lizadams commented 1 year ago

I found the following in the yaml file that is used in geo2grid.

    reader:
      name: glm_l2
      short_name: GLM Level 2
      longname: GOES-R GLM Level 2
      description: >
        NetCDF4 reader for GOES-R series GLM data. Currently only gridded L2 files
        output from `gltmtools https://github.com/deeplycloudy/glmtools` are supported.

So, now I am investigating glmtools.

https://github.com/deeplycloudy/glmtools/blob/master/docs/index.rst

The following paper describes the GLM L2 Data. https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2019JD030874

lizadams commented 1 year ago

I was able to use a Jupyter notebook included with glmtools to regrid the data to a GEOS domain, and then to visualize the resulting netCDF file in Panoply after running the post-processing Jupyter notebook provided here: https://github.com/deeplycloudy/glmtools/blob/master/examples/plot_glm_test_data.ipynb

I don't think that VERDI can read raw GLM files. Perhaps we can provide a Jupyter notebook example that does the following (the order of these steps likely needs to be modified):

  1. Aggregate the 1 min GLM data files into an hourly time series dataset by accumulating 60 minutes of 1 min files.
  2. Glob the 1 hour files together to create a file with 24 timesteps.
  3. Interpolate from the GEOS grid to the CMAQ CONUS grid in a Lambert conformal projection (or another projection type supported by I/O API and VERDI).

Once the above steps have been done, then we should be able to view the lightning data using VERDI.
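The time aggregation in the first two steps above could be sketched roughly as follows. This is only an illustration in plain Python, not glmtools or VERDI code. It assumes the flashes have already been read into (seconds since midnight UTC, lat, lon) tuples, and it bins them on a simple regular lat-lon grid as a stand-in for the final model grid. The function name and all parameters are hypothetical.

```python
from collections import defaultdict


def hourly_flash_counts(flashes, lat0, lon0, dlat, dlon, nrows, ncols):
    """Count flashes per hour and grid cell for one day.

    flashes: iterable of (seconds_since_midnight_utc, lat, lon).
    lat0, lon0: south-west corner of the grid in degrees.
    dlat, dlon: cell size in degrees.
    Returns {hour: {(row, col): count}} with hour in 0..23.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seconds, lat, lon in flashes:
        hour = int(seconds // 3600)
        row = int((lat - lat0) // dlat)
        col = int((lon - lon0) // dlon)
        # Skip flashes outside the day or outside the grid.
        if 0 <= hour < 24 and 0 <= row < nrows and 0 <= col < ncols:
            counts[hour][(row, col)] += 1
    return counts
```

Each hour's dictionary would then become one of the 24 timesteps in the output file. Writing that file in a format VERDI accepts is not covered here.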

Another example script discusses the need to estimate the height of the cloud in which lightning is observed: parallax-corrected-latlon.ipynb

I have created a tar.gz file containing the output files created by the following script: https://github.com/deeplycloudy/glmtools/blob/master/examples/plot_glm_test_data.ipynb

GLM-L2-regrid-using-glmtools.tar.gz

I can visualize these files in Panoply, but not in VERDI. VERDI gives the following error.

VERDI_error_projection_not_recognized

I think that if we modify the Python notebook examples to show how to regrid to a CONUS grid using a Lambert conformal projection, then VERDI may be able to visualize the output.
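As a rough illustration of what mapping a flash onto a Lambert conformal grid involves, here is a minimal Python sketch of the spherical Lambert conformal conic forward projection followed by a grid-cell lookup. It is not taken from glmtools or VERDI. The Earth radius and the grid parameters in the usage note are assumptions, and a real workflow would more likely rely on a projection library.

```python
import math

# Assumption: spherical Earth radius in metres, a value often used with WRF/CMAQ.
EARTH_RADIUS_M = 6370000.0


def lcc_forward(lat, lon, lat1, lat2, lat0, lon0, radius=EARTH_RADIUS_M):
    """Spherical Lambert conformal conic projection.

    lat, lon: point to project (degrees).
    lat1, lat2: standard parallels; lat0, lon0: projection origin (degrees).
    Returns (x, y) in metres relative to the origin.
    """
    phi, phi1, phi2, phi0 = (math.radians(v) for v in (lat, lat1, lat2, lat0))
    dlam = math.radians(lon - lon0)

    def t(p):
        return math.tan(math.pi / 4.0 + p / 2.0)

    if math.isclose(phi1, phi2):
        n = math.sin(phi1)  # single standard parallel
    else:
        n = math.log(math.cos(phi1) / math.cos(phi2)) / math.log(t(phi2) / t(phi1))
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = radius * f / t(phi) ** n
    rho0 = radius * f / t(phi0) ** n
    return rho * math.sin(n * dlam), rho0 - rho * math.cos(n * dlam)


def grid_cell(x, y, xorig, yorig, cell, ncols, nrows):
    """Return (row, col) of the cell containing (x, y), or None if outside."""
    col = math.floor((x - xorig) / cell)
    row = math.floor((y - yorig) / cell)
    if 0 <= col < ncols and 0 <= row < nrows:
        return row, col
    return None
```

For instance, with parameters like those often quoted for a 12 km CONUS domain (standard parallels 33 and 45, origin at 40 N, 97 W, lower-left corner at x = -2556000 m, y = -1728000 m, 459 columns by 299 rows; treat all of these as assumptions to verify against the actual GRIDDESC), `grid_cell(*lcc_forward(40.0, -97.0, 33.0, 45.0, 40.0, -97.0), -2556000.0, -1728000.0, 12000.0, 459, 299)` locates the cell that contains the projection origin.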

Panoply Create Plot options:

Panopoly_Create_Plot_Options

Panoply X-Y Plot

Panopoly_x_y_plot

Panoply lat-lon Plot

Panopoly_lat_lon_plot

lizadams commented 1 year ago

Additional resources

Quick Guide to GLM Lightning Mapper https://www.star.nesdis.noaa.gov/goes/documents/GLM_Quick_Guides_May_2019.pdf

SIFT Short Course https://training.eumetsat.int/pluginfile.php/51243/course/section/4850/SIFT_short_course_20230531.pdf

Panoply - Java Based Visualization Tool

https://www.giss.nasa.gov/tools/panoply/

yadongxuEPA commented 1 year ago

I tested with the new build VERDI_2.1.4_linux64_20230803.tar.gz on Atmos.

  1. I added this data file to the VERDI GUI: /work/MOD3DEV/dkj/GLM/DATA/GOES/20210501/OR_GLM-L2-LCFA_G16_s20211210000000_e20211210000204_c20211210000216.nc
  2. A pop-up window (the CSV dialog) showed up, and I selected the corresponding variables from the drop-down menu list.
     image
  3. The satellite data file was then added to the "Datasets" panel and "flash_energy" showed up under "Variables". However, the current workflow treats the raw satellite data as an observational dataset, so it does not support "Tile Plot" directly. After clicking the "Tile Plot" tab, an error message showed up as below:
     image
  4. So we need to load another dataset to be used as "model" data. At this point, we don't have a modeled dataset that meets the following requirements:
     a) the modeled time range covers the time range when the raw satellite data was collected;
     b) the modeled spatial coverage contains the locations of the satellite data points;
     c) it contains a variable whose scale is comparable with the lightning or flash-related variables.

After discussing with Daiwen, we decided that it would be very difficult for users to visualize the raw satellite data files with the current workflow. We need to design another workflow (ideally one that does not require any modeled data).