CEMPD / VERDI

This is the repo for the VERDI project, written in Java.
GNU General Public License v3.0

No loaders are available for the selected file #340

Open lizadams opened 7 months ago

lizadams commented 7 months ago

Is your feature request related to a problem? Please describe.

VERDI is unable to recognize a file that was created by Barron Henderson's shape2cmaq utility. This utility will be available in an upcoming release of CMAQ (5.5). This issue was reported by Manish Soni at UNC IE.

Describe the solution you'd like

I would like VERDI to be able to load either of these files and create a tile plot.

I am including two files in loader_test.zip. The first was generated by the shape2cmaq utility, and VERDI can't load it:

108US1_IOAPI.nc

The second was created by running the I/O API tool m3xtract on the file above to convert it to one that VERDI can load:

108US1_IOAPI.ioapi.nc

loader_test.zip

Something in the file created by the shape2cmaq utility prevents VERDI from loading it. I am attaching the difference in the headers: sdiff_out.txt

The main difference that I see is that the file created by the shape2cmaq tool has two extra dimensions (tnv and nv). Perhaps this causes the VERDI data loaders not to recognize it? If so, we could provide this information to Barron Henderson, or we could modify VERDI to load the dataset and ignore the extra dimensions.

netcdf \108US1_IOAPI {
dimensions:
    TSTEP = UNLIMITED ; // (1 currently)
    DATE-TIME = 2 ;
    VAR = 17 ;
    LAY = 1 ;
    ROW = 50 ;
    COL = 60 ;
    tnv = 2 ;
    nv = 4 ;

The header information from the two files is contained in the header.tool.txt and header.ioapi.txt files below:

ncdump -h 108US1_IOAPI.nc > header.tool.txt
ncdump -h 108US1_IOAPI.ioapi.nc > header.ioapi.txt

header.tool.txt

header.ioapi.txt

I have also put the output of the two commands side by side in the following Excel spreadsheet. The order is not exactly the same as in the txt files created with the commands above, but I wanted to see the contents side by side as much as possible.

comparison_header_ioapi_vs_tool.xlsx
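The same side-by-side check can be approximated in Python. The sketch below uses only the standard library; the function names are hypothetical, and it assumes the two header files were produced with the ncdump -h commands above. It strips indentation and reports only the lines that differ, so the extra dimensions and any differing global attributes stand out.

```python
import difflib
from pathlib import Path


def header_differences(tool_header: str, ioapi_header: str) -> list[str]:
    """Return only the differing lines of two `ncdump -h` outputs.

    Lines prefixed with "- " appear only in the first header,
    lines prefixed with "+ " appear only in the second.
    """
    # Strip indentation and blank lines so layout differences are ignored.
    a = [line.strip() for line in tool_header.splitlines() if line.strip()]
    b = [line.strip() for line in ioapi_header.splitlines() if line.strip()]
    # ndiff also emits "  " (unchanged) and "? " (hint) lines; drop those.
    return [d for d in difflib.ndiff(a, b) if d.startswith(("- ", "+ "))]


def compare_header_files(tool_path: str, ioapi_path: str) -> None:
    """Print the differences between two header files (hypothetical helper)."""
    diffs = header_differences(
        Path(tool_path).read_text(), Path(ioapi_path).read_text()
    )
    for line in diffs:
        print(line)
```

For example, compare_header_files("header.tool.txt", "header.ioapi.txt") would print one line per difference. Because ncdump may list attributes in a different order in each file, reordered but identical lines can show up as both "- " and "+ " entries.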

Once the file was loaded, VERDI created a tile plot of the MEX variable from the 108US1_IOAPI.ioapi.nc file:

verdi.sh -f $cwd/108US1_IOAPI.ioapi.nc -s "MEX[1]" -g tile
[image: 108US1_IOAPI ioapi MEX]

The error message shown in the GUI when trying to load 108US1_IOAPI.nc is: "No loaders are available for the selected file"

systemsgo commented 5 months ago

Verdi expects to read NetCDF files containing gridded data, but NetCDF is not seeing any grids in 108US1_IOAPI.nc. While the data in the two files is the same, many of the global attributes are different, including the metadata that NetCDF uses to discover the grids.

There are several different ways NetCDF files can be organized. One way is described as conventions. Information about NetCDF conventions is here:

https://www.unidata.ucar.edu/software/netcdf/conventions.html

The Conventions attribute in 108US1_IOAPI.nc is set to CF-1.6. However, the attributes required by the CF-1.6 convention don't seem to be set. The working file - 108US1_IOAPI.ioapi.nc - isn't using the Conventions attribute at all. It follows a standard called IOAPI, and includes a variable called GDTYP, which is one of the attributes used to describe the grid contained in the file.

This link contains information about the IOAPI standard: https://www.cmascenter.org/ioapi/documentation/all_versions/html/BINIO.html

I believe the simplest way to get the shape2cmaq output recognized would be to include the grid metadata as described in the IOAPI standard. Without this information there's no way for VERDI or any other tool to correctly georeference the data within the file.

barronh commented 5 months ago

I've seen this sort of thing before. If you drop the Conventions attribute, then NetCDF-Java works with it. The file already has all the IOAPI metadata, but NetCDF-Java defers to Conventions. We stripped out the CF metadata but left in the Conventions attribute. I suspect that if you use ncatted to delete the Conventions attribute, VERDI will work.
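One way to check whether a file is in this situation is sketched below. It works on a plain dict of global attributes, so it has no dependencies. The function names are hypothetical, and the attribute list is an assumed subset of the I/O API header attributes that describe the horizontal grid, not the full set a loader may require.

```python
# Assumed subset of I/O API global attributes that describe the horizontal grid.
IOAPI_GRID_ATTRS = (
    "GDTYP", "P_ALP", "P_BET", "P_GAM",
    "XCENT", "YCENT", "XORIG", "YORIG",
    "XCELL", "YCELL", "NCOLS", "NROWS",
)


def missing_ioapi_grid_attrs(attrs: dict) -> list[str]:
    """Return the names of the assumed IOAPI grid attributes absent from attrs."""
    return [name for name in IOAPI_GRID_ATTRS if name not in attrs]


def conventions_may_mask_ioapi(attrs: dict) -> bool:
    """True when a Conventions attribute sits next to complete IOAPI grid metadata.

    This is only a heuristic for the case described in this thread; it does
    not verify that the CF metadata is actually absent or incomplete.
    """
    return "Conventions" in attrs and not missing_ioapi_grid_attrs(attrs)
```

With the netCDF4 package, the dict can be built from an open Dataset ds as {name: ds.getncattr(name) for name in ds.ncattrs()}.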

systemsgo commented 5 months ago

Strictly speaking, the shape2cmaq output is a perfectly valid NetCDF file, and NetCDF-java has no problem reading it. The thing is, NetCDF is just a container format - like a fancy version of zip or tar. Things like ncatted, ncdump, and NetCDF-Java work at the NetCDF container level, like zip or tar executables, and have no problem with it.

Programs to interact with the contents of NetCDF files - like Verdi or QGIS - use NetCDF-Java or another NetCDF api to open the NetCDF container, but they also have to understand the structure of what's inside. Conventions describe several commonly supported structures. IOAPI is another supported structure. In Verdi's case, it tried all of its modules for data stored in NetCDF files, and none were able to read it. The CF module failed because it doesn't actually contain CF data. The IOAPI module failed because of the missing GDTYP variable, which is part of what describes the gridded data within an IOAPI file. There's a pretty big size difference between the file from the shape2cmaq utility and the one from m3xtract, so I'm sure there are other differences besides just the GDTYP variable. Comparing the output of ncdump -h on both files is a quick way to find some of those differences, and the IOAPI spec that I posted above will help explain what's present in the m3xtract output but missing from shape2cmaq.

barronh commented 5 months ago

As I said, I have run into this before.

I developed the UAMIV reader for netCDF-java, which relies on the M3IO conventions. Having done so, I am familiar with the conventions precedence and assignment. In fact, I have tracked the isMine precedence with respect to IOAPI vs CF and why it fails. The CF isMine check runs first and claims the file because of the Conventions attribute. The file does not actually honor the CF convention, so grid discovery fails.
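The precedence problem can be illustrated with a toy model. The sketch below is not netCDF-java code: the handler functions, the has_cf_coordinates flag, and the attribute values are all hypothetical stand-ins. It only mirrors the behavior described above, where the first handler whose isMine check matches claims the file and there is no fallback to later handlers.

```python
from typing import Optional


def cf_is_mine(attrs: dict) -> bool:
    # Toy rule: CF claims any file whose Conventions attribute starts with "CF-".
    return str(attrs.get("Conventions", "")).startswith("CF-")


def cf_find_grids(attrs: dict) -> list[str]:
    # Toy rule: grids are found only if CF coordinate metadata is present.
    # "has_cf_coordinates" is a made-up flag standing in for that metadata.
    return ["CF grid"] if attrs.get("has_cf_coordinates") else []


def ioapi_is_mine(attrs: dict) -> bool:
    # Toy rule: IOAPI claims files that carry grid attributes such as GDTYP.
    return all(name in attrs for name in ("GDTYP", "XORIG", "YORIG", "XCELL", "YCELL"))


def ioapi_find_grids(attrs: dict) -> list[str]:
    return ["IOAPI grid"]


# Order matters: CF is checked before IOAPI, as described in the comment above.
HANDLERS = [
    ("CF", cf_is_mine, cf_find_grids),
    ("IOAPI", ioapi_is_mine, ioapi_find_grids),
]


def discover_grids(attrs: dict) -> tuple[Optional[str], list[str]]:
    """Return (handler name, grids) for the first handler that claims the file."""
    for name, is_mine, find_grids in HANDLERS:
        if is_mine(attrs):
            # First claimant wins, even if it then finds no grids.
            return name, find_grids(attrs)
    return None, []
```

In this model, a file with Conventions set to "CF-1.6" plus IOAPI grid attributes is claimed by the CF handler and yields no grids, while the same attributes without Conventions fall through to the IOAPI handler.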

Both files in the zip have GDTYP as an attribute. You can see that in the txt files, or by running ncdump or however you want to check attributes. The file size difference is because the original file is netcdf4-classic with compression, but the m3xtract is netcdf3-classic which does not support compression.

My very first suggestion (before it ever made it here) was to remove the Conventions attribute. It looks like that has not been tested. I just deleted the Conventions attribute, installed VERDI, and visualized the USA variable. A screenshot is inline below.

[image: screenshot]

To delete the Conventions attribute in python:

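import netCDF4

# Open the shape2cmaq output in read/write mode so attributes can be edited in place.
f = netCDF4.Dataset('108US1_IOAPI.nc', 'r+')
# Remove the global Conventions attribute; the IOAPI metadata is left untouched.
del f.Conventions
f.close()

systemsgo commented 5 months ago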

I took another look - I do see the GDTYP attribute - not sure what I was looking at before. Going into the Verdi code, I see that both the CFLoader and Models3Loader classes trigger calls to NetcdfDatasetFactory.createDatasets(), which calls a getGrids() method on a NetCDF-Java class called ucar.nc2.dt.grid.GridDataset. In the case of the shape2cmaq output, both calls to getGrids() return empty lists, so Verdi has nothing to display. If removing or altering the Conventions attribute in the IOAPI file allows the grids to be detected correctly, that may be an option worth exploring. I can't think of a way for Verdi to safely decide when or how to alter user-supplied input.

barronh commented 5 months ago

You don’t need to change anything. The code that made it just needs to be tweaked.