csiro-coasts / emsarray

xarray extension that supports EMS model formats
BSD 3-Clause "New" or "Revised" License

ACCESS file is unrecognised #84

Closed frizwi closed 1 year ago

frizwi commented 1 year ago

access_hdr.txt

Calling emsarray.open_dataset() on an ACCESS file results in

`/tools/Anaconda3/2022.10/lib/python3.9/site-packages/emsarray/accessors.py in ems_accessor(dataset)
     18     format_class = get_file_format(dataset)
     19     if format_class is None:
---> 20         raise RuntimeError("Could not determine format of dataset")
     21     return format_class(dataset)

RuntimeError: Could not determine format of dataset`

Looking at the code, it seems like it should be CFGrid1D. What's tripping it up? The netCDF header is attached.

frizwi commented 1 year ago

If some attributes are missing, is there a workaround whereby the file is opened with xarray.open_dataset() and modified in memory before calling emsarray's make_clip_mask()?

frizwi commented 1 year ago

The netCDF file can be downloaded from here: https://filesender.aarnet.edu.au/?s=download&token=677248cf-0a0d-4335-b603-af2b9df2f159

mx-moth commented 1 year ago

I'll have a look at the ACCESS file details on Monday. As a workaround, you can use Convention.bind() to set which convention is used for a dataset.

mx-moth commented 1 year ago

According to the CF Conventions, the correct way to mark latitude / longitude coordinates is with a units: degrees_north or units: degrees_east attribute, as appropriate. The conventions also list some acceptable but discouraged aliases, and some alternative attributes.

The latitude / longitude variables are detected in the CFGridTopology class. This detection routine does not follow the CF Conventions correctly. It looks for some of the acceptable alternative attributes, and checks for units: degree_north. It does not check for the preferred units: degrees_north (note the plural "degrees"). I will update the CFGridTopology class to match the CF Conventions correctly.
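To illustrate the kind of check involved, here is a minimal sketch of CF-style coordinate detection based on variable attributes. It is not the emsarray implementation: the function name classify_coordinate and the constants are hypothetical, and it only covers the units and standard_name attributes. The unit strings are the ones the CF Conventions list as acceptable, with degrees_north / degrees_east being the recommended forms.

```python
from typing import Mapping, Optional

# Unit strings the CF Conventions accept for latitude coordinates.
# "degrees_north" is the recommended spelling; the rest are accepted aliases.
LATITUDE_UNITS = {
    "degrees_north", "degree_north", "degree_N",
    "degrees_N", "degreeN", "degreesN",
}

# The equivalent set for longitude coordinates.
LONGITUDE_UNITS = {
    "degrees_east", "degree_east", "degree_E",
    "degrees_E", "degreeE", "degreesE",
}


def classify_coordinate(attrs: Mapping[str, str]) -> Optional[str]:
    """Return 'latitude', 'longitude', or None for a variable's attributes.

    Hypothetical helper for illustration only; it checks the `units`
    and `standard_name` attributes and nothing else.
    """
    units = attrs.get("units")
    standard_name = attrs.get("standard_name")
    if units in LATITUDE_UNITS or standard_name == "latitude":
        return "latitude"
    if units in LONGITUDE_UNITS or standard_name == "longitude":
        return "longitude"
    return None
```

A check that only accepts the singular degree_north would miss a variable tagged with the recommended plural form, which is the failure described above. With xarray, the attributes of a variable are available as dataset[name].attrs, so a helper like this could be called once per variable.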

In the meantime, a workable alternative is to manually bind a convention to the dataset. As the automatic detection of the latitude / longitude variables fails, you will need to specify the latitude / longitude coordinate names yourself:

import xarray
from emsarray.conventions.grid import CFGrid1D

# Open the dataset and manually make a Convention instance
dataset = xarray.open_dataset(...)
convention = CFGrid1D(dataset, latitude='lat', longitude='lon')

# Bind the Convention instance to dataset so that dataset.ems works
convention.bind()

# Use the dataset as normal
dataset.ems.plot()