Applying PET to ocean model data (ROMS)

dgwyther commented 2 years ago

Hello,

I'm trying to apply your toolbox to some ocean model data. I have looked through the documentation and at some of the recent issues raised (e.g. the POP3 issue), but I don't think my question has been addressed.

I have ocean model data from ROMS (e.g. sea surface height over a year of model integration). The grid is slightly rotated clockwise and has slowly varying horizontal resolution. As a result, the lon and lat coordinates are 2-D. For example, the longitude coordinate (on this Arakawa-C grid) is:

lon_rho (eta_rho, xi_rho)
long_name : longitude of RHO-points
units : degree_east
standard_name : longitude
field : lon_rho, scalar
float64

I have tried to format my data to match the example data and the data in the POP3 issue, however, I'm not sure I can match the data format with 1-dimensional coordinates. Possibly I can't apply the py-eddy-tracker to this model data?

I have dumped the nc-dump of data here:

netcdf test_roms_output_truth_8013 {
dimensions:
    ocean_time = UNLIMITED ; // (0 currently)
    time = 31 ;
    latitude = 317 ;
    longitude = 272 ;
    xi_u = 271 ;
    eta_v = 316 ;
variables:
    float zeta(time, latitude, longitude) ;
        zeta:_FillValue = 1.e+37f ;
        zeta:long_name = "free-surface" ;
        zeta:units = "meter" ;
        zeta:time = "ocean_time" ;
        zeta:grid = "grid" ;
        zeta:location = "face" ;
        zeta:field = "free-surface, scalar, series" ;
        zeta:coordinates = "lon_rho lat_rho ocean_time" ;
    double h(latitude, longitude) ;
        h:_FillValue = NaN ;
        h:long_name = "bathymetry at RHO-points" ;
        h:units = "meter" ;
        h:grid = "grid" ;
        h:location = "face" ;
        h:field = "bath, scalar" ;
        h:coordinates = "lon_rho lat_rho" ;
    double lon_bnds(latitude, xi_u) ;
        lon_bnds:_FillValue = NaN ;
        lon_bnds:long_name = "longitude of U-points" ;
        lon_bnds:units = "degree_east" ;
        lon_bnds:standard_name = "longitude" ;
        lon_bnds:field = "lon_u, scalar" ;
        lon_bnds:comments = "longitude of U-points on arakawa-C grid" ;
    double lat_bnds(eta_v, longitude) ;
        lat_bnds:_FillValue = NaN ;
        lat_bnds:long_name = "latitude of V-points" ;
        lat_bnds:units = "degree_north" ;
        lat_bnds:standard_name = "latitude" ;
        lat_bnds:field = "lat_v, scalar" ;
        lat_bnds:comments = "latitude of V-points on arakawa-C grid" ;
    double long(latitude, longitude) ;
        long:_FillValue = NaN ;
        long:long_name = "longitude of RHO-points" ;
        long:units = "degree_east" ;
        long:standard_name = "longitude" ;
        long:field = "lon_rho, scalar" ;
    double lat(latitude, longitude) ;
        lat:_FillValue = NaN ;
        lat:long_name = "latitude of RHO-points" ;
        lat:units = "degree_north" ;
        lat:standard_name = "latitude" ;
        lat:field = "lat_rho, scalar" ;
    double time(time) ;
        time:_FillValue = NaN ;
        time:long_name = "time since initialization" ;
        time:field = "time, scalar, series" ;
        time:units = "seconds since 1990-01-01" ;
        time:calendar = "proleptic_gregorian" ;

// global attributes:
        :file = "outer_his.nc" ;
        :format = "netCDF-3 64bit offset file" ;
        :Conventions = "CF-1.4, SGRID-0.3" ;
        :type = "ROMS/TOMS history file" ;

        :coordinates = "lon_bnds lat long lat_bnds" ;
}

I've also attached this testing file of ROMS data if it helps to clarify whether PET can be applied to this file.

Many thanks in advance, Dave

test_roms_output_truth_8013.nc.zip .

AntSimi commented 2 years ago

Hi, thanks for your interest, Closest issue is #5, but to do eddy detection PET(Py-Eddy-Tracker) need a speed field to select speed contour and to provide speed profile. I want to improve Py-Eddy-Tracker for unregular grid / model (arakawa grid like BCDE). So i am available and interested to help you to use PET toolbox for this type of grid.

dgwyther commented 2 years ago

Hi @AntSimi My apologies for not noticing issue #5 . I should have scrolled back further through the closed issues.

I had added u and v velocities to the test output file. I also changed the attributes (units of velocity to 'meter second^-1'). I included u_eastward_surf and v_northward_surf velocities - that is, velocities at the same points as the sea level values (the rho point of the C grid), and I kept only the surface layer for space requirements.

(I made a few changes to attributes and rename coordinates and variables to get this output file - Mainly changing some of the Arakawa-C grid and ROMS terminology to something that I guess PET recognises. I'm more than happy to share the 'pre-processing' code.)

Very keen to work together to get PET working on some ROMS output!

Cheers Dave test_roms_output_truth_8013.nc.zip .

AntSimi commented 2 years ago

Did you succeed to run eddy detection with your updated NetCDF file? I am interested to read you preprocessing code and also to see raw ROMS netcdf structure, to see how minimize or delete changes to use with PET or create a specific class to use ROMS output. I also want to know, if you agree, if i can use your ROMS file to create an example to give hints on using PET with ROMS in this gallery

dgwyther commented 2 years ago

I have just had a go at trying to load and detect with the updated NetCDF file, with the following code:

g = RegularGridDataset(
    "/Users/dave/Documents/dave/Projects/OSSE-eddies/data/proc/test_roms_output_truth_8013.nc",
    "long",
    "lat",
)

But an error was raised with the following message: Coordinates in RegularGridDataset must be 1D array, or think to use UnRegularGridDataset.

I then tried with UnRegularGridDataset which seemed to work. Contiuing on with trying to replicate the Gulf Stream example, I tried to plot the field:

margin = 30
g = UnRegularGridDataset(
    "/Users/dave/Documents/dave/Projects/OSSE-eddies/data/proc/test_roms_output_truth_8013.nc",
    "long",
    "lat",
)

ax = start_axes("ZETA (m)")
m = g.display(ax, "zeta", vmin=-1, vmax=1, cmap="RdBu_r")

great_current = Contours(g.x_c, g.y_c, g.grid("adt"), levels=(0.35,), keep_unclose=True)
great_current.display(ax, color="k")
update_axes(ax, m)

But received an error that [UnRegularGridDataset' object has no attribute 'display]().

edit: I just read issue #143 . I understand this error. My next comment has progress.

Yes, I'm happy for you to use that ROMS file - although I might just strip some of the extra metadata out of it first ;)

dgwyther commented 2 years ago

I had success this afternoon using the UnRegularGridDataset class but ignoring some of the commands which exist for RegularGridDataset (but not for the irregular version), for example, .display and .add_uv.

Eddy identification works well based on supplying the zeta, u_eastward and v_northward fields.

Next I will try and extend this simple script to look at more than 1 timestep and compute some simple statistics (planning on base it on this tutorial.

AntSimi commented 2 years ago

I was previously more focus on RegularGrid that why lot of method miss in UnRegularGrid class, I add display method for UnRegularGrid.

When you run identification you must take care at few parameters :

step (0 to infinite)=> height between 2 isolines (a low value will slow detection, a high value will miss eddy, typical value is something from 1 mm to 5 mm)
shape_error => higher value will allow shape with more sharp (Typical value is around 70%)

In area with great current like gulf stream, great current create a high gradient in height which could be a problem for detection with height, this problem could be avoid with a step of filtering.

hgrosselindemann commented 2 years ago

Hello, I am trying to apply the PET to an unregular ocean model as well, it is a NEMO configuration. I followed the steps from David and the identification worked, thanks for that. However, when I want to apply the filtering first, I get an error in the .high_filter function:

g = UnRegularGridDataset('/gxfs_work1/geomar/smomw484/data/eddy_tracking_ssh/ssh_u_v_1d_20140101_20141231_NWA.nc',
                      'nav_lon',
                      'nav_lat',
                        centered=True,indexs=dict(time_counter=0))
g.high_filter('ssh',700)

----> 1 g.high_filter('ssh',700)

File ~/miniconda3/envs/myenv/lib/python3.8/site-packages/pyEddyTracker-3.6.0+8.gaf0d6af-py3.8.egg/py_eddy_tracker/dataset/grid.py:580, in GridDataset.high_filter(self, grid_name, w_cut, **kwargs)
    574 def high_filter(self, grid_name, w_cut, **kwargs):
    575     """Return the high-pass filtered grid, by substracting to the initial grid the low-pass filtered grid (default: order=1)
    576 
    577     :param grid_name: the name of the grid
    578     :param int, w_cut: the half-power wavelength cutoff (km)
    579     """
--> 580     result = self._low_filter(grid_name, w_cut, **kwargs)
    581     self.vars[grid_name] -= result

File ~/miniconda3/envs/myenv/lib/python3.8/site-packages/pyEddyTracker-3.6.0+8.gaf0d6af-py3.8.egg/py_eddy_tracker/dataset/grid.py:1132, in UnRegularGridDataset._low_filter(self, grid_name, w_cut, factor)
   1129 m = ~z_flat.mask
   1130 x_flat, y_flat, z_flat = x_flat[m], y_flat[m], z_flat[m]
-> 1132 nb_value, _, _ = histogram2d(x_flat, y_flat, bins=bins)
   1134 sum_value, _, _ = histogram2d(x_flat, y_flat, bins=bins, weights=z_flat)
   1136 with errstate(invalid="ignore"):

File <__array_function__ internals>:180, in histogram2d(*args, **kwargs)

File ~/miniconda3/envs/myenv/lib/python3.8/site-packages/numpy/lib/twodim_base.py:819, in histogram2d(x, y, bins, range, normed, weights, density)
    817     xedges = yedges = asarray(bins)
    818     bins = [xedges, yedges]
--> 819 hist, edges = histogramdd([x, y], bins, range, normed, weights, density)
    820 return hist, edges[0], edges[1]

File <__array_function__ internals>:180, in histogramdd(*args, **kwargs)

File ~/miniconda3/envs/myenv/lib/python3.8/site-packages/numpy/lib/histograms.py:1020, in histogramdd(sample, bins, range, normed, weights, density)
   1017 except (AttributeError, ValueError):
   1018     # Sample is a sequence of 1D arrays.
   1019     sample = np.atleast_2d(sample).T
-> 1020     N, D = sample.shape
   1022 nbin = np.empty(D, int)
   1023 edges = D*[None]

ValueError: too many values to unpack (expected 2)

I managed to fix it by copying the code and changing it to: x_flat, y_flat, z_flat = np.squeeze(x_flat[m]), np.squeeze(y_flat[m]), np.squeeze(z_flat[m]) I dont know, if that is a specific error to my dataset or a general one, just wanted to mention it to you. Also, I guess just adding the squeeze shouldnt really lead to any problems.

Thank you in advance and best regards Hendrik

AntSimi commented 2 years ago

Could you share ncdump -h of your netcdf file?

hgrosselindemann commented 2 years ago

netcdf ssh_u_v_1d_20140101_20141231_NWA {
dimensions:
    time_counter = 365 ;
    x = 550 ;
    y = 700 ;
variables:
    float ssh(time_counter, x, y) ;
        ssh:_FillValue = NaNf ;
        ssh:units = "meter" ;
        ssh:coordinates = "nav_lat nav_lon" ;
    float u(time_counter, x, y) ;
        u:_FillValue = NaNf ;
        u:units = "meter second^-1" ;
        u:coordinates = "nav_lat nav_lon" ;
    float v(time_counter, x, y) ;
        v:_FillValue = NaNf ;
        v:units = "meter second^-1" ;
        v:coordinates = "nav_lat nav_lon" ;
    float nav_lon(x, y) ;
        nav_lon:_FillValue = NaNf ;
    float nav_lat(x, y) ;
        nav_lat:_FillValue = NaNf ;
    int64 time_counter(time_counter) ;
        time_counter:units = "days since 2014-01-01 12:00:00" ;
        time_counter:calendar = "proleptic_gregorian" ;
}

However, there are some attributes missing from the original model output. This is just a very quick combination of the ssh,u,v output I did with xarray and manually specified the units to test if the identification works at all before doing it properly

AntSimi commented 2 years ago

I don't understand why this dataset(variable/dimension) produce this issue, are you ok to share this file?

hgrosselindemann commented 2 years ago

ssh_u_v_1d_20140101_NWA.zip Of course,here it is. I extracted the first day of the year. Thank you so much for your fast response!

AntSimi commented 2 years ago

Ok i understand now, it come from mask, which have no shape (z_flat.mask), it's link to this issue , i will add a litte patch

hgrosselindemann commented 2 years ago

Perfect, thank you!

hanyangliu1002 commented 2 years ago

Hi @dgwyther , I'm a master student in Taiwan : ) I'm curious how you calculated u and v and added these two parameters to the nc file. I recently used the simulated SWOT data. Like you, I need to use Unregulardataset to class the data, and the simulated SWOT data does not have the two parameters u and v.

dgwyther commented 2 years ago

Hello! I'm not sure I follow exactly, but u and v velocities were output from a ROMS simulation. I wrote the u and v fields to a netcdf file, but they could be equally written to a netcdf in python (for e.g.).

On Tue, 27 Sept 2022 at 13:17, hanyangliu1002 @.***> wrote:

Hi @dgwyther https://github.com/dgwyther , I'm curious how you calculated u and v and added these two parameters to the nc file. I recently used the simulated SWOT data. Like you, I need to use Unregulardataset to class the data, and the simulated SWOT data does not have the two parameters u and v.

— Reply to this email directly, view it on GitHub https://github.com/AntSimi/py-eddy-tracker/issues/145#issuecomment-1258913530, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABIZWM4XPCGS4GTRUA74763WAJRL3ANCNFSM5SRF7W5A . You are receiving this because you were mentioned.Message ID: @.***>

vtamsitt commented 2 years ago

Hi @AntSimi,

I'm applying PET to MITgcm output (the Biogeochemical Southern Ocean State Estimate, BSOSE), and the eddy identification is working just fine with UnRegularGrid but I'm wondering if you're still planning to add more methods to the UnRegularGrid class as mentioned above? I see the addition of the .display method was not merged. It would be really great to be able to use some more of the methods on the UnRegularGrid.

Thanks for a great product!

Veronica

I was previously more focus on RegularGrid that why lot of method miss in UnRegularGrid class, I add display method for UnRegularGrid.

When you run identification you must take care at few parameters :

step (0 to infinite)=> height between 2 isolines (a low value will slow detection, a high value will miss eddy, typical value is something from 1 mm to 5 mm)

shape_error => higher value will allow shape with more sharp (Typical value is around 70%)

In area with great current like gulf stream, great current create a high gradient in height which could be a problem for detection with height, this problem could be avoid with a step of filtering.

AntSimi commented 2 years ago

I am open to continue this work if people(could be you) are availble to try it . Which method did you need?

acr604 commented 1 year ago

Hi, I'm trying to use the Eddyid function on a ROMS Model data which contains daily SLA data for 3 years in a 1/12 degree grid. But it is showing some error like this:

File "/home/user/.local/lib/python3.10/site-packages/py_eddy_tracker/dataset/grid.py", line 724, in eddy_identification levels = arange(z_min - z_min % step, z_max - z_max % step + 2 * step, step) ValueError: Maximum allowed size exceeded

Is it due to the presence of missing values in the data ?

AntSimi commented 1 year ago

yes missing value must be masked

acr604 commented 1 year ago

Yes. It resolved the issue. Thank you for the reply.

AntSimi / py-eddy-tracker

Applying PET to ocean model data (ROMS) #145