AntSimi / py-eddy-tracker

Eddy identification and tracking
https://py-eddy-tracker.readthedocs.io/en/latest/
GNU General Public License v3.0
126 stars 53 forks source link

Using EddyTracking to compare observations and model outputs #7

Closed annesophiefortin closed 4 years ago

annesophiefortin commented 4 years ago

I would like to compare eddies from observations and models by using the EddyTracking since it gives useful informations like the cost_values for the paired eddies.

However I noticed that, in the tracking.yaml file, TRACK_DURATION_MIN must be at least 3 in order to have Anticyclonic.nc (Cyclonic.nc) and the Anticyclonic_track_too_short.nc (Cyclonic_track_too_short.nc) files. If the it is less than 3, I would obtain only the correspondances and the untracked files. But I would like to compare only 2 files at a times.

Do you think it could be feasible to compare only two files and get the cost_values of the paired eddies or should I develop my own tracking for that?

image

I don't know if it may be related, but it seems you had a similar issue: https://github.com/AntSimi/py-eddy-tracker/issues/3 Also, as mentionned in that issue, in the wiki (https://github.com/AntSimi/py-eddy-tracker/wiki#how-do-i-create-my-own-tracking-recipe), from py_eddy_tracker.observations import EddiesObservations as Model should be from py_eddy_tracker.observations.observation import EddiesObservations as Model

AntSimi commented 4 years ago

Good try, but tracking was not think for that... You are right cost_association from tracking function will give an information to find paired observations.

In the example below i used a cost function based on contour overlap: 0 < (Polygons intersection).area / (Polygons union).area < 1

With a ratio > 30 %

overlap

Very little eddies in big ones will be not selected with this cost function

import pylook
from matplotlib import pyplot as plt
from py_eddy_tracker.observations.observation import EddiesObservations
from numpy import linspace, ones, where

def reverse_index(index, nb):
    m = ones(nb, dtype="bool")
    m[index] = False
    return where(m)[0]

a_filt = EddiesObservations.load_file("Anticyclonic_20200424.nc")
a_raw = EddiesObservations.load_file("Anticyclonic_20200424_raw.nc")

# Get indexs of close observation (based on overlap contour)
i, j, cost = a_filt.match(a_raw)
m = cost > 0.3
i, j = i[m], j[m]
# Get index not used
i_, j_ = reverse_index(i, len(a_filt)), reverse_index(j, len(a_raw))

# Subset
a_filt_junk = a_filt.index(i_)
a_raw_junk = a_raw.index(j_)
a_filt = a_filt.index(i)
a_raw = a_raw.index(j)

# Plot
ax = plt.subplot(111, projection="plat_carre")
ax.grid()
ax.set_xlim(0, 360)
ax.set_ylim(-80, 80)
kwargs_display = dict(lw=1, extern_only=True)
a_filt.display(ax, label="Anticyclonic unfiltered", color="r", **kwargs_display)
a_raw.display(ax, label="Anticyclonic", color="blue", **kwargs_display)

kwargs_display["lw"] = 0.25
a_filt_junk.display(
    ax, label="Anticyclonic unfiltered not associate", color="r", **kwargs_display
)
a_raw_junk.display(ax, label="Anticyclonic not associate", color="b", **kwargs_display)

ax.legend(loc="upper right")
plt.show()

If you want save observations in a file use "to_netcdf" method

annesophiefortin commented 4 years ago

Thank you for your useful proposition :)

However, since the eddies may be a bit displaced from obs to model, a criteria only on area is too restrictive for us (see figures).

I worked a bit from your solution and the tracking function in EddiesObservations to develop another solution that I will copy/paste here.

import cartopy.crs as ccrs
from matplotlib import pyplot as plt
from py_eddy_tracker.observations.observation import EddiesObservations
from py_eddy_tracker import generic
from numpy import linspace, ones, where, arange

root = 'data\\eddies-obs-model-tocompare\\'
AVISO_AC_file = root+'Anticyclonic_obs_20200424.nc'
GIOPS_AC_file = root+'Anticyclonic_20200424.nc'

def reverse_index(index, nb):
    m = ones(nb, dtype="bool")
    m[index] = False
    return where(m)[0]
AVISO_AC = EddiesObservations.load_file(AVISO_AC_file)
GIOPS_AC = EddiesObservations.load_file(GIOPS_AC_file)

i_aviso, i_giops, cost_mat = AVISO_AC.tracking(GIOPS_AC)

# Get index not used
i_aviso_junk, i_giops_junk = reverse_index(i_aviso, len(AVISO_AC)), reverse_index(i_giops, len(GIOPS_AC))

# Subset
AVISO_AC_junk = AVISO_AC.index(i_aviso_junk)
GIOPS_AC_junk = GIOPS_AC.index(i_giops_junk)
AVISO_AC = AVISO_AC.index(i_aviso)
GIOPS_AC = GIOPS_AC.index(i_giops)

# Plot
plt.figure(figsize=(15,10))
data_crs=ccrs.PlateCarree()
ax = plt.axes(projection=data_crs)
ax.coastlines()
ax.set_extent([-80,-45,30,50])
ax.set_aspect('auto')
ax.grid()
kwargs_display = dict(lw=1.25, extern_only=True)
AVISO_AC.display(ax, label="Anticyclonic AVISO", color="black",transform=data_crs, **kwargs_display)
GIOPS_AC.display(ax, label="Anticyclonic GIOPS", color="red",transform=data_crs, **kwargs_display)
kwargs_display["lw"] = 1
AVISO_AC_junk.display(ax, label="Anticyclonic AVISO not associated", color="navy",transform=data_crs, **kwargs_display)
GIOPS_AC_junk.display(ax, label="Anticyclonic GIOPS not associated", color="darkorange",transform=data_crs, **kwargs_display)
ax.legend(loc="upper left")
ax.set(title='Association based on EddiesObservations\' Methods')
plt.show()

image

image

AntSimi commented 4 years ago

Nice solution. I understand your problem, and i understand why overlap are not enough to you. But i am not confortable with all association provide by default tracking.

AntSimi commented 4 years ago

Maybe you could use extern contour to increase match, you could also reduce accepted score. Look at match function

annesophiefortin commented 4 years ago

Indeed, some associations from the default tracking seems wrong in the above figure. (But the association also seems wrong if I reduce the ratio of overlap.)

As for the match function, it would indeed increase the associations, and may be a good solution if we look at large radius eddies. Small eddies won't be properly associated as they may be displaced in GIOPS (their contour may not intersect).

So I think I will work a bit more on the default tracking. Here is what I understand from it : The default tracking has a threshold for the distance between eddies (max 125 km apart), it doesn't have a threshold for the other parameters. It just match the two eddies with the lowest cost value of association that are in the 125 km range (so we can have faulty associations).

I was thinking to put a threshold on the cost values to reduce the faulty association.

image

annesophiefortin commented 4 years ago

When the cost < 2 (~mean+1std)

image

AntSimi commented 4 years ago

Another things. I talked with a colleague today, and she suggested to also search match between cyclonic from on set to anticyclonic of the other set. If there are to much match, maybe dataset will be hard to compare!

Le mer. 27 mai 2020 à 17:19, annesophiefortin notifications@github.com a écrit :

When the cost < 2

[image: image] https://user-images.githubusercontent.com/64930132/83039147-c8abad00-a00b-11ea-83f4-515bb989ee5f.png

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/AntSimi/py-eddy-tracker/issues/7#issuecomment-634734004, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIS7AZPYXVY5O63FM6K4UYTRTUVQBANCNFSM4NG5MSYQ .

annesophiefortin commented 4 years ago

Let me know if I am wrong: I don't think it is too bad. It about half of the number of eddies identified previously.

When the cost < 2 (~mean+1std): image

image

AntSimi commented 4 years ago

Could you remember which field did you for aviso (h, u, v) and for giops, I just look quickly your figure, I am not sure it s a good things that 1/4 of observations are associate with opposite sign did you found same quantity when you do cyclonic aviso vs anticyclonic giops ?

annesophiefortin commented 4 years ago

I am not sure I understand your comment. Are you asking about the sla/adt, ugos, vgos? GIOPS doesn't output ugos, vgos variables so we compute it from the add_uv.

As for the figure below, the number of eddies are about the same as above. The eddies that are match are about the same size and about at the same location such that the cost is low.

image

AntSimi commented 4 years ago

Did you use same field in the two case, sla or adt(preferred) if it's not the case it could increase mismatch between cyclonic and anticyclonic

AntSimi commented 4 years ago

I have no advice to set threshold value on this cost function, because i used rarely this one. I am a little bit supprise on your first figure to see few AVISO anticyclonic at exactly same place than GIOPS Cyclonic (i see 3 cases in the figure). I understand better when it's just a minor overlap. Maybe for this exercice (A vs C), it must be better to used contour overlap match, in order to reject very small overlap.

Let me know if I am wrong: I don't think it is too bad. It about half of the number of eddies identified previously.

When the cost < 2 (~mean+1std): image

annesophiefortin commented 4 years ago

It is both sla (will change to adt for my analysis). I use the ugos, vgos of the AVISO file and the add_uv for GIOPS. As GIOPS is interpolated from its native grid to AVISO grid, I think some eddies may be deformed.

Did you use same field in the two case, sla or adt(preferred) if it's not the case it could increase mismatch between cyclonic and anticyclonic

CoriPegliasco commented 4 years ago

Hi :)

AntSimi showed me your issue. I wrote a paper comparing eddies detected in the SLA and the ADT in the Mediterranean Sea, with a different detection and tracking algorithm. You can check it (https://doi.org/10.1016/j.asr.2020.03.039)

So, my vision is oriented. I compare eddies from dataset A and B with overlap consideration, looking first at the intersection (existing or not) between eddies in A and B, and after at the ratio between the overlapping area and the area of the eddy in A. Then I do the same but from dataset B to A, to check the ratio between the overlapping area and the area in B. I look at anticyclones an cyclones simultaneously.

Thus no intersection = absence in a dataset, an intersection with same rotation with small overlapping area = shift, same rotation and large overlapping area = good agreement, intersection with a different rotation = something is wrong With the last closed contour, I had to deal with multiple intersections (all intersections with different rotations = wrong, at least one intersection with same kind = good)

I'm not convinced by the matching eddies which are not overlapping as with the default tracking, for me without overlap it's a different structure, geographical proximity is not enough. If you want put an overlapping threshold, I suggest you to compute the ratio between the area of intersection and the min area between the two eddies, the little eddies included in bigger ones will have high values.

But it depends on what is important for your evaluation : the distance, the overlap, the common area, the similarity in characteristics even the eddies are (more or less) shifted...

Did you filter the eddies in amplitude, radius or lifetime? It seems that you have few small eddies.

annesophiefortin commented 4 years ago

Hi Dr. Pegliasco, I would like to thank you for your reply. It makes sense that you chose overlapping eddy to answer your question as you are comparing two fields from observations. However, since we are comparing observation to model outputs, we think we should permit the pairing of eddies that do not overlap. Thus, we will give the basic tracking a chance for now. To answers your question, we did not filtered the eddies in amplitude, radius or lifetime (we will probably do that later). For now we are comparing each day of model and obs. Also, we kept the pixel limit to the default and the resolution of obs and model is 0.25° lat lon.

annesophiefortin commented 4 years ago

Hi Dr. Delepoulle, I just wanted to let you know that the match method identifies twice an eddy if it overlap (match) with 2 eddies.

AntSimi commented 4 years ago

Hi Dr. Delepoulle, I just wanted to let you know that the match method identifies twice an eddy if it overlap (match) with 2 eddies.

You are right, you could get twice match or more with this method, but it's wanted. Because this case could be highlight interaction process. If you want unique match, you need to apply post processings.