AntSimi / py-eddy-tracker

Eddy identification and tracking
https://py-eddy-tracker.readthedocs.io/en/latest/
GNU General Public License v3.0

Problem with Julian calendar in detection/tracking #206

Open marimpacheco opened 1 year ago

marimpacheco commented 1 year ago

Hi,

I am using py_eddy_tracker on model data (GFDL CM2-O) that uses the Julian calendar, with years ranging from 181 to 190 (5-day means). Summary: when I build the date with the datetime function, the resulting times in the eddy files are wrong, but I cannot run the detection or tracking when I define the date with cft.DatetimeJulian. Question: does giving the sorted list of files to the tracking functions (Correspondances) ensure correct tracking?

Previously, for the detection, I was doing something like:

from datetime import datetime

import numpy as np
import pandas as pd
from py_eddy_tracker.dataset.grid import UnRegularGridDataset

# Month/day values of the 73 five-day time steps in one year
month_days = pd.date_range(start='1/3/2001', end='12/29/2001', freq='5D')

def detection(year, n, run=1, version_res=6):
    data = get_path(run, version_res, year, n)  # get the data path
    g = UnRegularGridDataset(data, "geolon_t", "geolat_t", centered=True, indexs=dict(time=0))

    # Detect each time step individually because of memory issues
    date = datetime(year, month_days.month[n-1], month_days.day[n-1])

    g.high_filter('SSH', 700)

    a, c = g.eddy_identification(
        "SSH", "u", "v", date,
        0.004,
        pixel_limit=None,
        shape_error=70,
    )

for i in np.arange(1, 74):
    detection(181, i, run=1)

However, when I load the eddy data, all the eddies, across files from different years, have the wrong time variable (and the same value everywhere):

[79]: ed['time'][0].values
[79]: numpy.datetime64('2018-01-19T03:14:08.000000000')

Before noticing that, I had run the tracking without problems. But I was wondering whether the wrong time in the resulting eddy netCDF files would lead to wrong tracks (I gave the sorted list of files to the tracking functions). I plotted some of the tracks and they seemed reasonable, but I'd like to double-check.

I had tried to change the time variable in the resulting eddy files, with something like this:

ed['time'].values = np.full(2893, cft.DatetimeJulian(181, month_days.month[0], month_days.day[0]))

But got this error when I tried the tracking:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/scratch/SlurmTMP/smomw541.8378967/ipykernel_3894634/3511661696.py in <module>
----> 1 c.track()
      2 c.prepare_merging()
      3 
      4 eddies_area_tracker = c.merge(raw_data=False)
      5 eddies_area_tracker.virtual[:] = eddies_area_tracker.time == 0

~/miniconda3/envs/py3_eddy/lib/python3.10/site-packages/pyEddyTracker-3.6.1+3.g7fc19df-py3.10.egg/py_eddy_tracker/tracking.py in track(self)
    372         if needed_variable is not None:
    373             kwargs["include_vars"] = needed_variable
--> 374         self.swap_dataset(self.datasets[first_dataset - 1], **kwargs)
    375         # We begin with second file, first one is in previous
    376         for file_name in self.datasets[first_dataset:]:

~/miniconda3/envs/py3_eddy/lib/python3.10/site-packages/pyEddyTracker-3.6.1+3.g7fc19df-py3.10.egg/py_eddy_tracker/tracking.py in swap_dataset(self, dataset, *args, **kwargs)
    178                 self.current_obs = self.class_method.load_file(h, *args, **kwargs)
    179         else:
--> 180             self.current_obs = self.class_method.load_file(dataset, *args, **kwargs)
    181 
    182     def merge_correspondance(self, other):

~/miniconda3/envs/py3_eddy/lib/python3.10/site-packages/pyEddyTracker-3.6.1+3.g7fc19df-py3.10.egg/py_eddy_tracker/observations/observation.py in load_file(cls, filename, **kwargs)
    778             return cls.load_from_zarr(filename, **kwargs)
    779         else:
--> 780             return cls.load_from_netcdf(filename, **kwargs)
    781 
    782     @classmethod

~/miniconda3/envs/py3_eddy/lib/python3.10/site-packages/pyEddyTracker-3.6.1+3.g7fc19df-py3.10.egg/py_eddy_tracker/observations/observation.py in load_from_netcdf(cls, filename, raw_data, remove_vars, include_vars, indexs, **class_kwargs)
    956         else:
    957             args, kwargs = (filename,), dict()
--> 958         with Dataset(*args, **kwargs) as h_nc:
    959             _check_versions(getattr(h_nc, "framework_version", None))
    960 

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.__init__()

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

OSError: [Errno -51] NetCDF: Unknown file format: b'/'

I had also tried to run the detection again, changing the 'date' line in the first example with:

date = cft.DatetimeJulian(year, month_days.month[n-1], month_days.day[n-1])

But got:

Exception: Date argument must be a datetime object

Thanks for your time!

AntSimi commented 1 year ago

You must cast the date to a Python datetime object: https://docs.python.org/fr/3/library/datetime.html#datetime-objects
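
One way to do such a cast is to copy the calendar fields of the Julian date into a standard datetime. The sketch below is only an illustration: `to_python_datetime` is a hypothetical helper, not part of py_eddy_tracker or cftime, and a `SimpleNamespace` stands in for a `cft.DatetimeJulian` object so the example runs without extra packages. It assumes the date object exposes `year`, `month` and `day` attributes (and optionally `hour`, `minute`, `second`).

```python
from datetime import datetime
from types import SimpleNamespace


def to_python_datetime(date_like):
    """Copy the calendar fields of a date-like object into a datetime.

    Hypothetical helper: it keeps the same year/month/day labels and does
    NOT convert the instant between the Julian and Gregorian calendars.
    """
    return datetime(
        date_like.year,
        date_like.month,
        date_like.day,
        getattr(date_like, "hour", 0),
        getattr(date_like, "minute", 0),
        getattr(date_like, "second", 0),
    )


# Stand-in for cft.DatetimeJulian(181, 1, 3); assumed to expose these fields.
julian_like = SimpleNamespace(year=181, month=1, day=3, hour=0, minute=0, second=0)

date = to_python_datetime(julian_like)
print(date.isoformat())  # 0181-01-03T00:00:00
```

Copying fields works for years 181 to 190 because their leap years (184 and 188) are leap years in both calendars. It would raise a ValueError for a date such as 29 February of year 200, which exists in the Julian calendar but not in the proleptic Gregorian calendar that Python's datetime uses. This sketch only covers the cast itself; whether the time values written to the eddy files are correct for years this early is a separate question it does not address.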