OpenDrift / trajan

Trajectory analysis package for simulated and observed trajectories
https://opendrift.github.io/trajan/
GNU General Public License v2.0
11 stars 5 forks source link

trajan breaks for first parcels tutorial example (because of non-geographic coordinates?) #47

Closed erikvansebille closed 1 year ago

erikvansebille commented 1 year ago

First of all, thanks for developing trajan, I think this could become a great resource for the Lagrangian Ocean Analysis community!

As you may have seen in https://github.com/OceanParcels/parcels/issues/1319, I'd be quite interested to ditch all the plotting routines (which are rudimentary and a hassle to maintain) for trajan.

However, as I was starting to explore trajan, I immediately ran into a problem with the very first Parcels tutorial.

Below is a minimal example, for which you'll need to download the three netcdf file at https://oceanparcels.org/examples-data/MovingEddies_data/ and place them in the same directory where the script below is located

from parcels import FieldSet, ParticleSet, JITParticle, AdvectionRK4
import xarray as xr
import trajan as ta  # noqa
from datetime import timedelta

fieldset = FieldSet.from_parcels("moving_eddies")

pset = ParticleSet.from_list(fieldset=fieldset, pclass=JITParticle, lon=[3.3e5,  3.3e5], lat=[1e5, 2.8e5])

output_file = pset.ParticleFile(name="EddyParticles.zarr", outputdt=timedelta(hours=1))
pset.execute(AdvectionRK4, runtime=timedelta(days=6), dt=timedelta(minutes=5), output_file=output_file)

ds = xr.open_dataset('EddyParticles.zarr', engine='zarr')
ds.traj.plot()

Gives an error

Traceback (most recent call last):
  File "test_trajan.py", line 23, in <module>
    ds.traj.plot()
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/trajan/plot/__init__.py", line 128, in __call__
    return self.lines(*args, **kwargs)
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/trajan/plot/__init__.py", line 145, in lines
    ax = self.set_up_map(kwargs)
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/trajan/plot/__init__.py", line 100, in set_up_map
    fig = plt.figure(figsize=(figsize, figsize * aspect_ratio))
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/matplotlib/pyplot.py", line 787, in figure
    manager = new_figure_manager(
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/matplotlib/pyplot.py", line 306, in new_figure_manager
    return _backend_mod.new_figure_manager(*args, **kwargs)
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/matplotlib/backend_bases.py", line 3493, in new_figure_manager
    fig = fig_cls(*args, **kwargs)
  File "/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/matplotlib/figure.py", line 2280, in __init__
    raise ValueError('figure size must be positive finite not '
ValueError: figure size must be positive finite not (11, <xarray.DataArray 'lat' ()>
array(-29.37675309))

Now, this could be because the particles in this particular case are not on a geographic (i.e. longitude and latitude in degrees) grid, but in a x-y grid in kilometers.

If so, it might be that support for this type of this is beyond scope for trajan? But even then, the error message might be a bit more informative? ;-)

knutfrode commented 1 year ago

Hi, and thank you for the interest in TrajAn and for the bug report. As recently commented in https://github.com/OceanParcels/parcels/issues/1319, we still have some work to do on improving the robustness and error messages. I have not been able to repeat your example yet, but a quick look might indicate that this is due to a bug allowing for a negative figure aspect ratio. So it might be that this is fixed after this commit made a few minutes ago: https://github.com/OpenDrift/trajan/commit/5d1783a4e5c23ed31f4ecab5ffac81ca5713414b

erikvansebille commented 1 year ago

Thanks for the update and trying to fix this, @knutfrode. I ran the code snippet above with the latest TrajAn on master, and now get a different error

Traceback (most recent call last):
  File "/Users/erik/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/cartopy/mpl/geoaxes.py", line 865, in set_extent
    x1, y1, x2, y2 = projected.bounds
ValueError: not enough values to unpack (expected 4, got 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test_parcels.py", line 14, in <module>
    ds.traj.plot()
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 129, in __call__
    return self.lines(*args, **kwargs)
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 146, in lines
    ax = self.set_up_map(kwargs)
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 117, in set_up_map
    ax.set_extent([lonmin, lonmax, latmin, latmax], crs=self.gcrs)
  File "/Users/erik/opt/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/cartopy/mpl/geoaxes.py", line 867, in set_extent
    raise ValueError(
ValueError: Failed to determine the required bounds in projection coordinates. Check that the values provided are within the valid range (x_limits=(-20037508.342789244, 20037508.342789244), y_limits=(-15496570.739723718, 18764656.231380563)).

This suggests that indeed TrajAn expects trajectory coordinates that are in geographical coordinates (lon/lat in degrees).

It could be a logical design decision to keep this scope; although we also use Parcels on much smaller scale simulations where coordinates are in kilometers or meters; and others use it even on different planets...

gauteh commented 1 year ago

I have added the parcels tutorial as a (failing) test to trajan so that we can figure out how to make this work. Very interesting discussion about what to do with different coordinate reference systems! Does the CF-standard support that information? We need to communicate the CRS in a standard way to trajan, and also possibly add it to the CF-standard.

https://github.com/OpenDrift/trajan/pull/49

knutfrode commented 1 year ago

Hi,

The CF-standard supports different CRS, but I am sure if the specific sub-convention on trajectory data allows this. At least I have found no "official" examples of this.

On Sat, Feb 25, 2023, 10:10 Gaute Hope @.***> wrote:

I have added the parcels tutorial as a (failing) test to trajan so that we can figure out how to make this work. Very interesting discussion about what to do with different coordinate reference systems! Does the CF-standard support that information? We need to communicate the CRS in a standard way to trajan, and also possibly add it to the CF-standard.

49 https://github.com/OpenDrift/trajan/pull/49

— Reply to this email directly, view it on GitHub https://github.com/OpenDrift/trajan/issues/47#issuecomment-1445036891, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABH25I6NDGTU3XO3WZTICPTWZHD7TANCNFSM6AAAAAAVHBG5XM . You are receiving this because you were mentioned.Message ID: @.***>

erikvansebille commented 1 year ago

I had a look at https://cfconventions.org/cf-conventions/cf-conventions.html#appendix-grid-mappings, but all examples there are still geographic in the sense that they assume a coordinate is located on a sphere (the earth?). For many small-scale/lab applications, this assumption of a geographic projection does not make sense. In Parcels we refer to this as mesh='flat'.

I saw that some of the projections on https://cfconventions.org/cf-conventions/cf-conventions.html#grid-mappings-and-projections have easting and northing in units of meters (e.g. the British National Grid in example 5.10); but using this type of CF grid for flat meshes is a bit hacky (e.g. land boundaries may show up where they are not expected by users).

So I guess the questions is: do you want to also support simple Cartesian coordinate systems in e.g. meters that are non-CF-compliant in TrajAn?

knutfrode commented 1 year ago

Trajan is not supposed to support gridded files (typically forcing files for trajectory models), but only trajectory output files, thus this is the relevant specification: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.10/cf-conventions.html#trajectory-data

All these examples are with positions as lon and lat. It might be possible to also use any coordinate system also here, but not sure how then to specify this correctly.

Anyway, I think there is noting wrong with TrajAn supporting a superset of these conventions, e.g. positions as x,y in other coordinate systems. OpenDrift does always store particle positions in lon, lat, although input/forcing files can have any projection. However, there is one example in the gallery where I used a 'coordinate system at (0, 0) and small dimension to "fake" a cartesian coordinate system for the particles: https://opendrift.github.io/gallery/example_double_gyre.html#sphx-glr-gallery-example-double-gyre-py Here it would of course be a more proper solution to use a cartesian coordinate system also for the particles.

gauteh commented 1 year ago

Supporting arbitrary coordinate systems at this point might take us into the realm of overengineering. However, it would be stupid to lock ourselves out of it. Especially since lon,lat may not be accurate (or appropriate) enough for some important scenarios.

For TrajAn to work properly it needs to be a projection that works with cartopy. I think the CF standard grid-mapping only supports a few projections. But as long as we know the projection and the cartopy CRS we should be ok.

gauteh commented 1 year ago

I am now generating the fieldset in trajan tests (https://github.com/OpenDrift/trajan/pull/49/files#diff-8730d4770f5bb466d36dcff29ed31e3f2c65046973eeaf4c1aaab54b1b4df6d2R7) and running it as a test: https://github.com/OpenDrift/trajan/pull/49/files#diff-545f0ef77cf87cbd2a86558ce49ed79f99875ae04f936fbf2801f8eb35b8700fR9.

For this file there does not seem to be any projection information included. This is probably also a case where "unprojected" would be the correct way to plot it, just don't use cartopy at all. But since lon and lat is present it would be difficult for trajan to know how to interpret it.

I think there could be a set_projection method on ds.traj which sets the attributes on the dataset to whatever is specified. xarray solves this by supplying transform= to each plot method. I don't think that will be very convenient for trajan since many other methods also need to know the projection (or lack thereof). rio-xarray does CRS detection for most common CF defined projections, and I think it can even find the correct cartopy CRS for it.

erikvansebille commented 1 year ago

Would it help if trajectories with a non-geographic projection don't use longitude and latitude as variables, but e.g. x and y? Such a feature would be relatively easy to develop for Parcels. Then, TrajAn could base its choice to use carroty or not based on the existence of either longitude or x (although we'll have to consider the edge case where both are in a file?)

gauteh commented 1 year ago

Yes, that is what I was thinking as well. A missing grid-mapping (or equivalent) + coordinates with named x y means the data is unprojected. For trajectory data it should be easier to deal with the cases where both are present since lon and lat won't have to be in 2D: lat(x, y), and they should equivalent (modulo numerical accuracy).

gauteh commented 1 year ago

I realized that I don't actually need the cartopy projection of the data, we can transform it to lat/lon when plotting it on a map. If data is cartesian/unprojected we plot it on regular axes.

gauteh commented 1 year ago

In https://github.com/OpenDrift/trajan/pull/49 parcels output is converted to x,y and plotted on regular axes, I am not sure if this setup is the best, but it is ready to be tried out.

erikvansebille commented 1 year ago

Thanks @gauteh, for this great support!

I've tried with the latest code (note that I had to change Cache to lru_cache(maxsize=None) because I'm still running with oython3.8; see this stackoverflow discussion, and got the following error

INFO: Compiled ArrayJITParticleAdvectionRK4 ==> /var/folders/r2/8593q8z93kd7t4j9kbb_f7p00000gr/T/parcels-504/libb5ec199e922de600f73556e81dcf039c_0.so
Renaming geographic ('lon', 'lat') coordinates to x, y..
<xarray.Dataset>
Dimensions:     (obs: 145, trajectory: 2)
Coordinates:
  * obs         (obs) int32 0 1 2 3 4 5 6 7 ... 137 138 139 140 141 142 143 144
  * trajectory  (trajectory) int64 0 1
Data variables:
    time        (trajectory, obs) timedelta64[ns] ...
    z           (trajectory, obs) float32 ...
    x           (trajectory, obs) float32 ...
    y           (trajectory, obs) float32 ...
Attributes:
    Conventions:            CF-1.6/CF-1.7
    feature_type:           trajectory
    ncei_template_version:  NCEI_NetCDF_Trajectory_Template_v2.0
    parcels_mesh:           flat
    parcels_version:        v2.4.1-8-g1b46d95f
Traceback (most recent call last):
  File "/Users/erik/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/cf_xarray/accessor.py", line 568, in _getattr
    attribute: Mapping | Callable = getattr(obj, attr)
  File "/Users/erik/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/xarray/core/common.py", line 246, in __getattr__
    raise AttributeError(
AttributeError: 'Dataset' object has no attribute 'grid_mapping_names'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test_parcels.py", line 18, in <module>
    ds.traj.plot()
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 137, in __call__
    return self.lines(*args, **kwargs)
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 154, in lines
    ax = self.set_up_map(kwargs)
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 60, in set_up_map
    if self.__cartesian__:
  File "/Users/erik/Codes/trajan/trajan/plot/__init__.py", line 28, in __cartesian__
    return self.ds.traj.crs is None
  File "/Users/erik/Codes/trajan/trajan/trajectory_accessor.py", line 65, in __getattr__
    return getattr(self.inner, attr)
  File "/Users/erik/Codes/trajan/trajan/traj.py", line 140, in crs
    if len(self.ds.cf.grid_mapping_names) == 0:
  File "/Users/erik/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/cf_xarray/accessor.py", line 1292, in __getattr__
    return _getattr(
  File "/Users/erik/anaconda3/envs/py3_parcels/lib/python3.8/site-packages/cf_xarray/accessor.py", line 578, in _getattr
    raise AttributeError(
AttributeError: 'grid_mapping_names' is not a valid attribute on the underlying xarray object.

Any idea what's going on?

gauteh commented 1 year ago

What version of cf_xarray do you have? As far as I understand the docs grid_mapping_names should always be present, but it might work differently on other versions. Attached a list of the deps I am running with.

tenv.txt

gauteh commented 1 year ago

@erikvansebille: the parcels conda package requires python >=3.9, <=3.10 something. But it seems to work with python 3.11 when installed via pip+git. Can it also run on 3.11 via its conda package?

erikvansebille commented 1 year ago

@erikvansebille: the parcels conda package requires python >=3.9, <=3.10 something. But it seems to work with python 3.11 when installed via pip+git. Can it also run on 3.11 via its conda package?

I just created a new condo package for Parcels this morning (for v2.4.1), but that was indeed only built for python 3.8, 3.9 and 3.10. I'm not sure how to also trigger a build for 3.11; do you know how to do that? Do I need to change something in the recipe/meta.yaml?

erikvansebille commented 1 year ago

What version of cf_xarray do you have?

Version 0.7.5; which is older than the 0.8.0 in your tenv.

I'll try updating python to something more recent than 3.8. Maybe that helps?

gauteh commented 1 year ago

Yup, seems to work for @knutfrode as well. Not sure about 3.11 on conda-forge, numpy-feedstock has 3.11 in .azure-pipelines, while parcels does not. Maybe you need to re-render or make some changes to conda-forge.yml. Not that proficient in conda-feedstocks so maybe best to just ping one of the conda-teams.

erikvansebille commented 1 year ago

Seems to have been fixed by a bot on conda-forge, in https://github.com/conda-forge/parcels-feedstock/pull/93. There should now (soon?) also be a python3.11 version of parcels at https://anaconda.org/conda-forge/parcels

erikvansebille commented 1 year ago

I updated to cf_xarray version 0.8.0, and the code in https://github.com/OpenDrift/trajan/pull/49/files now works!

gauteh commented 1 year ago

Great!

I slightly modified the set_crs method to return a new modified dataset, rather than modify the existing one. This is more in line with how xarray does thing (e.g. dataset.drop_vars).

Closing this issue, feel free to open new issues if you run into things.

Trajan probably works best on geographic coordinates.