hainegroup / oceanspy

A Python package to facilitate ocean model data analysis and visualization.
https://oceanspy.readthedocs.io
MIT License
100 stars 32 forks source link

pin xgcm < 0.7 #243

Closed Mikejmnez closed 1 year ago

Mikejmnez commented 2 years ago

Description

There have been major updates to xgcm , regarding generalized ufuncs, resulting in some changes to the way grid interpolates and calculates derivatives. These shouldn't affect oceanspy at the fundamental level, but have been causing some trouble.

For a better description on the changes to xgcm see: https://github.com/xgcm/xgcm/issues/344

These new changes involve refactoring, and are somewhat related to #224.

Changes to xgcm have been implemented in versions >= 0.7. You can see the description of new features here: https://xgcm.readthedocs.io/en/latest/whats-new.html

As a result of the changes, the following code does not work after performing the cutout (the cutout works well, as you can make plots which trigger the calculation):

grid.interp(cut_od._ds['Depth'], axis='X', boundary='extrapolate')

which yields the error

NotImplementedError: Cannot chunk along a core dimension for a grid ufunc which has a signature which includes one of the axis positions ['inner', 'outer']

Also, beginning with version 0.7 the grid.interp argument

boundary='extrapolate'

no longer appears as an option on the description/documentation and there was some discussion to remove extrapolate as an option completely, although for now (as of version 0.8), it is still possible to continue using it.

Quick fix

Pin xgcm to version 0.6 while I work on figuring out the best fix to this. There is strong possibility that the error only affects the grid post-cutout, and so the fix will be local to the code where I update the grid object with the new (subsampled) dataset. I just need to some to figure it out.

Mikejmnez commented 2 years ago

Will keep this open, as pining xgcm is only a quick fix.

Mikejmnez commented 2 years ago

The new version of xarray also contains some internal refactoring changes that changes some aspects of indexing. This new version of xarray was released last week, and so it didn't cause any issue on the test image that me and Tom ran in Sciserver.

More on xarray changes released last week are here

https://docs.xarray.dev/en/stable/whats-new.html#v2022-06-0-july-21-2022

Using the new release of xarray (version 2022.6) did cause an error in Sciserver in

dmaskT = maskT.where(maskT, drop=True)

before the cutout. I need some time to fully understand and fix these incompatibility issues (should be small, but subtle). This all is related to Ryan's refactoring proposals in #224. Thus, a fix to this current issue should also fix #224.

ThomasHaine commented 1 year ago

Explore this issue when #224 is addressed....

MaceKuailv commented 1 year ago

The survey station is also broken by changes in xgcm, with a similar message.

Mikejmnez commented 1 year ago

Makes sense, as there is some interpolation happening there.

MaceKuailv commented 1 year ago

I don't think we should change anything here, although it apparently breaks a lot of things. Oceanspy is only trying to do the most basic things xgcm promises, interpolate velocity to the center point. If the velocity dataset is chunked, then you would have this issue.

This is a known bug they have mentioned many many times.

https://github.com/xgcm/xgcm/issues/522 https://github.com/xgcm/xgcm/issues/533 https://github.com/xgcm/xgcm/issues/518

I have adapted their tutorial just a little bit to reproduce this bug.

from xgcm import Grid
ds = xr.Dataset(
        coords={
            "x_c": (
                ["x_c"],
                np.arange(1, 10),
            ),
            "x_g": (
                ["x_g"],
                np.arange(1.5, 9),
            ),
        }
    )
grid = Grid(ds, coords={"X": {"center": "x_c", "inner": "x_g"}})
da = np.sin(ds.x_c * 2 * np.pi / 9)
da = da.chunk((9)) # if it is not chunked, then no error

da_interp = grid.diff(da, axis="X") # you get the same error here
Mikejmnez commented 1 year ago

Thanks for looking into this. It is re assuring that we don't need to make changes regarding xgcm. It wasn't clear to me whether that was true or not back in July when the release happened just before updating the image.

So it seems that keep pinning xgcm is the right way to do this, and wait until it gets fixed on their end. Have you had time to check on the issue regarding xarray? An error during a cutout cause some trouble, after the following line:

dmaskT = maskT.where(maskT, drop=True)

I don't remember the actual example where this happened, but I am guessing in one of the example notebooks (not LLC specific).

MaceKuailv commented 1 year ago

I am having a bit of a hard time finding this issue. I am using newer version of xarray to cutout LLC datasets now. It seems fine to me

MaceKuailv commented 1 year ago

I tried a similar thing here: image It seems fine to me...

ThomasHaine commented 1 year ago

So, can we unpin xarray? Or @Mikejmnez can you reproduce the error?

Mikejmnez commented 1 year ago

I'll look into this later on Monday.

Sent from my T-Mobile 4G LTE Device Get Outlook for Androidhttps://aka.ms/AAb9ysg


From: ThomasHaine @.> Sent: Saturday, October 22, 2022 5:12:35 PM To: hainegroup/oceanspy @.> Cc: Miguel Jimenez-Urias @.>; Mention @.> Subject: Re: [hainegroup/oceanspy] pin xgcm < 0.7 and xarray < 2022.6 (Issue #243)

  External Email - Use Caution

So, can we unpin xarray? Or @Mikejmnezhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FMikejmnez&data=05%7C01%7Cmjimen17%40jhu.edu%7Ceb4d25d40b654de80de208dab47224f8%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C638020699599303291%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=r38tdrAkjBUS2pIIGw5GYS3pjsoQbOz7B5wFNNL4zWk%3D&reserved=0 can you reproduce the error?

— Reply to this email directly, view it on GitHubhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhainegroup%2Foceanspy%2Fissues%2F243%23issuecomment-1287921984&data=05%7C01%7Cmjimen17%40jhu.edu%7Ceb4d25d40b654de80de208dab47224f8%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C638020699599459540%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=a%2BaqgvzEdYeR3TuhjU%2Foy%2BhdBcXSJvNj4kMv8SYyyDU%3D&reserved=0, or unsubscribehttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAB64CSNXCRSBKYJCFO3HXDTWERKEHANCNFSM54LWQCOQ&data=05%7C01%7Cmjimen17%40jhu.edu%7Ceb4d25d40b654de80de208dab47224f8%7C9fa4f438b1e6473b803f86f8aedf0dec%7C0%7C0%7C638020699599459540%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=SozhxL59MJcHIoGRbbvIiqMGeEh9Ohrecq0%2BC6SGW4s%3D&reserved=0. You are receiving this because you were mentioned.Message ID: @.***>

Mikejmnez commented 1 year ago

PR #270 unpins xarray and all tests pass

Mikejmnez commented 1 year ago

It looks like they have mostly solve this issue in the latest xgcm version. I ran the example above by @MaceKuailv

from xgcm import Grid
ds = xr.Dataset(
        coords={
            "x_c": (
                ["x_c"],
                np.arange(1, 10),
            ),
            "x_g": (
                ["x_g"],
                np.arange(1.5, 9),
            ),
        }
    )
grid = Grid(ds, coords={"X": {"center": "x_c", "inner": "x_g"}})
da = np.sin(ds.x_c * 2 * np.pi / 9)
da = da.chunk((9)) # if it is not chunked, then no error

da_interp = grid.diff(da, axis="X") 

and runs fine. This issue https://github.com/xgcm/xgcm/issues/522 explains their approach, which is basically that single chunking along a core dimension works (it had stopped working). Multiple or arbitrary chunks along a core dimension yields the same error. They have updated the error to suggest rechunking to a single chunk along the dimension if possible.

I will again update the binder shortly.