MITgcm / xmitgcm

Read MITgcm mds binary files into xarray
http://xmitgcm.readthedocs.io
MIT License
57 stars 66 forks source link

How to create regular grid nc files? #221

Open Miaoxiangying opened 4 years ago

Miaoxiangying commented 4 years ago

The xmitgcm is very useful for learning about oceans, thanks for your work. But as a beginner, there are several problems I can't figure out and I'm looking for some help.

I want to make the input files of the SWOT Simulator from ECCO-LLC4320 data, that is, create some regional or global sea levels with regular latlon grid, and done with the file formate of '*.nc'.

1. ECCO DATA PORTAL file formats I want to download the ETA files from ECCO PORTAL and process them by the xmitgcm. I found that there are '.data' files in the path of regions-daily_average and I can read it as follows. data_dir = "Z:\\MITgcm llc4320\\eta" grid_dir = 'Z:\\MITgcm llc4320\\grid' prefix = ["Eta"] index = [13680] ds = open_mdsdataset(data_dir, grid_dir, iters=index, prefix=prefix, read_grid=True, delta_t=1, ref_date=None, calendar='gregorian', geometry='llc', grid_vars_to_coords=True, swap_dims=None, endian='>', chunks=None, ignore_unknown_vars=False, default_dtype=None, nx=None, ny=None, nz=None, llc_method='bigchunks', extra_metadata=None) However, the format of the hourly data file is '.shrunk'. _Can these two formats be handled by the 'openmdsdataset' in the same way? It seems that '*.shrunk' needs the mask file from https://storage.googleapis.com/pangeo-ecco/llc/masks/llc_4320_masks.zarr/, but I can't download it and don't know how to use it in the function.

2. Irregular latlon grids The LLC grid can be transformed into the latlon grid by the 'faces_dataset_to_latlon' function, ds_latlon = lm.faces_dataset_to_latlon(ds,metric_vector_pairs=[]) but the coordinates of latitude and longitude neither uniform nor regular. The number in the meridional is 12960, which equals to 270° × 1°/48. The number in the zonal is 17280, but it is uneven in different lines and started from 115°W. I understand that this is because the special LLC grid, but it's inconvenient for me. How can I get the regional or global regular laolon grid and data? Can I do this by interpolation such as 'griddata' or 'interp2'?

timothyas commented 4 years ago

Hey there, I would recommend using the llcreader capabilities to access the LLC4320, since (I think) it was essentially designed for this purpose. This eliminates the need to "download" the LLC4320 data, but rather you can use it to get a dataset within a jupyter notebook, and work with it as you need. You could then subsample the dataset, and save the smaller subset locally, for example. Please see the documentation which I'm copying and pasting (and slightly modifying) the relevant lines here:

>>> from xmitgcm import llcreader
>>> model = llcreader.ECCOPortalLLC4320Model()
>>> ds = model.get_dataset(varnames=['Eta'],iter_start=13680,iter_stop=13681,k_chunksize=90)

Then you will have to decide on the subsampling, but if you knew that you wanted face # 2 you could, for example:

ds.sel(face=2).to_netcdf('my_file.nc')

But still note this will (probably) be quite a large file (check ahead of time with ds.sel(face=2).nbytes). Also, I have essentially no experience with this, but with a large dataset consider looking into saving with zarr with ds.sel(face=2).to_zarr('my_file.zarr').

As for getting the LLC grid onto a regular lat/lon grid - yes this would require an interpolation or regridding algorithm. This is often performed before plotting, which you can see examples of here in the xgcm documentation for ECCOv4 on LLC90 grid or the regridding performed in the ecco_v4_py plotting routines.

Note that any calculation of e.g. transports or averaged quantities will accumulate interpolation errors, and so should ideally be performed before interpolation. You can check out some examples of calculations on the LLC grid (for LLC90 with ECCOv4) at the eccov4py documentation/tutorial.

Hope that helps!

Miaoxiangying commented 4 years ago

@timothyas Thanks for your help! I have tried the llcreader, but there are some errors happened.

  1. ConnectError When I executed the command of model = llcreader.ECCOPortalLLC4320Model(), it may appear the ConnectError sometimes: ClientConnectorError: Cannot connect to host storage.googleapis.com:443 ssl:default [远程主机强迫关闭了一个现有的连接。] I don't know if this error is caused by that my IP address is in China and the connection with ECCO DATA PORTAL is unstable.

  2. TypeError There is a TypeError when I want to create a nctcdf file with to_netcdf:

    from xmitgcm import llcreader
    model = llcreader.ECCOPortalLLC4320Model()
    ds = model.get_dataset(varnames=['Eta'],iter_start=13680,iter_stop=13681,k_chunksize=90)
    ds.sel(face=4).to_netcdf('my_file.nc')

    TypeError: _request() got an unexpected keyword argument 'size_policy I checked the versions of xmitgcm and fsspec as this issue discussed: https://github.com/MITgcm/xmitgcm/issues/175, they are 0.4.1 and 0.8.2.

  3. The latest get_dataset (0.4.1) doesn't have default variables of 'read_grid=True' and 'grid_vars_to_coords=True' which had before, so I don't know how to get the latitude/longitude coordinates marked as 'XC/YC'. The ds can be print as: image

Besides, the interpolation example you recommend is helpful, thank you again!