GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License

Refactor tests to use a more stable grid instead of earth_relief #505

Closed by weiji14 4 years ago

weiji14 commented 4 years ago

Description of the problem

A lot of our unit tests are dependent on the earth relief grids, and we keep having to change them every so often (see e.g. #350, #401, #452), and probably soon again (see https://github.com/GenericMappingTools/gmtserver-admin/issues/72)! We can keep a few tests for load_earth_relief grid, but I'd like to propose that we test the other functions on a more stable grid. Specifically, these functions:

Any good ones from https://github.com/GenericMappingTools/gmtserver-admin/tree/master/cache that would be suitable?

seisman commented 4 years ago

probably soon again (see GenericMappingTools/gmtserver-admin#72)!

That's only for the 03s and 01s earth relief data. The lower-resolution grids (15s and coarser) are derived from the SRTM15+V2.1 data (https://doi.org/10.1029/2019EA000658), though of course that dataset may also get updates in the future.

seisman commented 4 years ago

Any good ones from https://github.com/GenericMappingTools/gmtserver-admin/tree/master/cache that would be suitable?

Besides what you propose, we have a few more choices:

  1. Store the earth_relief_01d_p and earth_relief_01d_g in the tests/data directory. These two files are only ~115 Kb each.
  2. Add the two files to the GMT cache and never change them again (of course, need to change them to other names)

weiji14 commented 4 years ago

  1. Store the earth_relief_01d_p and earth_relief_01d_g in the tests/data directory. These two files are only ~115 Kb each.

This is quite tempting, but I'm not so sure about this. It'll be strange having 'out of date' information in the future.

  2. Add the two files to the GMT cache and never change them again (of course, need to change them to other names)

It would be nice if we could have a data versioning scheme, but that might be too much of an ask. I did look at the Earth Day/Night images, but wasn't sure if they get updated often?

Yet another option is to generate a random synthetic grid and use it for our tests. Maybe we could use Liam's "Ackley function" grid :smile:
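To make the synthetic-grid option concrete, here is a minimal sketch of a deterministic grid built from the standard 2-D Ackley function. It assumes NumPy is available; the helper names (`ackley`, `make_synthetic_grid`), the grid size, and the extent are illustrative choices, not anything that exists in pygmt.

```python
import numpy as np


def ackley(x, y):
    """Standard 2-D Ackley function; its global minimum is 0 at (0, 0)."""
    radial = -20.0 * np.exp(-0.2 * np.sqrt(0.5 * (x**2 + y**2)))
    ripple = -np.exp(0.5 * (np.cos(2 * np.pi * x) + np.cos(2 * np.pi * y)))
    return radial + ripple + np.e + 20.0


def make_synthetic_grid(n=51, extent=5.0):
    """Return (x, y, z) for an n-by-n grid covering [-extent, extent]^2.

    The grid is fully deterministic, so test baselines built on it
    would not drift when a remote dataset is updated.
    """
    coords = np.linspace(-extent, extent, n)
    xx, yy = np.meshgrid(coords, coords)
    return coords, coords, ackley(xx, yy)


x, y, z = make_synthetic_grid()
print(z.shape, float(z.min()), float(z.max()))
```

For use in grid-based tests, `z` could then be wrapped in an `xarray.DataArray` with `x` and `y` as coordinates.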

weiji14 commented 4 years ago

How about using @tut_bathy.nc (11KB) and/or @tut_relief.nc (397KB)? They're used in Session 3 and Session 4 of the GMT Tutorials. Here is the grdinfo output for both:

~/.gmt/cache/tut_bathy.nc: Title: ETOPO5 global topography
~/.gmt/cache/tut_bathy.nc: Command: grdreformat -fg bermuda.grd bermuda.nc=ns
~/.gmt/cache/tut_bathy.nc: Remark: /home/elepaio5/data/grids/etopo5.i2
~/.gmt/cache/tut_bathy.nc: Gridline node registration used [Geographic grid]
~/.gmt/cache/tut_bathy.nc: Grid file format: ns = GMT netCDF format (16-bit integer), CF-1.7
~/.gmt/cache/tut_bathy.nc: x_min: -66 x_max: -60 x_inc: 0.0833333333333 (5 min) name: Longitude n_columns: 73
~/.gmt/cache/tut_bathy.nc: y_min: 30 y_max: 35 y_inc: 0.0833333333333 (5 min) name: Latitude n_rows: 61
~/.gmt/cache/tut_bathy.nc: z_min: -5475 z_max: -89 name: Topography [m]
~/.gmt/cache/tut_bathy.nc: scale_factor: 1 add_offset: 0
~/.gmt/cache/tut_bathy.nc: format: classic
~/.gmt/cache/tut_relief.nc: Title: Produced by grdreformat
~/.gmt/cache/tut_relief.nc: Command: grdreformat -fg usgs_30c_dem.i2 us.nc=ns
~/.gmt/cache/tut_relief.nc: Remark: /home/aa5/wessel/dem/usgs_30c_dem.i2
~/.gmt/cache/tut_relief.nc: Pixel node registration used [Geographic grid]
~/.gmt/cache/tut_relief.nc: Grid file format: ns = GMT netCDF format (16-bit integer), CF-1.7
~/.gmt/cache/tut_relief.nc: x_min: -108 x_max: -103 x_inc: 0.00833333333333 (30 sec) name: longitude n_columns: 600
~/.gmt/cache/tut_relief.nc: y_min: 35 y_max: 40 y_inc: 0.00833333333333 (30 sec) name: latitude n_rows: 600
~/.gmt/cache/tut_relief.nc: z_min: 1052 z_max: 4328 name: Topography [m]
~/.gmt/cache/tut_relief.nc: scale_factor: 1 add_offset: 0
~/.gmt/cache/tut_relief.nc: format: netCDF-4 chunk_size: 150,150 shuffle: on deflation_level: 9

Even if they do change in the future, we could pull them from GitHub at https://github.com/GenericMappingTools/gmtserver-admin instead of the GMT Data Server for the tests.

weiji14 commented 4 years ago

Hmm, I just remembered that xarray has a cool xr.tutorial.open_dataset function that can load a bunch of example datasets from https://github.com/pydata/xarray-data, but we can also point it to our GMT one at https://github.com/GenericMappingTools/gmtserver-admin.

  1. Store the earth_relief_01d_p and earth_relief_01d_g in the tests/data directory. These two files are only ~115 Kb each.
  2. Add the two files to the GMT cache and never change them again (of course, need to change them to other names)

Perhaps we could add the low resolution earth_relief_01d_* grids to https://github.com/GenericMappingTools/gmtserver-admin, and create tags at every release (e.g. 6.1, 6.1.1) as a basic data versioning scheme. If they change in the future, we'll just use xr.tutorial.open_dataset to point to the old version. Not too sure how sustainable/future-proof this would be though.
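As a rough sketch of what pinning test data to a tag could look like, the snippet below builds a raw GitHub URL for a file under the repository's cache directory at a given branch or tag, and downloads it once into a local folder. The helper names, the `tests/data` destination, and the `6.1` tag are hypothetical; whether such tags or files exist in gmtserver-admin is an assumption.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Raw-content base for the repository discussed above.
RAW_BASE = "https://raw.githubusercontent.com/GenericMappingTools/gmtserver-admin"


def cache_file_url(name, ref="master"):
    """Build the raw URL of a cache file at a branch or tag (hypothetical helper)."""
    return f"{RAW_BASE}/{ref}/cache/{name.lstrip('@')}"


def fetch_cache_file(name, ref="master", dest_dir="tests/data"):
    """Download the file once; later calls reuse the local copy."""
    dest = Path(dest_dir) / ref / name.lstrip("@")
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(cache_file_url(name, ref), str(dest))
    return dest


# Pinning to a tag (the tag name here is illustrative only).
print(cache_file_url("@tut_bathy.nc", ref="6.1"))
```

Keeping the ref in the local path means two data versions can sit side by side, so a test can keep pointing at the old version after the grid changes upstream.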

Alternatively, we could have a function like pygmt.dataset.open_dataset() that wraps around xr.tutorial.open_dataset, but which points to https://github.com/GenericMappingTools/gmtserver-admin instead of https://github.com/pydata/xarray-data. This would be similar to GMT's which, except that it relies on GitHub instead of the GMT servers, which may or may not be a good thing.

weiji14 commented 4 years ago

OK, bad idea. Let's not wrap around xr.tutorial.open_dataset, because it requires each tutorial file to have an accompanying .md5 hash file (see https://github.com/pydata/xarray/blob/v0.15.1/xarray/tutorial.py#L85-L93). What we could learn from xr.tutorial.open_dataset is to make pygmt.which recognize alternative sources/mirrors like https://github.com/GenericMappingTools/gmtserver-admin, but that's a chore for another day.
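For a sense of what that requirement would mean in practice, here is a small sketch that writes an MD5 sidecar next to a data file and checks the file against it. The `<name>.md5` naming and the one-line hex-digest format are assumptions made for illustration, and both helpers are hypothetical.

```python
import hashlib
from pathlib import Path


def write_md5_sidecar(path):
    """Write '<file>.md5' containing the hex MD5 digest of the file."""
    path = Path(path)
    digest = hashlib.md5(path.read_bytes()).hexdigest()
    sidecar = path.with_name(path.name + ".md5")
    sidecar.write_text(digest + "\n")
    return sidecar


def verify_md5(path):
    """Return True if the file's current digest matches its sidecar."""
    path = Path(path)
    expected = path.with_name(path.name + ".md5").read_text().strip()
    return hashlib.md5(path.read_bytes()).hexdigest() == expected
```

Every file in the mirror would need such a sidecar kept in sync with its contents, which is the extra maintenance cost being weighed here.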

weiji14 commented 4 years ago

Closing as this isn't relevant anymore. We're now using check_figures_equal (see #555) to directly compare baseline and test images, i.e. changes to the earth_relief grid shouldn't break the tests anymore.