RMSE_test calculation does not sample points along groundtruth grid edges properly

Commit 054e2953210f24844654ce78920d4afd834c5174 in #151 highlighted a bug introduced in #149. pygmt.grdtrack samples points along the grid edges differently depending on whether the grid input is an xarray.DataArray or a NetCDF file. See the image below, which shows the points sampled from the 2007tx.nc grid, specifically in the 2007t1.txt area.

[Image: difference in pygmt.grdtrack sampled datapoints when run on a raw xarray.DataArray versus a NetCDF file]

Yes, we do crop the 2007tx.nc grid by one pixel on the left, bottom, right and top (to make the image shape divisible by 4), but there are still some serious discrepancies.

Number of points:

Actual total from 2007t1.txt + 2007tr.txt = 42995 points
pygmt.grdtrack sample on NetCDF file = 38112 points
pygmt.grdtrack sample on xr.DataArray = 37829 points
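
To put those counts in perspective, here is a quick bit of arithmetic on the three numbers above. It only uses the totals quoted in this issue; the variable names are made up for illustration.

```python
# Point counts quoted in this issue
total_points = 42995  # actual total from the .txt track files
sampled = {
    "NetCDF file": 38112,   # pygmt.grdtrack run on the NetCDF file
    "xr.DataArray": 37829,  # pygmt.grdtrack run on the xarray.DataArray
}

# Number of points each input type fails to sample
missing = {name: total_points - count for name, count in sampled.items()}

for name, count in missing.items():
    print(f"{name}: {count} points missing ({count / total_points:.1%} of total)")

# Gap between the two input types
gap = sampled["NetCDF file"] - sampled["xr.DataArray"]
print(f"NetCDF file samples {gap} more points than xr.DataArray")
```

So both inputs drop several thousand points, and they also disagree with each other by a few hundred.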

Strangely enough, running it on an xr.DataArray captures more points on the top and bottom (y-direction) whereas running it on a NetCDF file captures more points on the left and right (x-direction).

How to fix

Adjust data_prep.xyz_to_grid so that it does not use the tight bounds from data_prep.get_region. Maybe buffer the input bounds by 250 m * 3 pixels = 750 m (the mask we set when running pygmt.surface) before running blockmedian and surface. This should get us closer to the actual total of 42995 points regardless of whether we run it on an xarray.DataArray or a NetCDF file.
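
A minimal sketch of what the buffering could look like. Note that buffer_region is a hypothetical helper, not something that exists in data_prep, and it assumes the region comes in GMT order (xmin, xmax, ymin, ymax), which may not match what data_prep.get_region actually returns.

```python
def buffer_region(region, spacing=250.0, pixels=3):
    """
    Pad a bounding region outwards by `pixels` grid cells on every side.

    Hypothetical helper for illustration only. Assumes `region` is in
    GMT order (xmin, xmax, ymin, ymax) with the same units as `spacing`.
    """
    xmin, xmax, ymin, ymax = region
    pad = spacing * pixels  # e.g. 250 m * 3 pixels = 750 m
    return (xmin - pad, xmax + pad, ymin - pad, ymax + pad)


# Example with made-up bounds in metres
tight_region = (0.0, 1000.0, 0.0, 2000.0)
buffered_region = buffer_region(tight_region)
print(buffered_region)
```

The buffered region would then be passed to blockmedian and surface in place of the tight one, so that points sitting right on the grid edge still fall inside the gridded area.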

If not, then it might be a good idea to report this upstream to whoever wrote that wrapper for pygmt.grdtrack :sweat_smile:

weiji14 / deepbedmap
