rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0
543 stars 90 forks source link

`project()` irregularity with no internet connection #1425

Open kevinwolz opened 9 months ago

kevinwolz commented 9 months ago

Attached here are two files, a raster with CRS “Albers_Conical_Equal_Area” and a polygon that overlaps with the raster but has CRS “NAD83 / Conus Albers”. The goal is to crop the raster to the polygon.

Archive.zip

To replicate this issue, R must not have internet access.

First, read in the data.

r <- terra::rast("test_raster.tif")
a <- terra::vect("test_aoi.shp")

Project the polygon to the raster to prepare for the crop. This produces a warning message. I’m not sure why, but it does.

a_trans <- terra::project(a, terra::crs(r))
# Warning messages:
# 1: PROJ: Cannot open https://cdn.proj.org/us_noaa_mihpgn.tif: Could not resolve host: cdn.proj.org (GDAL error 1) 
# 2: Network error when accessing a remote resource (GDAL error 1) 

Let’s try the crop

bad <- terra::crop(r, a_trans, snap = "out")
# Error: [crop] extents do not overlap

This produces an error! Why? Let’s look at r and a_trans:

r

Screen Shot 2024-02-11 at 2 10 58 PM

a_trans

Screen Shot 2024-02-11 at 2 10 41 PM

ISSUE 1: a_trans has a very weird extent…the whole globe! But in totally the wrong units.

However, that’s not even the weirdest part. Let’s run that same exact code again, in the same R session and still without internet.

a_trans <- terra::project(a_vect, terra::crs(r))

No warning this time. What does a-trans look like now?

a_trans

Screen Shot 2024-02-11 at 2 13 35 PM

It looks good now. It found itself! Try the crop again:

good <- terra::crop(r, a.trans, snap = "out")

Successful!

ISSUE 2: terra::project() acts differently the second time around.

I see in #1373 that "grids are used for transforming coordinates between datums. They are downloaded as needed.". However if that's the case here, I would think that terra::project() should throw an error if it can't perform properly. But then again, it did figure out the second time how to perform properly even without the internet to save it!

rhijmans commented 9 months ago

It is very difficult to respond without a minimal, self-contained, reproducible example. It would seem possible to create an example from your data; but you complicated that by not including the R output describing you data. (you included screen-shots; please never do that). Can you edit your issue so that we can try to reproduce this?

kevinwolz commented 9 months ago

You'll see in the first sentence that I meant to attach the two files used in the example. I forgot to do that. They are attached now.

rhijmans commented 9 months ago

Thanks, I did miss that. Your data suggest this as a repex:

library(terra)
a <- vect("POLYGON ((491731 2267048, 491731 2268475, 493306 2268475, 493306 2267048, 491731 2267048))", crs="EPSG:5070")
r <- rast(ncols=119, nrows=115, nlyrs=1, xmin=490725, xmax=494295, ymin=2266035, ymax=2269485, names=c('NLCD Land Cover Class'), crs='PROJCRS[\"Albers_Conical_Equal_Area\",BASEGEOGCRS[\"WGS 84\",DATUM[\"World Geodetic System 1984\",ELLIPSOID[\"WGS 84\",6378137,298.257223563,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4326]],CONVERSION[\"Albers Equal Area\",METHOD[\"Albers Equal Area\",ID[\"EPSG\",9822]],PARAMETER[\"Latitude of false origin\",23,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8821]],PARAMETER[\"Longitude of false origin\",-96,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8822]],PARAMETER[\"Latitude of 1st standard parallel\",29.5,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8823]],PARAMETER[\"Latitude of 2nd standard parallel\",45.5,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8824]],PARAMETER[\"Easting at false origin\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8826]],PARAMETER[\"Northing at false origin\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8827]]],CS[Cartesian,2],AXIS[\"easting\",east,ORDER[1],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]],AXIS[\"northing\",north,ORDER[2],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]]]')

a_trans <- terra::project(a, terra::crs(r))
# etc

(Using actual files really complicates debugging)

kevinwolz commented 8 months ago

Sorry for the delay - confirming now that the repex you provided does produce the same warnings/errors.

brownag commented 8 months ago

I am not sure why the first call to terra::project() changes the CRS to the target projected system, while giving results in geographic coordinate extent---but I think the second time, when grids are presumably still not available, it could be falling back to a transformation which is slightly less accurate (and not using a grid to do so). The following may not be helpful, but I typed it all out so here ya go.

You can inspect pipelines PROJ uses for projection, with and without the datum shift grids, using sf::sf_proj_pipelines(). The argument grid_availability can be used to simulate what happens with and without network access.

For example:

library(terra)
#> terra 1.7.73
a <- vect("POLYGON ((491731 2267048, 491731 2268475, 493306 2268475, 493306 2267048, 491731 2267048))", crs="EPSG:5070")
r <- rast(ncols=119, nrows=115, nlyrs=1, xmin=490725, xmax=494295, ymin=2266035, ymax=2269485, names=c('NLCD Land Cover Class'), crs='PROJCRS[\"Albers_Conical_Equal_Area\",BASEGEOGCRS[\"WGS 84\",DATUM[\"World Geodetic System 1984\",ELLIPSOID[\"WGS 84\",6378137,298.257223563,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4326]],CONVERSION[\"Albers Equal Area\",METHOD[\"Albers Equal Area\",ID[\"EPSG\",9822]],PARAMETER[\"Latitude of false origin\",23,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8821]],PARAMETER[\"Longitude of false origin\",-96,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8822]],PARAMETER[\"Latitude of 1st standard parallel\",29.5,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8823]],PARAMETER[\"Latitude of 2nd standard parallel\",45.5,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8824]],PARAMETER[\"Easting at false origin\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8826]],PARAMETER[\"Northing at false origin\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8827]]],CS[Cartesian,2],AXIS[\"easting\",east,ORDER[1],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]],AXIS[\"northing\",north,ORDER[2],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]]]')

sf::sf_proj_pipelines(crs(a), crs(r), grid_availability = "AVAILABLE")
#> Candidate coordinate operations found:  53 
#> Strict containment:     FALSE 
#> Axis order auth compl:  FALSE 
#> Source:  PROJCRS["NAD83 / Conus Albers",
#>     BASEGEOGCRS["NAD83",
#>         DATUM["North American Datum 1983",
#>             ELLIPSOID["GRS 1980",6378137,298.257222101,
#>                 LENGTHUNIT["metre",1]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4269]],
#>     CONVERSION["Conus Albers",
#>         METHOD["Albers Equal Area",
#>             ID["EPSG",9822]],
#>         PARAMETER["Latitude of false origin",23,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8821]],
#>         PARAMETER["Longitude of false origin",-96,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8822]],
#>         PARAMETER["Latitude of 1st standard parallel",29.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8823]],
#>         PARAMETER["Latitude of 2nd standard parallel",45.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8824]],
#>         PARAMETER["Easting at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8826]],
#>         PARAMETER["Northing at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8827]]],
#>     CS[Cartesian,2],
#>         AXIS["easting (X)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1]],
#>         AXIS["northing (Y)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1]],
#>     USAGE[
#>         SCOPE["Data analysis and small scale data presentation for contiguous lower 48 states."],
#>         AREA["United States (USA) - CONUS onshore - Alabama; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming."],
#>         BBOX[24.41,-124.79,49.38,-66.91]],
#>     ID["EPSG",5070]] 
#> Target:  PROJCRS["Albers_Conical_Equal_Area",
#>     BASEGEOGCRS["WGS 84",
#>         DATUM["World Geodetic System 1984",
#>             ELLIPSOID["WGS 84",6378137,298.257223563,
#>                 LENGTHUNIT["metre",1]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4326]],
#>     CONVERSION["Albers Equal Area",
#>         METHOD["Albers Equal Area",
#>             ID["EPSG",9822]],
#>         PARAMETER["Latitude of false origin",23,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8821]],
#>         PARAMETER["Longitude of false origin",-96,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8822]],
#>         PARAMETER["Latitude of 1st standard parallel",29.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8823]],
#>         PARAMETER["Latitude of 2nd standard parallel",45.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8824]],
#>         PARAMETER["Easting at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8826]],
#>         PARAMETER["Northing at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8827]]],
#>     CS[Cartesian,2],
#>         AXIS["easting",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]],
#>         AXIS["northing",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]] 
#> Best instantiable operation has accuracy: 1.5 m
#> Description: Inverse of Conus Albers + NAD83 to WGS 84 (6) + Albers Equal
#>              Area
#> Definition:  +proj=pipeline +step +inv +proj=aea +lat_0=23 +lon_0=-96
#>              +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80
#>              +step +proj=hgridshift +grids=ca_nrc_NA83SCRS.tif
#>              +step +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5
#>              +lat_2=45.5 +x_0=0 +y_0=0 +ellps=WGS84
sf::sf_proj_pipelines(crs(a), crs(r), grid_availability = "DISCARD")
#> Candidate coordinate operations found:  4 
#> Strict containment:     FALSE 
#> Axis order auth compl:  FALSE 
#> Source:  PROJCRS["NAD83 / Conus Albers",
#>     BASEGEOGCRS["NAD83",
#>         DATUM["North American Datum 1983",
#>             ELLIPSOID["GRS 1980",6378137,298.257222101,
#>                 LENGTHUNIT["metre",1]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4269]],
#>     CONVERSION["Conus Albers",
#>         METHOD["Albers Equal Area",
#>             ID["EPSG",9822]],
#>         PARAMETER["Latitude of false origin",23,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8821]],
#>         PARAMETER["Longitude of false origin",-96,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8822]],
#>         PARAMETER["Latitude of 1st standard parallel",29.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8823]],
#>         PARAMETER["Latitude of 2nd standard parallel",45.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8824]],
#>         PARAMETER["Easting at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8826]],
#>         PARAMETER["Northing at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8827]]],
#>     CS[Cartesian,2],
#>         AXIS["easting (X)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1]],
#>         AXIS["northing (Y)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1]],
#>     USAGE[
#>         SCOPE["Data analysis and small scale data presentation for contiguous lower 48 states."],
#>         AREA["United States (USA) - CONUS onshore - Alabama; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming."],
#>         BBOX[24.41,-124.79,49.38,-66.91]],
#>     ID["EPSG",5070]] 
#> Target:  PROJCRS["Albers_Conical_Equal_Area",
#>     BASEGEOGCRS["WGS 84",
#>         DATUM["World Geodetic System 1984",
#>             ELLIPSOID["WGS 84",6378137,298.257223563,
#>                 LENGTHUNIT["metre",1]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",4326]],
#>     CONVERSION["Albers Equal Area",
#>         METHOD["Albers Equal Area",
#>             ID["EPSG",9822]],
#>         PARAMETER["Latitude of false origin",23,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8821]],
#>         PARAMETER["Longitude of false origin",-96,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8822]],
#>         PARAMETER["Latitude of 1st standard parallel",29.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8823]],
#>         PARAMETER["Latitude of 2nd standard parallel",45.5,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8824]],
#>         PARAMETER["Easting at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8826]],
#>         PARAMETER["Northing at false origin",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8827]]],
#>     CS[Cartesian,2],
#>         AXIS["easting",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]],
#>         AXIS["northing",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]] 
#> Best instantiable operation has accuracy: 4 m
#> Description: Inverse of Conus Albers + NAD83 to WGS 84 (1) + Albers Equal
#>              Area
#> Definition:  +proj=pipeline +step +inv +proj=aea +lat_0=23 +lon_0=-96
#>              +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80
#>              +step +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5
#>              +lat_2=45.5 +x_0=0 +y_0=0 +ellps=WGS84

a_trans <- project(a, crs(r))
crs(a_trans) == crs(r)
#> [1] TRUE

Note when grids are available you have 53 candidate operations, best accuracy 1.5m, the pipeline used is:

#> Definition:  +proj=pipeline +step +inv +proj=aea +lat_0=23 +lon_0=-96
#>              +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80
#>              +step +proj=hgridshift +grids=ca_nrc_NA83SCRS.tif
#>              +step +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5
#>              +lat_2=45.5 +x_0=0 +y_0=0 +ellps=WGS84

Whereas when grids are discarded you have 4 candidate operations, best accuracy 4m:

#> Definition:  +proj=pipeline +step +inv +proj=aea +lat_0=23 +lon_0=-96
#>              +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80
#>              +step +proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5
#>              +lat_2=45.5 +x_0=0 +y_0=0 +ellps=WGS84

I have studied this kind of issue in some detail for datum shifts to/from NAD27, where the accuracy can be (or at least was at the time) severely degraded depending on source CRS and when the grids were absent--sometimes resulting in "ballpark" transformations... but in this case (AEA in NAD83 vs. WGS84) it seems the accuracy is not that different without +step +proj=hgridshift +grids=ca_nrc_NA83SCRS.tif step in the pipeline

I think another issue with reproducibility here is that the grid files can be cached, and info stored in the PROJ database. I was initially able to reproduce the PROJ errors/warnings with terra and sf, but after running sf::sf_proj_network(enable=TRUE) with the network available the projection in subsequent runs works without warning, the first time, even if network connection is subsequently lost.

After removing "/home/andrew/.local/share/proj/cache.db" (find yours with sf::sf_proj_search_paths()) you get the same warnings/PROJ errors back. The warnings also occur when running sf::sf_proj_pipelines(crs(a), crs(r), grid_availability = "AVAILABLE") when cache.db is incomplete and network disabled.