jeroen / protolite

Fast and Simple Object Serialization to Protocol Buffers
Other
48 stars 8 forks source link

crs argument for read_mvt_sf #14

Closed dmcglinn closed 4 years ago

dmcglinn commented 5 years ago

Thank you so much for your recent implementation of read_mvt_sf this has really helped out my research group that is working with rgbif to query GBIF mvt data files. One quick note is that as currently implemented the function read_mvt_sf naively assigns the crs for the incoming mvt file as EPSG:4386. It would be ideal if this was instead an argument that could be specified on import. For example when using the function rgbif::map_fetch as implemented on the map-fetch-mvt branch of that repo the resulting sf file has the following metadata associated with it:

$occurrence
Simple feature collection with 13349 features and 1 field
geometry type:  POINT
dimension:      XY
bbox:           xmin: -225 ymin: -45.08904 xmax: 224.2969 ymax: 82.76537
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs

as noted in https://github.com/ropensci/rgbif/issues/373 the field called epsg (SRID) is incorrectly labeled in this case because this mvt file was generated from a file that is in EPSG:3857.

@jeroen would you like to implement that or would you like me to submit a PR for this?

Thanks!

jeroen commented 5 years ago

Can you submit a PR please?

jeroen commented 5 years ago

And maybe add an example and/or test so that I understand what is going on :)

jeroen commented 5 years ago

Maybe we should do the projection using st_transform()? Currently we hardcode it here, but perhaps sf can do a better job.

https://github.com/jeroen/protolite/blob/4c58b19d3add92a1d0fc59425e48b88e68469a51/R/mapbox.R#L109-L115

dmcglinn commented 5 years ago

This may be more complex then I thought, but I was naively assuming that on the following lines: https://github.com/jeroen/protolite/blob/4c58b19d3add92a1d0fc59425e48b88e68469a51/R/mapbox.R#L63-L74

where you have specified crs = 4326 that we would simply allow the user to specify a different EPSG code. I suppose the caveat here is that coordinates must be in decimal degrees as currently implemented so only a subset of EPSG codes that allow decimal degree coordinates would be valid.

here is a full example to demonstrate the current problem with the existing code:

remotes::install_github("ropensci/rgbif@map-fetch-mvt")
library(rgbif)

x <- map_fetch(taxonKey = 2480498, srs = "EPSG:3857", format = ".mvt")
# note map_fetch is calling protolite::read_mvt_sf
x
$occurrence
Simple feature collection with 13349 features and 1 field
geometry type:  POINT
dimension:      XY
bbox:           xmin: -225 ymin: -45.08904 xmax: 224.2969 ymax: 82.76537
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs

So as you can see above although the srs argument to rgbif::map_fetch was "EPSG:3857" the returned object has epsg set to 4326. This doesn't mess the map up in any way in this case it just is incorrect metadata on the object. If the function map_fetch could pass the srs argument to read_mvt_sf then this would be solved for this specific case.

jeroen commented 5 years ago

I think I had tried 3857 but somehow that didn't work. Maybe you can give it a try.

The coordinates that we get from mvt are just in the [0-1, 0-1] range. Is there a epsg code for this? Then I may be able to feed it to sf and convert it to 4326.

Currently I convert it to lat/lon using the formula below, which I thought was called 4326? https://github.com/jeroen/protolite/blob/4c58b19d3add92a1d0fc59425e48b88e68469a51/R/mapbox.R#L109-L115

I'm not a spacial expect; if there is a better way to do the projection, please send a pr!

jeroen commented 5 years ago

@dmcglinn @sckott I have made some changes, can you try this again?

sckott commented 4 years ago

looks like it's working to me, but want to make sure @dmcglinn agrees

map_fetch(taxonKey = 2480498, year = 2010, format = ".mvt", srs="EPSG:3857", verbose = TRUE)
#> GET /v2/map/occurrence/density/0/0/0.mvt?srs=EPSG%3A3857&taxonKey=2480498&year=2010&style=classic.point
#> $occurrence
#> Simple feature collection with 1963 features and 1 field
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: -25046890 ymin: -5322463 xmax: 24968610 ymax: 11349370
#> epsg (SRID):    3857
#> proj4string:    +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
#> First 10 features:
#>    total                  geometry
#> 1     10 POINT (-11740728 4696291)
#> 2      4 POINT (4226662 -313086.1)
#> 3      1  POINT (1956788 -3052589)
#> 4      1  POINT (1956788 -3287404)
#> 5      2 POINT (-13071343 5948635)
#> 6      1  POINT (1956788 -3365675)
#> 7      1  POINT (11427641 1565430)
#> 8     16  POINT (1956788 -3443947)
#> 9      1  POINT (-8531595 5557278)
#> 10     2  POINT (1956788 -3600490)
jeroen commented 4 years ago

I think this should all work now in protolite 2.0. If you find any issues, please open a new issue!