Envirometrix / LandGISmaps

Processing of global environmental layers at various resolutions

SoilGrids vs LandGIS values and averaged responses for user-defined buffer around spatial point #7

Closed thatrandomlurker closed 3 years ago

thatrandomlurker commented 5 years ago

I discovered that, for the same longitude and latitude locations, similar variables (e.g. clay percentage) differ between SoilGrids and LandGIS. Is this to be expected and, if so, which database do you recommend using?

The GPS locations I have are farm locations. However, the actual fields are spread around the farms. With raster::extract() I would be able to return values averaged over a defined buffer zone around each point. Given that I am not able to download the gridded files and extract the values myself, is it possible to obtain averaged values within a buffer when querying data from the aforementioned APIs?

thengl commented 5 years ago

Yes, the differences between SoilGrids and LandGIS are to be expected. LandGIS (Dec 2018) is more up to date than the last version of SoilGrids (Jul 2017). With LandGIS we have also added more covariates and more points (read more about all potential improvements at: https://github.com/Envirometrix/LandGISmaps#soil-properties-and-classes).

You can download the most up-to-date LandGIS clay percentages directly from: https://doi.org/10.5281/zenodo.1476854

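After that you can extract values using the buffer functionality of the raster package, e.g. (pnts stands for your own point object; the name is only a placeholder):

# pnts = SpatialPoints of the farm locations (placeholder name); for lon/lat rasters the buffer is given in meters; fun=mean averages the cells inside each buffer
x = raster::extract(raster("sol_clay.wfraction_usda.3a1a1a_m_250m_b10..10cm_1950..2017_v0.2.tif"), pnts, buffer=2000, fun=mean, na.rm=TRUE)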

Depending on the number of points you use, this might take a while, so it is best to run it in parallel.

You can also extract values from larger tifs without downloading them by using the LandGIS REST API (e.g. https://landgisapi.opengeohub.org/query/point?lat=51.98488013991664&lon=5.629119873046874&coll=predicted250m&regex=sol_clay.wfraction_usda.3a1a1a_m_250m_.*_1950..2017_v0.2.tif). The complete tutorial is available here: https://github.com/Envirometrix/LandGISmaps#accessing-data and here: https://github.com/Envirometrix/LandGISmaps/blob/master/tutorial/Access_LandGIS.R. This works smoothly with up to 50 points.
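For anyone scripting these requests, here is a minimal sketch of how the point-query URL above could be assembled. It only builds the URL string and does not contact the server. The helper name point_query_url is made up for illustration, and the parameter names (lat, lon, coll, regex) are simply copied from the example URL rather than from an API specification.

```python
from urllib.parse import urlencode

# Endpoint copied from the example URL in this thread.
LANDGIS_POINT_URL = "https://landgisapi.opengeohub.org/query/point"


def point_query_url(lat, lon, regex, coll="predicted250m"):
    """Return a point-query URL for one location (hypothetical helper)."""
    params = {"lat": lat, "lon": lon, "coll": coll, "regex": regex}
    # safe="*" keeps the regex wildcard unescaped, as in the hand-written URL.
    return LANDGIS_POINT_URL + "?" + urlencode(params, safe="*")


url = point_query_url(
    51.98488013991664,
    5.629119873046874,
    "sol_clay.wfraction_usda.3a1a1a_m_250m_.*_1950..2017_v0.2.tif",
)
print(url)
```

Looping this over a table of coordinates gives one URL per point. Given the note above about roughly 50 points, it is probably wise to keep batches small.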

To use REST to overlay with buffers, you would first have to create buffers around the points (see gBuffer from the rgeos package), then randomly sample within them using spsample or similar (radial sampling would be even more appropriate), query each sampled location, and average the results. Let me know if this works for you.
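The radial-sampling idea can be sketched without any GIS packages. The example below is an illustration under stated assumptions, not the gBuffer/spsample workflow itself. It places points on concentric rings around a location using a spherical-Earth, small-offset approximation, which is reasonable for buffers of a few kilometers. The function names, the number of rings, and the number of points per ring are all made up. Fetching a value for each sampled point is left out.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def radial_sample(lat, lon, radius_m, n_rings=2, n_per_ring=8):
    """Return (lat, lon) tuples: the center plus points on concentric rings."""
    points = [(lat, lon)]
    for ring in range(1, n_rings + 1):
        r = radius_m * ring / n_rings  # ring radius in meters
        for k in range(n_per_ring):
            bearing = 2 * math.pi * k / n_per_ring
            # Convert meters north/east into degrees (small-offset approximation).
            dlat = math.degrees(r * math.cos(bearing) / EARTH_RADIUS_M)
            dlon = math.degrees(
                r * math.sin(bearing)
                / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
            )
            points.append((lat + dlat, lon + dlon))
    return points


def buffer_mean(values):
    """Average the sampled values, skipping missing ones (None)."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


pts = radial_sample(51.98488013991664, 5.629119873046874, radius_m=2000)
print(len(pts))  # center + n_rings * n_per_ring points
```

Each location then costs one query per sampled point, so this route probably only suits small point sets. Equal point counts per ring also weight the center of the buffer more heavily than its edge, which may or may not be what you want.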

thatrandomlurker commented 5 years ago

Thank you very much for your detailed response! With this in mind, I will use LandGIS values within my analysis. I had already discovered the direct downloads and also obtained point-responses using the REST API. Thank you though for summarizing this again, for future reference. My main problem with the REST API was the inability to obtain averaged results for buffers. I will look into gBuffer. However, my dataset already has around 12,000 spatial points.

One small piece of feedback regarding the direct downloads: on SoilGrids, adjusting the extent prior to downloading a layer eased the task dramatically. All the layers I needed sum to just 500 MB this way. Maybe this could also be implemented for LandGIS in the future?

thengl commented 5 years ago

It is also possible to download smaller chunks of the GeoTIFFs by using the Web Coverage Service (WCS). This is explained in this post. To download just the part of the clay map needed for your area you could use e.g.:

https://geoserver.opengeohub.org/landgisgeoserver/ows?service=WCS&version=2.0.1&request=GetCoverage&coverageId=predicted250m:sol_clay.wfraction_usda.3a1a1a_m_250m_b30..30cm_1950..2017_v0.2&subset=Lat(41,45)&subset=Long(32,35)

As you increase the bounding box you will start getting error messages (we had to limit the RAM usage of requested objects, otherwise the whole server would suffer), but for smaller areas (<300 x 300 km) it works smoothly.
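If you need several layers or several windows, the GetCoverage URL above can be generated rather than edited by hand. This is a small sketch that only assembles the string. The helper name wcs_subset_url is hypothetical. The endpoint, the axis labels (Lat, Long), and the query parameters are taken directly from the example URL, not from a capabilities document.

```python
# Endpoint copied from the example URL in this thread.
WCS_ENDPOINT = "https://geoserver.opengeohub.org/landgisgeoserver/ows"


def wcs_subset_url(coverage_id, lat_min, lat_max, lon_min, lon_max):
    """Build a WCS 2.0.1 GetCoverage URL for a lat/lon window (hypothetical helper)."""
    if lat_min >= lat_max or lon_min >= lon_max:
        raise ValueError("min must be smaller than max on both axes")
    return (
        f"{WCS_ENDPOINT}?service=WCS&version=2.0.1&request=GetCoverage"
        f"&coverageId={coverage_id}"
        f"&subset=Lat({lat_min},{lat_max})&subset=Long({lon_min},{lon_max})"
    )


url = wcs_subset_url(
    "predicted250m:sol_clay.wfraction_usda.3a1a1a_m_250m_b30..30cm_1950..2017_v0.2",
    41, 45, 32, 35,
)
print(url)
```

For a larger study area, one option would be to call this for several smaller windows and mosaic the downloaded tiles afterwards, so that each request stays within the server's limit.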

thatrandomlurker commented 5 years ago

Thank you very much! This worked like a charm for my desired extent.