Closed timadriaens closed 2 years ago
This workflow, as well as the one for the RIPARIAS climate matching, would be good to consolidate as a package/function, the same way it is done for TrIAS. I think there is great scope for it to be used worldwide by the invasion community, for example for horizon scanning, drafting alert lists, risk assessment etc.
Functionalities:
coordinate uncertainty
(see)BasisOfRecord
@SanderDevisscher are there any other functionalities you can think of?
Maybe it could be useful to be able set the Region you do the matching for (default belgium). can be achieved by (belgium in example):
region <- "belgium"
library(rworldmap)
worldmap <- getMap(resolution = "low")
region_shape <- subset(worldmap, tolower(worldmap$ADMIN) == tolower(region))
plot(region_shape)
close enough :-)
@damianooldoni one question though how do I best store/access the Koppen Geiger shapes ? Currently the are saved as shapes in the riparias-prep repository. I can access these via readOGR(resp. githublinks) but I don't know if this is a good long term solution.
indeed, setting the country (and perhaps also the continent as many horizon scans are made at continental level) - I am forgetting the obvious :-)
(and perhaps also the continent as many horizon scans are made at continental level)
library(rworldmap) worldmap <- getMap(resolution = "low") region <- "AfriCa" if(tolower(region) %in% c("south america", "asia", "africa", "europe", "australia", "antarctica", "north america"){ region_shape <- subset(worldmap, tolower(worldmap$REGION) == tolower(region)) }
plot(region_shape)
![image](https://user-images.githubusercontent.com/16779320/117472444-2dc88c00-af59-11eb-8a37-ffb557cf7592.png)
A package should be self contained. Based on the R Packages book I often consult, I would add the shapefiles to the package (as .RData). In this way, as we have in package DESCRIPTION
, option LazyData: true
, the shapefile will be lazily loaded. This means that it won't occupy any memory until you use it.
@SanderDevisscher: could you please tell me exactly which shapefiles the function will need (directly or indirectly)? Then I can already add them to the package. Thanks. In this way, you can just use them, exaclyt like you can use matcars
when loading dplyr
:smile:
@SanderDevisscher: could you please tell me exactly which shapefiles the function will need (directly or indirectly)? Then I can already add them to the package. Thanks. In this way, you can just use them, exaclyt like you can use
matcars
when loadingdplyr
😄
@damianooldoni currently we use the shapes in https://github.com/inbo/riparias-prep/tree/master/data/spatial/koppen-geiger. I can transform these into .geojson if needed. However we are now discussing whether or not these 2 match sufficiently accurately (see https://github.com/inbo/riparias-prep/discussions/19). This can result in the need to include more shapes in the future.
Transformation in .geojson
is not needed as we will save them in trias package as R objects (.Rdata).
I have just now checked adding perimeter_shape
shapefile as sf data.frame and it worked flawlessly:
class(trias::perimeter_shape)
#> [1] "sf" "data.frame"
nrow(trias::perimeter_shape)
#> [1] 3
Created on 2021-05-10 by the reprex package (v2.0.0)
I would use as much as possible sf
instead of "old" and less maintained sp
package, unless some functionalities have still not be rewritten to support sf (not all pkgs maintainers updated their functions to work with sf as well). But this is a minor issue, as we can save sp
objects as .Rda
files as I have just now tested with sf
objects.
So, if I well understand, it's better to wait until https://github.com/inbo/riparias-prep/discussions/19 is solved before adding objects to trias package?
@damianooldoni I'm more familiar with the sp
package.
So, if I well understand, it's better to wait until inbo/riparias-prep#19 is solved before adding objects to trias package?
~I'm very uncertain this will go anywhere since I can't access the website containing these other shapes.~ UPDATE: A work arround has been found waiting for solution is advisory
great :-) Just let me know in this issue when and what I should add (I can sf object add.:you can always translate them in sp in the function. transforming to sf can be done in a later stage when I have more time.
@damianooldoni an update. I've rewritten the climate matching flow according to https://github.com/inbo/riparias-prep/issues/23. Its currently in review see https://github.com/inbo/riparias-prep/pull/26. When this PR is approved I'll start with the function.
Thanks @SanderDevisscher for the update. Do you know also which shapefiles should be added to the package based on the last climate matching workflow?
@damianooldoni since the PR has been approved I'll start working on the function. @timadriaens I'll leave the plotting of maps for now and start with creating the function according to this flow chart. Notice I plan export the occurrence data as spatialpoints this should allow easy plotting on a map after the function has done its work.
@damianooldoni did you upload the shapes in a branch ? ifso which branch ? shall I continue in this branch?
@SanderDevisscher: no, still not. Maybe you can start a new branch where you can add the shapefiles as Rdata
as first commit?
How to describe them? Via a data.R
file. Get inspired by this example, or this one . Notice how you can indeed split data documentation in different file, but all of them starting with data
(in dplyr you have data-bands.R, data-starwars.R, etc. .
We can split ours in data-observed.R and data-future.R if there a lot of shapefiles to be described.
The .rda
files should be placed in ./data/
folder.
73_climate_matching_function
- branch
as you wish. I am not very picky about branch names :smile:
it is just a headsup I'll be working on this branch ;-)
@damianooldoni I'm having an issue downloading occurrence data from gbif
Error in occ_download_get(set1, overwrite = TRUE) :
response content-type != application/octet-stream; q=0.5
do you have any idea how to solve this ?
@SanderDevisscher : about https://github.com/trias-project/trias/issues/73#issuecomment-844843053 could you give me more details about the error and what set1
is?
@damianooldoni set1 is the list of predicates to setup the download. In this case:
taxonkey_set1 <- pred_in("taxonKey", taxonkey)
set1 <- occ_download(taxonkey_set1,
pred("hasCoordinate", TRUE),
user = gbif_user,
pwd = gbif_pwd,
email = gbif_email)
I get the error when running:
repeat{
Sys.sleep(time = 5)
test_set1 <- occ_download_meta(set1)
if(test_set1$status == "SUCCEEDED"){
data <- occ_download_get(set1,
overwrite = TRUE) %>%
occ_download_import()
break()
}
print(test_set1$status)
}
also when I only run this part:
data <- occ_download_get(set1,
overwrite = TRUE) %>%
occ_download_import()
I see this on https://www.gbif.org/health
This could be the reason.
Highly likely
@SanderDevisscher: I have improved documentation and structure of Rda climate shapes data in https://github.com/trias-project/trias/commit/dcb01b310d42149e43eb338405e0eff223c38eb0. The main issue is that the name you mention in ./R/data.R should be equal to the data.frame within the Rda file. So, I found the best way to solve this by making a list of climate shapes. In this way, observed.Rda contains an object, a list, called observed
with 5 slots and each slot is a spatial dataframe. You can unlist it before using it, or benefit of the list structure if you like. Same for future
and legend
. I improved the name of a legend slot: now both legends are called like the models they refer to: A1FI
and Beck
. Personally, I would remove also the prefix KG_
(what does it mean???) and the suffix _legend
(not needed and present only in one of the legends).
I also improved the documentation: notice how you can document three datasets in one file by means of @name for the first dataset and @rdname for the other ones, so linking to documentation of the first one. By using @family we can automatically link documentation of the functions you are going to add to this documentation as well. Just add same @family field in function doc.
One question: is the overlapping in observed
data correct? I see a shapefile 1976-2000
and a shapefile 1980-2016
. Is this done to have the same duration, 25 years? Thanks.
Personally, I would remove also the prefix KG_ (what does it mean???) and the suffix _legend (not needed and present only in one of the legends).
The KG
comes from Köppen–Geiger climate classification system see https://en.wikipedia.org/wiki/K%C3%B6ppen_climate_classification.
One question: is the overlapping in
observed
data correct? I see a shapefile1976-2000
and a shapefile1980-2016
. Is this done to have the same duration, 25 years? Thanks.
@damianooldoni the shapefile from 1976 - 2000
(and prior) are from Rubel & Kottek 2010 while the one from 1980-2016
is the observed climate according to Beck et al. We've decided to use the 1980-2016
- shape for the observations after 2000, mainly because the 2001 - 2025
shape from Rubel & Kottek 2010 (in the future datapack) is a simulation, not a observation.
Thanks @SanderDevisscher for clarifying this. I have to fine tune the documentation then as this is not true. About KG
in legend, of course :facepalm: I will just have to add Köppen and Geiger
in the legend
documentation.
Voila. Done. Please, give a look. If it's ok, we can move on :partying_face:
After discussion with @timadriaens we decided to try to add 2 map outputs see updated flowchart:
indeed @SanderDevisscher, well done. Good to have some map outputs showing suitable area for the species (and with the realised niche also) under current and future climate, and also a map which uses the %overlap in the legend as a degree of climatic suitability.
The idea was to corroborate the workflow using a set of species currently under PRA for Europe, and for which detailed species distribution models have been drafted to assess climatic and environmental suitability. It would therefore be nice to test it on this set of species:
Species | GbifKey |
---|---|
Marisa cornuarietis | 2292586 |
Cherax destructor | 4417558 |
Mulinia lateralis | 2286388 |
Obama nungara | 8701771 |
Bipalium kewense | 2502938 |
Pycnonotus jocosus | 2486151 |
Cervus nippon | 2440954 |
Delairea odorata | 3134183 |
can you try run the matching?
@timadriaens this are the outputs of the current version of the function for the specieslist you provided. These are the settings used:
climate_match(region = "belgium", taxonkey = c(2292586,
4417558,
2286388,
8701771,
2502938,
2486151,
2440954,
3134183),
zipfile)
_Notice: not setting n_totaal
, perc_climate
, coord_unc
, BasisOfRecord
results in no respective filters being applied._
OK tx, useful! Can you trial the same workflow at the EU level?
Rerun at European (geographic europe) level
Tx, great! I included it in the PRA (let's see if they like it...): "Using the climate matching methodology of Faulkner et al. (2014), and using the available validated distribution data (N = 7,208) of P. cafer in the native and invasive range gathererd for the species distribution model (Annex VIII), we determined the degree (%) overlap with Köppen Geiger climate classification classes (Beck et al. 2018) present in the risk assessment area. This shows that under current climate (2001-2025), there are similarly suitable climates for P. jocosus in the risk assessment area, primarily humid subtropical climate (Cfa, 36% overlap), temperate oceanic climate (Cfb, 16% overlap), to a lesser extent other climates such as hot-summer Mediterranean climate (Csa, 3% overlap) which are present in a number of EU Member States (cf. https://www.plantmaps.com/koppen-climate-classification-map-europe.php). "
It would be useful to include the names of the climate classes on top of the codes in the outputs and to provide map outputs as we discussed (using a colour scaling representing the % overlap). In any case: straightforward, but really great work!
@SanderDevisscher quick cross-check with the results of a detailed SDM: Croatia, Italy and Greece do have Cfa climates so that is corroborated. Portugal and Spain have considerable area of Csa climates. So it looks like the rough climate matching is refined with the SDM but doesn't contradict it, which is great. Here is for instance the predicted suitability under current climate from the SDM, it would be nice to be able to compare this with some map outputs from our crude climate matching workflow
we determined the degree (%) overlap with Köppen Geiger climate classification classes (Beck et al. 2018) present in the risk assessment area.
should be: we determined the degree (%) overlap with Köppen Geiger climate classification classes (Rubel & Kottek 2010, Beck et al. 2018) present in the risk assessment area.
Since we used both Rubel & Kottek (observations prior to 2001) & Beck (observations since 2001). Both are updates of the original Köppen Geiger maps.
tx! updated...
@damianooldoni can you update these files in the legend.rda ?
@timadriaens this is a first draft of current climate matching map for Marisa cornuarietis (Linnaeus, 1758) version 1: na.color = "#e0e0e0"
version 2: na.color = "#053061"
which do you prefer ?
the first one looks awesome (for sure the mainland background should be pale/white). Can we make sure we use a colour blind proof palette? Also, if the sea is blue (which I prefer), we should perhaps not use blue for suitable area. How come there is no red on the map?
And a final thought: the output is a table with KG and %suitable, and a map showing where that is, but still you cannot link the table with the map (i.e. you don't know where the KG climate classes are and with which colours they correspond). Would there be a way to visualize the KG class on the map perhaps?
@SanderDevisscher: while working on updating legend.rda as asked in https://github.com/trias-project/trias/issues/73#issuecomment-867524195, I see that the structure of KG_A1FI.txt is quite different and there are no column names anymore. How should I interpret this file? The structure of both legends up to now is to provide two columns, GRIDCODE
and Classification
. If this changes, please let me know something more so that I can update the documentation. Thanks a lot.
@timadriaens I updated the current climate map to work with multiple species. I also added a popup and changed the color of the sea, see below:
@SanderDevisscher: while working on updating legend.rda as asked in #73 (comment), I see that the structure of KG_A1FI.txt is quite different and there are no column names anymore. How should I interpret this file? The structure of both legends up to now is to provide two columns,
GRIDCODE
andClassification
. If this changes, please let me know something more so that I can update the documentation. Thanks a lot.
Somehow the export got corrupted !! I'll look into it.
I used write_rds for KG_A1FI :-/
should be fixed now KG_A1FI.txt
@timadriaens I finished both the maps you requested. The function now also exports 1 map displaying the current climate match in the "current_map" - slot & a list of future scenario climate matching maps in the "future_maps" - slot.
In dit voorbeeld de climate matching voor de periode 2076 - 2100 van de goldenhorn
@damianooldoni This function is allmost ready for review. Just the legend files need to be updated.
@timadriaens Do have some time to look at the function in the coming days ?
Hi @SanderDevisscher: legend rda file is updated 👍 I also improved the documentation of the rda files: I split such documentation in three different files as it improved a lot the readibility. I made some basic maintenance of the package as well. Notice also that you can use normal Markdown syntax to write package documentation in R since a year or so. This is enabled by line Roxygen: list(markdown = TRUE)
in DESCRIPTION.
@damianooldoni within the RIPARIAS project @SanderDevisscher and me developed a procedure for climate matching, the idea is described in this issue and further developed in the RIPARIAS repo for crayfish and aquatic plants.
This issue is meant to make this into a function within the trias package.