prioritizr / wdpar

Interface to the World Database on Protected Areas
https://prioritizr.github.io/wdpar
GNU General Public License v3.0

Feature Request: Add functionality to keep UNESCO sites and not yet implemented areas #33

Closed Jo-Schie closed 3 years ago

Jo-Schie commented 3 years ago

Hi @jeffreyhanson. I understand the logic for excluding the UNESCO sites and not-yet-implemented areas from a country reporting perspective. However, for assessing protected areas for, e.g., planning purposes, it would still be nice to keep them. Could you provide a TRUE/FALSE parameter to keep these areas if the user wishes (similar to the existing option for keeping overlapping polygons)?

jeffreyhanson commented 3 years ago

Hi @Jo-Schie, yeah, that's pretty straightforward. I'll push a new version to GitHub in a few minutes with a new exclude_unesco parameter that lets you do this. Can you give it a spin and see if it does what you want?

Jo-Schie commented 3 years ago

Hi @jeffreyhanson. It works very well for me with the command to keep UNESCO sites. One small question though: in the comment above I also asked for an option to keep the not currently implemented areas as well. In the documentation for the wdpa_clean function you state that:

2. Exclude protected areas that are not currently implemented (i.e. exclude areas without the status "Designated", "Inscribed", "Established").

So I just tried this out for Brazil and it seemed to me that these areas are kept anyway (see my example below, where the 220 proposed areas are not removed). So I wondered whether this description is outdated or whether this is some kind of bug. Don't get me wrong: I actually would like to keep them and just delete them afterwards if necessary, but users might be confused if the documentation says these areas are removed during the routine and then they are not.

Here is an example

# test the new functionality of wdpar package
library("devtools")
install_github("https://github.com/prioritizr/wdpar")
library(wdpar)

wdpar_raw <- wdpa_fetch("Colombia")

# ---- unesco -----
# how many UNESCO sites
table(wdpar_raw$DESIG_ENG)
table(wdpar_raw$STATUS)

# exclude UNESCO sites
wdpar_raw_clean <- wdpa_clean(wdpar_raw, exclude_unesco = TRUE)

# how many UNESCO sites
table(wdpar_raw_clean$DESIG_ENG)

# include UNESCO sites
wdpar_raw_clean2 <- wdpa_clean(wdpar_raw, exclude_unesco = FALSE)

# how many UNESCO sites
table(wdpar_raw_clean2$DESIG_ENG)

# ----- not yet implemented -----
# how many proposed areas
wdpar_raw <- wdpa_fetch("Brazil")
table(wdpar_raw$STATUS) # there are 220 proposed areas

# clean data (keeping UNESCO sites)
wdpar_raw_clean2 <- wdpa_clean(wdpar_raw, exclude_unesco = FALSE)

table(wdpar_raw$STATUS) # there are still 220 proposed areas

jeffreyhanson commented 3 years ago

Hi @Jo-Schie,

Thanks for confirming that the exclude_unesco parameter works. I'm sorry, I was in a hurry and missed your comment about the statuses. The protected areas with a "Proposed" status should definitely be excluded under the current version of wdpar. I'm having trouble reproducing this bug on my computer though. Could you please try running the code below?

# load package
library(wdpar)

# fetch Brazil data
raw_pa <- wdpa_fetch("Brazil")

# show number of protected areas with different statuses
table(raw_pa$STATUS) # there are 220 proposed areas

# clean the data
## note that erase_overlaps = FALSE is just to reduce time required for this example
clean_pa <- wdpa_clean(raw_pa, exclude_unesco = FALSE, erase_overlaps = FALSE)

# show number of protected areas with different statuses in the cleaned dataset
table(clean_pa$STATUS)

If I'm not mistaken, I think there is a typo in the last line of your code and it should be:

table(wdpar_raw_clean2$STATUS) # there are still 220 proposed areas

Is that right? Sorry if I'm missing something.

I can add functionality so that no protected areas are excluded based on their status. What would be a good name for the parameter? How does exclude_status sound, where exclude_status = TRUE means that areas with a "Proposed" status would be excluded?

Jo-Schie commented 3 years ago

Dear @jeffreyhanson. You are right. My code had a typo at the end. Sorry for the confusion. Your suggestion sounds great.

I would maybe call it something like 'status', with different options such as 'All', 'Implemented', 'Proposed', etc., and people could choose to their liking and provide either one or several categories as a string...

jeffreyhanson commented 3 years ago

Ok great - no worries! I've pushed a new version to GitHub with a retain_status parameter to specify which statuses you want to retain during the cleaning process. You can either (1) specify exactly which statuses you want to keep, or (2) specify a NULL value such that no protected areas are excluded based on their status. Could you please give it a try and see if it does what you want?
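
To illustrate, something along these lines should work (a minimal sketch, not run here):

# load package
library(wdpar)

# fetch data
raw_pa <- wdpa_fetch("Brazil")

# keep only areas with particular statuses during cleaning
clean_designated <- wdpa_clean(
  raw_pa, retain_status = c("Designated", "Inscribed", "Established")
)

# keep all areas regardless of status
clean_all <- wdpa_clean(raw_pa, retain_status = NULL)

# compare statuses in the two cleaned datasets
table(clean_designated$STATUS)
table(clean_all$STATUS)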

jeffreyhanson commented 3 years ago

@Jo-Schie, just to follow up, did that work for you?

Jo-Schie commented 3 years ago

Hi @jeffreyhanson. Currently I get a topology exception error when processing Brazil.

# load package
remotes::install_github("https://github.com/prioritizr/wdpar")
library(wdpar)

# how many proposed areas
wdpar_raw <- wdpa_fetch("Brazil")
table(wdpar_raw$STATUS) # there are 220 proposed areas

# retain all areas regardless of status
wdpar_raw_clean2 <- wdpa_clean(wdpar_raw, retain_status = NULL)

# show number of protected areas with different statuses in the cleaned dataset
table(wdpar_raw_clean2$STATUS)

gives

retaining all areas (i.e. not removing areas based on status): ✓
removing UNESCO Biosphere Reserves: ✓
removing points with no reported area: ✓
repairing geometry: ✓
wrapping dateline: ✓
repairing geometry: ✓
projecting areas: ✓
repairing geometry: ✓
buffering by zero: ✓
buffering points: ✓
repairing geometry: ✓
snapping geometry to grid: ✓
repairing geometry: ✓
formatting attribute data: ✓
erasing overlaps: …
[==============================>-] 3095/3188 ( 97%) eta: 18s
Error in CPL_geos_op2(op, x, y) : 
  Evaluation error: TopologyException: found non-noded intersection between LINESTRING (-6.12702e+06 269849, -6.12694e+06 269914) and LINESTRING (-6.12699e+06 269885, -6.12702e+06 269849) at -6127015.940133037 269849.37915742869.

Jo-Schie commented 3 years ago

I'm still looking for another country that has other proposed PAs. So far I have only found Brazil as a case, but I will keep on digging.

jeffreyhanson commented 3 years ago

Thanks for raising this issue with the topology error. I'll take a look at this now.

jeffreyhanson commented 3 years ago

It looks like Brazil has a lot of protected areas and some very complex geometries. May I ask what you intend to do with the data? Depending on what you plan to do next, you might be able to skip the erase-overlaps part of the processing. For example, if you want to calculate protected area coverage, then you could just dissolve (or rasterize) the polygons in one step, and this would take care of any overlaps without having to loop through each and every geometry and handle overlaps individually (which is the default behaviour of wdpa_clean).
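
For example, here is a rough sketch of the dissolve approach for computing coverage (a minimal sketch, not run here):

# load packages
library(wdpar)
library(sf)
library(units)

# fetch and clean data, skipping the slow per-polygon overlap erasure
raw_pa <- wdpa_fetch("Brazil")
clean_pa <- wdpa_clean(raw_pa, erase_overlaps = FALSE)

# dissolve all polygons into a single geometry; overlaps are handled in this one step
dissolved <- st_union(st_geometry(clean_pa))

# total protected area coverage in km^2
set_units(st_area(dissolved), "km^2")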

jeffreyhanson commented 3 years ago

@Jo-Schie, after playing around with some of the parameters for wdpa_clean, I was able to successfully clean the Brazil dataset (including the procedure to erase overlaps) by increasing the precision of the spatial data processing. Specifically, I used this code:

wdpar_raw_clean2 <- wdpa_clean(wdpar_raw, retain_status = NULL, simplify_tolerance = 1, geometry_precision = 50000)

Note that using this precision (much, much higher than the default) will increase processing time. Although this level of precision is needed for the Brazil dataset, it's not needed for other countries' datasets --- so I don't really want to set this as the default precision in wdpar.

If you want to automatically process WDPA data without the risk of geometry errors, then you could simply run all the cleaning procedures with this high level of precision (but this could result in much longer run times than needed). Or you could try writing some clever code to adaptively set this precision (e.g. using tryCatch to first attempt cleaning a given dataset at the default precision, and then, if that fails, trying a higher level of precision).
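
For instance, a rough sketch of that adaptive idea (a hypothetical wrapper, not part of wdpar):

# hypothetical helper: try cleaning at the default precision first, and
# retry at the higher precision used for Brazil if a geometry error occurs
clean_adaptive <- function(x, ...) {
  tryCatch(
    wdpa_clean(x, ...),
    error = function(e) {
      message("cleaning failed at the default precision; retrying with geometry_precision = 50000")
      wdpa_clean(x, geometry_precision = 50000, ...)
    }
  )
}

# example usage
# cleaned <- clean_adaptive(wdpa_fetch("Brazil"), retain_status = NULL)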

Does that help? Please let me know if you have any questions.

Jo-Schie commented 3 years ago

Hi @jeffreyhanson. Nice! Thank you for that clarification. I agree that the Brazilian polygons are a bit messy. Anyway, I will need to play around a bit with these parameters at some point, because I'm not sure yet what the resulting polygons look like, and if and how the simplification parameters affect spatial precision. I'm using the WDPA database to analyze my company's financial support for PAs in Latin America.

I also saw your other packages and a bit of your work with PAs. Looks quite interesting to me. Maybe we can have a chat some day e.g. on Skype or something?

Jo-Schie commented 3 years ago

Should I close this issue? It seems solved to me.

jeffreyhanson commented 3 years ago

Ok excellent! Yeah that's a great idea. Depending on what precision you need, you might be able to increase the simplification parameters to speed things up. Awesome - that sounds really interesting! Yeah, I'm always happy to chat about protected areas and conservation planning stuff. Could you please send me an email (listed on my GitHub profile page) and we can find a time that works?

jeffreyhanson commented 3 years ago

Thanks - yeah, I've just closed the issue. For future reference, please feel free to close any issues that you open if you think they have been addressed.