Closed marcosci closed 5 years ago
Great question, Marco! I will test it tomorrow.
Nice work @Nowosad :-)
New benchmark:
> x
Unit: milliseconds
expr min lq mean median uq max neval cld
r_stuff <- foo(podlasie_ccilc) 3607.12295 3907.67668 4058.7611 4013.53287 4169.4011 4826.0062 100 b
cpp <- foo2(podlasie_ccilc) 80.54347 83.02005 114.4029 94.30997 121.7541 361.9254 100 a
.... with rcpp_get_coocurrence_matrix2
. Will start implementing that soon :-)
Timings are good. The results are not (always). That is still a work in progress..
It should be fine now. @marcosci , please test it and let me know if we can merge it with master.
library(landscapemetrics)
library(raster)
#> Loading required package: sp
library(microbenchmark)
data("landscape")
data("podlasie_ccilc")
landscape <- pad_raster(landscape, pad_raster_value = -999,
pad_raster_cells = 1)
podlasie_ccilc <- pad_raster(podlasie_ccilc, pad_raster_value = -999,
pad_raster_cells = 1)
foo <- function(landscape){
adjacent_cells <- raster::adjacent(
landscape,
cells = seq_len(raster::ncell(landscape)),
directions = 4,
pairs = TRUE
)
tb <- table(landscape[adjacent_cells[, 1]],
landscape[adjacent_cells[, 2]])
tb <- unclass(tb)
return(tb)
}
foo2 <- function(landscape){
tb <- landscapemetrics:::rcpp_get_coocurrence_matrix(raster::as.matrix(landscape), directions = 4)
return(tb)
}
x <- microbenchmark(
r_stuff <- foo(podlasie_ccilc),
cpp <- foo2(podlasie_ccilc),
times = 10
)
x
#> Unit: milliseconds
#> expr min lq mean
#> r_stuff <- foo(podlasie_ccilc) 3930.70718 3998.36992 4183.68346
#> cpp <- foo2(podlasie_ccilc) 84.92322 90.93474 99.53851
#> median uq max neval
#> 4195.26915 4257.7961 4674.3710 10
#> 98.78565 108.2687 115.6758 10
all.equal(r_stuff, cpp, check.attributes = FALSE)
#> [1] TRUE
Created on 2018-07-09 by the reprex package (v0.2.0).
Hey @marcosci ,
quick question - do we need all of the files in src/
? E.g. do we use rem-union-merge.cpp
or borders.cpp
?
I thought so, for the connected labeling. But did not look at that for quite some while. We anyways have to fiddle with that, as one can only use the queen's case at the moment for the patch identification.
Are you asking because libs get that big while compiling the package?
Yes. Larger libs means also longer compilation times. I am just curious if we need all of it. If not - we should remove it.
Will dive into that tomorrow :) While doing that - what are we planing with your rcpp functions, internal or public?
I do not have a good answer to that question. It should depends on the users expectations - would they need any of the rcpp functions for a landscape ecology work?
There were for us - so I would assume for others too. I think they have potential to be building blocks for other metrics, and something in the line of:
get_composition <- function(landscape){
rcpp_get_composition_vector(raster::as.matrix(landscape))
}
... exported would be valuable. If you think there is a more appropriate place for them, tell me. :-)
So, Max and I played around with the connected labeling from SDMTools - turns out it is a bit faster and we could also tweak it to work 4 neighbors, instead of only 8.
They use it under a GPL license, I think if we put them as the author of the C function it should be fine to not have SDMTools as a dependency?
That is my understanding as well. See rmarkdown as an example - https://github.com/rstudio/rmarkdown/blob/master/DESCRIPTION (it uses outside code).
Edit: interesting disussion at https://github.com/ropensci/unconf17/issues/32
According to the figure @cboettig is showing it would be allowed - nice.
Then, after now restructuring ~200 functions to make them understand the argument directions
, we should write a proper acknowledgmenent in the C code :+1:
@marcosci, going back to your previous question - I agree that this function could be useful for the users. We just need to come with a better name... Is there any other rcpp function that could be useful externally?
Hm, I would export them all (with better names :yum: ) except for maybe:
?
extract_xxx isn't any better, isn't?
:thinking:
I would prefer an identical prefix to group all of them. How about using get_xxx
, so just remove the rcpp_
part for the name of the R wrapper.
Functions that may be interesting are:
Marco and Max, sorry for the delay with this. I'd been at a conference whole last week, and this week is also crazy. I will think about it and make it done in a week or so (or tomorrow).
Just started to implement the utility functions - nothing set in stone, if you want to change/add/remove/whatever something that is totally fine :+1:
Awesome work!
One small idea - how about using our lsm_
prefix here as well?
E.g. lsm_get_xx()
?
We had the idea to use lsm_ as a prefix only for functions that actually calculate a metric and not for auxiliary or helper functions
Ok, that's fine.
Hey @Nowosad,
we were playing around with your rcpp functions - really handy stuff in there :+1:
Just one weird thing with
rcpp_get_coocurrence_matrix
:.... for landscape it's way faster - but not for podlasie. Do you have anything in mind, why that could be?