Open njtierney opened 7 years ago
Currently, to get summary information about coverage, one has to do something like:
```r
library(maxcovr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

# subset to be the places with towers built on them.
york_selected <- york %>% filter(grade == "I")
york_unselected <- york %>% filter(grade != "I")

dat_dist <- york_selected %>% nearest(york_crime)

head(dat_dist)
#> # A tibble: 6 × 22
#>   to_id nearest_id   distance              category
#>   <dbl>      <dbl>      <dbl>                 <chr>
#> 1     1         66  165.85752 anti-social-behaviour
#> 2     2         48 2086.76298 anti-social-behaviour
#> 3     3         55   68.23116 anti-social-behaviour
#> 4     4         11  286.34132 anti-social-behaviour
#> 5     5         25  535.78713 anti-social-behaviour
#> 6     6         20  159.90888 anti-social-behaviour
#> # ... with 18 more variables: persistent_id <chr>, date <chr>,
#> #   lat_to <dbl>, long_to <dbl>, street_id <chr>, street_name <chr>,
#> #   context <chr>, id <chr>, location_type <chr>, location_subtype <chr>,
#> #   outcome_status <chr>, long_nearest <dbl>, lat_nearest <dbl>,
#> #   object_id <int>, desig_id <chr>, pref_ref <int>, name <chr>,
#> #   grade <chr>

dat_dist %>%
  mutate(is_covered = distance <= 100) %>%
  summarise_coverage()
#> # A tibble: 1 × 7
#>   distance_within n_cov n_not_cov   pct_cov pct_not_cov dist_avg  dist_sd
#>             <dbl> <int>     <int>     <dbl>       <dbl>    <dbl>    <dbl>
#> 1             100   339      1475 0.1868798   0.8131202 1400.192 1596.676
```
A function like `coverage` (or something slightly more descriptive) should behave like `nearest` and return the coverage.
```r
coverage <- function(nearest_df, to_df, distance_cutoff = 100){
  nearest_df %>%
    nearest(to_df) %>%
    dplyr::mutate(is_covered = distance <= distance_cutoff) %>%
    summarise_coverage()
}

york_selected %>% coverage(york_crime)
#> # A tibble: 1 × 7
#>   distance_within n_cov n_not_cov   pct_cov pct_not_cov dist_avg  dist_sd
#>             <dbl> <int>     <int>     <dbl>       <dbl>    <dbl>    <dbl>
#> 1             100   339      1475 0.1868798   0.8131202 1400.192 1596.676

york_crime %>% coverage(york_selected)
#> # A tibble: 1 × 7
#>   distance_within n_cov n_not_cov   pct_cov pct_not_cov dist_avg  dist_sd
#>             <dbl> <int>     <int>     <dbl>       <dbl>    <dbl>    <dbl>
#> 1             100    54        17 0.7605634   0.2394366 119.9247 247.2918
```
I'll probably need to do some refactoring on `summarise_coverage()` at some point soon. It would be good if it worked properly with `group_by`.
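As a rough illustration of what "working with `group_by`" could look like, here is a minimal sketch. It assumes the input already has `distance` and `is_covered` columns (as produced in the examples above) and relies on `dplyr::summarise()` respecting existing groups. The function name `summarise_coverage_grouped()` is hypothetical and is not the current maxcovr implementation; the toy data stands in for the output of `nearest()`.

```r
library(dplyr)

# Hypothetical helper, NOT part of maxcovr: dplyr::summarise() respects any
# grouping already on `df`, so this returns one summary row per group
# (or a single row when `df` is ungrouped).
summarise_coverage_grouped <- function(df, distance_cutoff = 100) {
  df %>%
    summarise(
      distance_within = distance_cutoff,
      n_cov           = sum(is_covered),
      n_not_cov       = sum(!is_covered),
      pct_cov         = mean(is_covered),
      pct_not_cov     = mean(!is_covered),
      dist_avg        = mean(distance),
      dist_sd         = sd(distance)
    )
}

# Toy data (made up for illustration) standing in for the output of nearest()
toy <- data.frame(
  category = c("a", "a", "b", "b"),
  distance = c(50, 150, 80, 90)
) %>%
  mutate(is_covered = distance <= 100)

# One row of coverage statistics per category
toy %>%
  group_by(category) %>%
  summarise_coverage_grouped()
```

The column names mirror the existing `summarise_coverage()` output so that grouped and ungrouped results would have the same shape apart from the grouping columns.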
Add a print method for this that shows what the results are doing.
Specifically, it should state something like "Coverage df1 on df2", so that it clearly explains what the method did.
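One possible shape for this, sketched with base R S3 only: the result carries the names of the two inputs as attributes, and a print method emits a header line before falling through to the usual data frame printing. The class name `maxcovr_coverage`, the constructor `new_coverage()`, and the attribute names are all hypothetical, chosen for illustration; the counts in the usage example are copied from the output shown earlier in this issue.

```r
# Hypothetical constructor, NOT part of maxcovr: tag a coverage summary with
# the names of the data frames it was computed from.
new_coverage <- function(summary_df, from_name, to_name) {
  structure(
    summary_df,
    from_name = from_name,
    to_name   = to_name,
    class     = c("maxcovr_coverage", class(summary_df))
  )
}

# Print a one-line description, then defer to the next print method
# (print.data.frame here) for the table itself.
print.maxcovr_coverage <- function(x, ...) {
  cat(sprintf("Coverage of %s on %s\n",
              attr(x, "from_name"), attr(x, "to_name")))
  NextMethod()
  invisible(x)
}

# Usage: values copied from the york_selected / york_crime example above
cov_summary <- new_coverage(
  data.frame(distance_within = 100, n_cov = 339L, n_not_cov = 1475L),
  from_name = "york_selected",
  to_name   = "york_crime"
)
print(cov_summary)
```

Inside a real `coverage()` function, the input names could be captured with something like `deparse(substitute(nearest_df))`, though that tends to give `.` when the function is called through a `%>%` pipe, so an explicit argument may be more reliable.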