walkerke / mapboxapi

R interface to Mapbox web services
https://walker-data.com/mapboxapi/

mb_matrix when coord_size >25 (but origin & destination both <25) #42

Closed HunterRatliff1 closed 1 year ago

HunterRatliff1 commented 1 year ago

I'm a huge fan of your work, and I'm very impressed with how well all of your packages run!

I ran into a small bug with the mb_matrix function. It only seems to occur when coord_size (origins plus destinations) is over 25 but neither the origins nor the destinations number more than 25 on their own. There don't seem to be any issues when coord_size is less than 25, and it behaves perfectly if either origin_size or dest_size is >25. A reproducible example is below:

Reprex

Set up example data from North Texas

library(sf)
library(tigris)
library(mapboxapi)
library(dplyr)
options(tigris_class = "sf")

# FIPS codes for 30 counties in north texas
north_TX <- c("48035", "48193", "48139", "48379", "48349", "48467", "48217", 
              "48439", "48237", "48143", "48093", "48363", "48251", "48077", 
              "48309", "48497", "48085", "48147", "48397", "48257", "48113", 
              "48425", "48097", "48181", "48231", "48367", "48221", "48337", 
              "48429", "48121")

tx_counties <- tigris::counties(state = "tx", cb = TRUE) %>%
  filter(GEOID %in% north_TX) %>% # Limit to those counties above

  # Use centroid distances (mapboxapi function does this by default, but
  # making it explicit reduces the messages and makes the output cleaner)
  st_centroid()

This is just a wrapper function to demonstrate how the mapboxapi::mb_matrix function behaves with differing numbers of origins & destinations:

wrap_fxn <- function(data, n_orgs=20, n_dest=5, allow_large=FALSE){

  # Print the number of  origins & destinations
  cli::cli_alert_info(c("Origins: {.strong {n_orgs}}   ", 
                        "Destinations: {.strong {n_dest}} ",
                        "({.field coord_size} = {n_orgs+n_dest})"))

  mapboxapi::mb_matrix(origins      = head(data, n=n_orgs),
                       destinations = tail(data, n=n_dest), 
                       allow_large_matrix = allow_large)
}

These are some examples of cases that don't work. In my own (non-reproducible) analysis, every case I've run into issues with had coord_size > 25 while neither the origins nor the destinations exceeded 25 on their own. I'm happy to share my data, which would save you from expensive API calls while troubleshooting.

### These don't work 
tx_counties %>% wrap_fxn(n_orgs = 21, n_dest = 5)
#> ℹ Origins: 21   Destinations: 5 (coord_size = 26)
#> Splitting your matrix request into smaller chunks and re-assembling the result.
#> [1] "Too many coordinates; maximum number of coordinates is 25."
#> Error: Too many coordinates; maximum number of coordinates is 25.

tx_counties %>% wrap_fxn(n_orgs = 23, n_dest = 6)
#> ℹ Origins: 23   Destinations: 6 (coord_size = 29)
#> Splitting your matrix request into smaller chunks and re-assembling the result.
#> [1] "Too many coordinates; maximum number of coordinates is 25."
#> Error: Too many coordinates; maximum number of coordinates is 25.

tx_counties %>% wrap_fxn(n_orgs = 23, n_dest = 11)
#> ℹ Origins: 23   Destinations: 11 (coord_size = 34)
#> Splitting your matrix request into smaller chunks and re-assembling the result.
#> [1] "Too many coordinates; maximum number of coordinates is 25."
#> Error: Too many coordinates; maximum number of coordinates is 25.

tx_counties %>% wrap_fxn(n_orgs = 20, n_dest = 6)
#> ℹ Origins: 20   Destinations: 6 (coord_size = 26)
#> Splitting your matrix request into smaller chunks and re-assembling the result.
#> [1] "Too many coordinates; maximum number of coordinates is 25."
#> Error: Too many coordinates; maximum number of coordinates is 25.

For what it's worth, nothing changes if you set allow_large_matrix = TRUE. Separately, thank you so much for adding the allow_large_matrix feature; it's much appreciated!

# Doesn't change behavior even if allowing for large matrices
tx_counties %>% wrap_fxn(n_orgs = 21, n_dest = 5, allow_large = TRUE)
#> ℹ Origins: 21   Destinations: 5 (coord_size = 26)
#> Splitting your matrix request into smaller chunks and re-assembling the result.
#> [1] "Too many coordinates; maximum number of coordinates is 25."
#> Error: Too many coordinates; maximum number of coordinates is 25.
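
One possible stopgap for the failing cases is to split the origins by hand so that each request stays within the 25-coordinate limit. The sketch below is only an illustration: `chunk_origin_indices()` is a hypothetical helper, not part of mapboxapi, and it assumes the destinations alone fit in a single request (fewer than 25 of them).

```r
# Hypothetical helper (not part of mapboxapi): split origin row indices into
# chunks so that each chunk plus all destinations stays within the API limit.
chunk_origin_indices <- function(n_orgs, n_dest, limit = 25) {
  per_chunk <- limit - n_dest  # origins that fit alongside the destinations
  stopifnot(n_orgs >= 1, per_chunk >= 1)
  idx <- seq_len(n_orgs)
  # Group consecutive indices into blocks of at most `per_chunk`
  unname(split(idx, ceiling(idx / per_chunk)))
}

# 21 origins and 5 destinations: the first failing case above
chunks <- chunk_origin_indices(n_orgs = 21, n_dest = 5)
lengths(chunks)

# Usage sketch (needs a Mapbox token, so it is left commented out):
# dests  <- tail(tx_counties, n = 5)
# pieces <- lapply(chunks, function(i) {
#   mapboxapi::mb_matrix(origins = tx_counties[i, ], destinations = dests)
# })
# result <- do.call(rbind, pieces)  # origins are rows, so stack the pieces
```

Each chunk is sent with the full set of destinations, so stacking the pieces by row should give the same shape as a single origins-by-destinations matrix.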

As mentioned above, the other situations work beautifully:

### Works as expected
tx_counties %>% wrap_fxn(n_orgs = 2, n_dest = 2)
#> ℹ Origins: 2   Destinations: 2 (coord_size = 4)
#>          [,1]     [,2]
#> [1,] 136.9000 122.3417
#> [2,] 128.2617 141.3400

tx_counties %>% wrap_fxn(n_orgs = 26, n_dest = 1)
#> ℹ Origins: 26   Destinations: 1 (coord_size = 27)
#> Splitting your matrix request into smaller chunks and re-assembling the result.
#>            [,1]
#>  [1,] 122.34167
#>  [2,] 141.34000
#>  [3,]  71.34167
#>  [4,] 106.35167
#>  [5,]  96.06167
#>  [6,]  94.03333
#>  [7,]  91.24000
#>  [8,]  39.23000
#>  [9,]  74.01833
#> [10,] 108.16000
#> [11,] 147.31333
#> [12,] 108.65667
#> [13,]  68.54500
#> [14,]  84.33167
#> [15,] 123.49000
#> [16,]  38.55500
#> [17,]  51.11833
#> [18,]  90.91500
#> [19,]  66.07333
#> [20,]  74.53333
#> [21,]  46.03000
#> [22,]  92.71167
#> [23,]  37.31000
#> [24,]  66.92833
#> [25,]  82.86167
#> [26,]  68.54333

Created on 2023-09-02 with reprex v2.0.2

hmeleiro commented 1 year ago

I'm having the same issue, but it seems to be a Mapbox API limit.

Maximum 25 input coordinates per request

If you need more than that, maybe you can use DBSCAN to cluster the origin or destination points so that you meet the API limits, as explained here.
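
To make the clustering idea concrete, here is a rough sketch. It uses base R's `kmeans()` as a stand-in for DBSCAN, because it ships with R and lets you fix the number of clusters directly, which makes the 25-coordinate limit easy to meet. The points are made up for illustration, and averaging longitude/latitude treats degrees as planar, so treat this as an approximation rather than a recommended method.

```r
set.seed(42)  # make the illustration reproducible

# Made-up lon/lat points standing in for 40 origins (not real data)
pts <- cbind(lon = runif(40, -98.5, -95.5),
             lat = runif(40, 31.5, 34))

n_dest <- 5
k <- 25 - n_dest  # representative origins that fit next to the destinations

# k-means (swapped in for DBSCAN) reduces the 40 points to k cluster centres
km <- kmeans(pts, centers = k, nstart = 5)
reps <- km$centers        # k x 2 matrix of cluster centres (lon, lat)
membership <- km$cluster  # which representative each original point maps to

# Total coordinates a single request would then carry
nrow(reps) + n_dest
```

The travel times computed for the representative points would then have to be mapped back to the original points through `membership`, at the cost of some precision.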

walkerke commented 1 year ago

Thanks for the heads up and detailed reprex! This is now fixed.