program-- / fipio

An R :package: for lightweight FIPS code information retrieval
https://fipio.justinsingh.me
Other
14 stars 1 forks source link

Error when assigning coordinates to five-number FIP code #15

Closed MarisaBSzubryt closed 7 months ago

MarisaBSzubryt commented 8 months ago

Hello,

I have been having some difficulty getting the fipio package to work properly for some (but not all of my datasets). I'm trying to take a CSV of coordinates (latitude, longitude) into five-number FIPS codes. I then would split those five number codes into the first two numbers for the state and last three numbers for the county/parish, then converting them to state/county names for automatic county-level map production. In some of the datasets I use, I get this error when running the following code:

fipio::coords_to_fips(x = data.frame(points_longitude[i], points_latitude[i], coords = c("X", "Y"))) [1] Error in order(ret_index) : unimplemented type 'list' in 'orderVector1'

It only happens for some CSV files or dataframes, and it doesn't seem to be a case of illegal characters in the coordinates. If I enter in the coordinates manually, it seems fine with that (usually), but bringing in a CSV downloaded from somewhere else usually (but not always) causes this error to pop up. I've spoken with one of my professors who's very familiar with R, and he couldn't figure out what's going on. The latitude and longitude are converted to vectors or numeric format prior to running this line of code (see below), so I think that sometime during the fipio::coord_to_fips() function, something is converted to a list (and should be converted back to a vector but isn't always the case). Why it only happens sometimes I'm not sure, and I wasn't sure where else to ask this sort of question.

points_longitude <- points$longitude; points_longitude <- as.numeric(points_longitude); points_longitude; points_latitude <- points$latitude; points_latitude <- as.numeric(points_latitude); points_latitude

Any thoughts on how to alleviate the issue would be greatly appreciated! Files (one works, one doesn't work) are below, along with the rest of the code, Marisa

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

coordinates.csv example_coordinates_simple.csv

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

# Code written by Marisa Szubryt 2024/01/07, modified from links below

# https://cran.r-project.org/web/packages/fipio/readme/README.html
# https://rdrr.io/cran/fipio/f/README.md
# https://www.reddit.com/r/Rlanguage/comments/se5w7b/how_to_replace_comma_that_is_not_followed_by_a/ 
# https://search.r-project.org/CRAN/refmans/totalcensus/html/convert_fips_to_names.html 
# https://search.r-project.org/CRAN/refmans/usdata/html/abbr2state.html
# https://stackoverflow.com/questions/13093931/remove-last-word-from-string 
# https://stackoverflow.com/questions/72408598/r-extract-first-two-characters-from-a-column-in-a-dataframe 
# https://stackoverflow.com/questions/18115550/combine-two-or-more-columns-in-a-dataframe-into-a-new-column-with-a-new-name 

# Set working directory and clear environment
setwd("/Users/mbs/Desktop/Heterotheca/Heterotheca-RCode/Coordinate_county_maps")
rm(list=ls()); graphics.off(); gc()

# Load in map-making packages for coordinate to county conversion
install.packages('dyplr')
install.packages('fipio')
install.packages('stringr')
install.packages('totalcensus')
install.packages('usdata')
library(dplyr)
library(fipio)
library(stringr)
library(totalcensus)
library(usdata)

# Read in CSV with coordinates
# The below CSV works
points <- read.csv('example_coordinates_simple.csv'); attach(points); points; names(points)

# The below CSV doesn't work 
# points <- read.csv('coordinates.csv'); attach(points); points; names(points)

# Add in necessary columns
points$fips<-'';points$state_fips<-'';points$county_fips<-'';points$state_name<-'';points$county_name<-''; points$SC_name<-''

# Extract latitude and longitude
points_longitude <- points$longitude; points_longitude <- as.numeric(points_longitude); points_longitude
points_latitude <- points$latitude; points_latitude <- as.numeric(points_latitude); points_latitude

# Add FIPS codes (2 numbers for state, 3 numbers for county)
fips_codes <- c()
for (i in c(1:length(points_longitude))) {
  fips_iteration <- fipio::coords_to_fips(x = data.frame(points_longitude[i], points_latitude[i], coords = c("X", "Y")))
  fips_codes <- c(fips_codes, fips_iteration)}
fips_codes <- fips_codes [!is.na(fips_codes)]; fips_codes
points$fips <- fips_codes; points$fips <- as.character(points$fips); points$fips; points

# Split 5 number FIPS codes into state and county fip codes
points$state_fips<- substr(points$fips,1,2); fips_states <- points$state_fips; fips_states <- as.character(fips_states); fips_states
points$county_fips<- substr(points$fips,3,5); fips_counties <- points$county_fips; fips_counties <- as.character(fips_counties); fips_counties

# Generate state and county names from FIPS codes 
states_names <- convert_fips_to_names(fips_states); states_names
counties_names <- convert_fips_to_names(fips_counties, states=states_names, geo_header='COUNTY'); counties_names

# Add state/county names into dataframe
points$state_name <- states_names; points$county_name <- counties_names; points

# Convert state abbreviations to full name
points$state_name <- abbr2state(points$state_name); points

# Remove the 'County' portion of the counties_names vector
points <- points %>% mutate(county_name=word(county_name, 1, -2)); points

# Convert state and county to lower case
points$state_name <- tolower(points$state_name); points$county_name <- tolower(points$county_name); points

# Convert the state and county names into a format usable by the county-highlighting code
points$SC_name <- paste(points$state_name, ",", points$county_name); points
points$SC_name <- gsub(" , (?! )", ",", points$SC_name, perl=TRUE); points

# Extract out the list of "state,county" to insert into the county-highlighting code
plantCounties <- points$SC_name; plantCounties

# Export the amended dataframe for reference
write.csv(points, "example_coordinates_simple_amended.csv", row.names=FALSE)

###########################################################################

# Code written by Patrick Alexander (2024/01/02), modified by Marisa Szubryt

# Load in map-making packages
install.packages('ggplot2')
install.packages('maps')
install.packages('sf')
install.packages('tidyverse')
library(ggplot2)
library(sf)
library(maps)
library(tidyverse)

# Run code to automatically set up map-making
states <- st_as_sf(maps::map("state", plot = FALSE, fill = TRUE))
counties <- st_as_sf(maps::map("county", plot = FALSE, fill = TRUE))
plantStates <- substring(plantCounties, 1, regexpr(",", plantCounties) - 1) %>% unique()
counties <- counties %>% 
  mutate(state=substring(ID, 1, regexpr(',', ID)-1)) %>%
  filter(state %in% plantStates)
plantStates <- states %>% filter(ID %in% plantStates)
plantCounties <- counties %>% filter(ID %in% plantCounties)
bbox <- st_bbox(plantStates)
zone <- floor(((mean(c(bbox$xmin,bbox$xmax))+180)/6) +1)
projBbox <- st_bbox(st_transform(plantStates, crs = paste('+proj=utm +zone=',zone,' +datum=WGS84', sep='')))

# Can change thickness/color of state/county bounaries and fill color of selected counties
ggplot(NULL) + geom_sf(data=counties, lwd=0.33, color='black', fill='white') +
  geom_sf(data=plantCounties, lwd=0.33, fill='steelblue') +
  geom_sf(data = states, lwd=0.75, color='black', fill='transparent') +
  coord_sf(crs=paste('+proj=utm +zone=',zone,' +datum=WGS84',sep=''),
           xlim=c(projBbox$xmin, projBbox$xmax), ylim=c(projBbox$ymin, projBbox$ymax), expand=TRUE) +
  theme_void()
program-- commented 8 months ago

Hi @MarisaBSzubryt, thanks for bringing this up. Based on that error message, yeah it looks like ret_index in the coord_to_fips.numeric method has a list column that is causing the error when order is called on it. This is most likely an issue with my internal ray casting algorithm... it's not super well-written haha.

I'm currently working on a rewrite of the internal fipio components to support more FIPS information (i.e. tracts, ZCTAs, etc.) so that might delay a fix, but I will try to get something out soon.


For (my) reference, the relevant source code causing the issue is here:

https://github.com/program--/fipio/blob/21f9fc97e2d4e8aa1a1fb997e802254730f4251f/R/geolocate.R#L90-L103

My initial, rough guess is that one of these two variables is causing the issue:

https://github.com/program--/fipio/blob/21f9fc97e2d4e8aa1a1fb997e802254730f4251f/R/internals.R#L28-L43

MarisaBSzubryt commented 8 months ago

Hi Justin,

Thanks for getting back to me; I really appreciate your help with this and the R package you've created! Do you know when the updates might be published? No rush on my end or anything, was just curious if you had an estimated time when it'll be revised. I believe that I sent the dataset which does work (along with the one that doesn't) - let me know if it would be helpful for me to re-send either/both if you need some test data.

Have a nice weekend! Marisa

On Mon, Jan 15, 2024 at 2:44 PM Justin Singh-M. - NOAA < @.***> wrote:

Hi @MarisaBSzubryt https://github.com/MarisaBSzubryt, thanks for bringing this up. Based on that error message, yeah it looks like ret_index in the coord_to_fips.numeric method has a list column that is causing the error when order is called on it. This is most likely an issue with my internal ray casting algorithm... it's not super well-written haha.

I'm currently working on a rewrite of the internal fipio components to support more FIPS information (i.e. tracts, ZCTAs, etc.) so that might delay a fix, but I will try to get something out soon.

For (my) reference, the relevant source code causing the issue is here:

https://github.com/program--/fipio/blob/21f9fc97e2d4e8aa1a1fb997e802254730f4251f/R/geolocate.R#L90-L103

— Reply to this email directly, view it on GitHub https://github.com/program--/fipio/issues/15#issuecomment-1892745466, or unsubscribe https://github.com/notifications/unsubscribe-auth/BFLT2IXUIZVUURCI3NOM3ZTYOWIJPAVCNFSM6AAAAABB3X53Z6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOJSG42DKNBWGY . You are receiving this because you were mentioned.Message ID: @.***>

program-- commented 7 months ago

Hi @MarisaBSzubryt, I'm going to put out a patch for this, since I think I've figured it out. Just need to do a bit more logic work -- I'd say expect a new version up on CRAN by Wednesday at the latest.


The reason for the error is that in fipio::coords_to_fips.numeric, the general idea is to:

  1. Get all county FIPS and geometry
  2. Filter counties based on whether coordinates from x/y are within the bounding box of each county geometry
  3. Perform ray casting on each county geometry to find which coordinates intersect that geometry (so, now we have a list of length(lookup_geometry) with indices to x/y that associate a FIPS to the coordinate, where some may be NA)

Step (3) leaves us with two variables, ret_value and ret_index -- ret_value is a character(length(lookup_geometry)) with FIPS codes; ret_index is a list with the indices as mentioned, i.e.

r$> ret_value
[1] "04003" "04019" "04023" "35013" "35029" "48141" "48243" "48377"

r$> ret_index
[[1]]
[1]  7  9 12 13 15

[[2]]
[1] 10

[[3]]
[1] 8

[[4]]
[1] 16

[[5]]
[1] 14

[[6]]
[1] 11

[[7]]
[1] 2 3 4 5 6

[[8]]
[1] 1

I'm not sure exactly what changed, but I'm going to guess some R version modified behavior somewhere that caused either .intersects (an internal function for fipio) or sapply to result a list in this case. (or more likely, I wrote some code that shouldn't have been written how it was haha).

Regardless, I think I have a fix. I'll also comment once it's up on GitHub so you can install it before it's on CRAN if needed.

program-- commented 7 months ago

@MarisaBSzubryt I was able to get the change pushed and up to CRAN today actually. fipio v1.1.2 should be up on CRAN now so the issue should be fixed now. Let me know if it's good to close this issue, thanks!

MarisaBSzubryt commented 7 months ago

Justin, thank you so much! The program works perfectly now. The fipio command works like a charm now

program-- commented 7 months ago

@MarisaBSzubryt Glad to hear, and no problem! Thanks again for bringing up the issue. If any other issues come up, feel free to let me know. 😃