mdsumner / spdplyr

Data Manipulation Verbs for Spatial Classes
http://mdsumner.github.io/spdplyr/

Support additional dplyr functions (*_join, distinct etc) #11

Open jlegewie opened 7 years ago

jlegewie commented 7 years ago

I came across a couple of other dplyr functions that don't work. This includes most of the *_join functions (I think only left_join and inner_join work, but not semi_join, anti_join, etc.). distinct also doesn't work. Below are some examples.

library("maptools")
library("spdplyr")

data(wrld_simpl)
DF <- wrld_simpl@data %>% transmute(NAME, a = 1)

wrld_simpl %>% left_join(DF, by = "NAME")
wrld_simpl %>% inner_join(DF, by = "NAME")
wrld_simpl %>% right_join(DF, by = "NAME")
wrld_simpl %>% full_join(DF, by = "NAME")
wrld_simpl %>% semi_join(DF, by = "NAME")
wrld_simpl %>% anti_join(DF, by = "NAME")

wrld_simpl %>% distinct(REGION)

mdsumner commented 7 years ago

I can't see how to make distinct work without .keep_all being TRUE. Also, sorry for the massive delay here.

I'm cheating for now by adding a row number and assuming that .keep_all = TRUE preserves it as an index for determining which Spatial* parts to keep. If you have any ideas, I'll pursue them.
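
The row-number trick described above can be sketched roughly as follows. This is only an illustration on a plain data frame, not the package's actual code: `attrs`, `geoms`, and `row_idx` are hypothetical names, with `geoms` standing in for the geometry parts of a Spatial* object.

```r
library(dplyr)

# Hypothetical stand-in for the attribute table of a Spatial* object.
attrs <- data.frame(
  NAME   = c("A", "B", "C", "D"),
  REGION = c(2, 19, 2, 142)
)
# Hypothetical stand-in for the geometries: one element per row of `attrs`.
geoms <- list("geomA", "geomB", "geomC", "geomD")

# Add a temporary row index and let distinct() carry it through.
# This relies on .keep_all = TRUE keeping the first row of each group.
kept <- attrs %>%
  mutate(row_idx = row_number()) %>%
  distinct(REGION, .keep_all = TRUE)

# The surviving index says which geometries to keep.
geoms_kept <- geoms[kept$row_idx]

# Drop the helper column before returning the attributes.
attrs_kept <- select(kept, -row_idx)
```

The index only works because `.keep_all = TRUE` retains every column, including the helper; without it, `distinct(REGION)` would return just the `REGION` column and the link back to the geometries would be lost.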

I think we can make any of the joins work by keeping a silent row index, but I need to also see how other packages approach this.
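
A minimal sketch of the silent row index idea for joins, again on plain data frames rather than real Spatial* objects. The names `attrs`, `geoms`, `other`, and `row_idx` are made up for illustration, and this is not necessarily how the package ends up doing it.

```r
library(dplyr)

# Hypothetical attribute table and matching geometries (one per row).
attrs <- data.frame(NAME = c("A", "B", "C"), POP = c(10, 20, 30))
geoms <- list("geomA", "geomB", "geomC")

# Hypothetical table to join against; "Z" has no match in `attrs`.
other <- data.frame(NAME = c("B", "C", "Z"), a = 1)

# Attach the index before joining so it survives whatever the join does.
indexed <- mutate(attrs, row_idx = row_number())

semi <- semi_join(indexed, other, by = "NAME")
anti <- anti_join(indexed, other, by = "NAME")
full <- full_join(indexed, other, by = "NAME")

# Filtering joins: the index picks out the geometries that remain.
geoms_semi <- geoms[semi$row_idx]
geoms_anti <- geoms[anti$row_idx]

# right_join/full_join can add rows that exist only in `other`.
# Those rows have an NA index, i.e. no geometry to attach.
no_geometry <- is.na(full$row_idx)
```

The `NA` index is the awkward case: a spatial result needs a geometry for every row, so rows flagged by `no_geometry` would have to be dropped or given an empty geometry.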

mdsumner commented 7 years ago

I've decided to use a DIY method for distinct. It will give the same answers as duplicated(), i.e. mostly the same as distinct, apart from some cases of numeric data.
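
For illustration, a duplicated()-based distinct might look something like the sketch below. It uses base R only; `attrs` and `geoms` are hypothetical stand-ins for the attribute table and geometries, and the real implementation may differ.

```r
# Hypothetical attribute table and matching geometries (one per row).
attrs <- data.frame(
  NAME   = c("A", "B", "C", "D"),
  REGION = c(2, 19, 2, 142)
)
geoms <- list("geomA", "geomB", "geomC", "geomD")

# duplicated() flags every row whose REGION value has already been seen,
# so negating it keeps the first occurrence of each value.
keep <- !duplicated(attrs[, "REGION", drop = FALSE])

# The same logical vector subsets attributes and geometries together.
attrs_kept <- attrs[keep, , drop = FALSE]
geoms_kept <- geoms[keep]
```

Because the logical vector is computed once and applied to both pieces, the attributes and geometries stay aligned without needing a helper column.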