Closed — javierluraschi closed this 5 years ago
@javierluraschi thanks a lot, I will try my best to follow the guideline.
Currently, geospark with Spark 2.4 is under development; I will track the latest edition once the geospark Scala master is updated.
I updated geospark to 1.2.0, which fixes lots of bugs and adds some new SQL functions. Here are the release notes: https://github.com/harryprince/geospark/releases,
so this PR might need to update the geospark Scala package version.
@javierluraschi in this PR, the dplyr-related example does not work, so I added another example; see more discussion in #9.
library(dplyr)

# Parse the WKT geometry columns into geometry objects
polygons_wkt <- mutate(polygons_wkt, y = st_geomfromwkt(geom))
points_wkt <- mutate(points_wkt, x = st_geomfromwkt(geom))

# Cross join via a dummy key, then keep only the pairs where
# the polygon contains the point
sc_res <- full_join(
  polygons_wkt %>% mutate(dummy = TRUE) %>% compute(),
  points_wkt %>% mutate(dummy = TRUE) %>% compute(),
  by = "dummy"
) %>%
  filter(sql("st_contains(y, x)")) %>%
  group_by(area, state) %>%
  summarise(cnt = n())

sc_res %>% head()
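The dummy-key full join above is a workaround to get a cross join out of a dplyr pipeline against Spark: joining on a column that is TRUE everywhere pairs every row of one table with every row of the other, and the st_contains filter then prunes the pairs. The same pattern can be sketched on small local data frames; the tables and column names below are illustrative stand-ins, not the actual geospark data:

```r
library(dplyr)

# Two small local tables standing in for the Spark tables above
polygons <- tibble::tibble(area = c("A", "B"))
points   <- tibble::tibble(state = c("x", "y", "z"))

# Dummy-key full join behaves as a cross join:
# every polygon row is paired with every point row
pairs <- full_join(
  polygons %>% mutate(dummy = TRUE),
  points   %>% mutate(dummy = TRUE),
  by = "dummy"
) %>%
  select(-dummy)

nrow(pairs)  # 2 polygons x 3 points = 6 rows
```

In the Spark version, a spatial predicate such as st_contains replaces the local filtering step, and compute() materializes each side before the join.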
@harryprince this is such a great extension! Nice job putting it together! I couldn't help but play with it and send some improvements for you to consider: I added the dplyr example to the README, moved the data files to a data folder, and moved the config/benchmarks into their own section. We are also adding your extension to spark.rstudio.com; you could consider publishing to CRAN at some point, it's already quite useful!