danzafar / tidyspark

tidyspark: a tidyverse implementation of SparkR built for simplicity, elegance, and ease of use.

Feature/if else #54

Closed estern95 closed 4 years ago

estern95 commented 4 years ago

if_else MVP; closes #39.

estern95 commented 4 years ago

Made some more edits. Got case_when working with docs and testing. This closes #20.

danzafar commented 4 years ago

So we are pretty close here, and I might just close it. All the tests are passing except those that deal with n(), and those fail ONLY when I run all the tests at once. If I run them individually or with devtools::test_file they run fine, so I think the issue is probably not related to this feature.

danzafar commented 4 years ago

The failing tests were no fault of this feature addition, but I went ahead and did some housekeeping to get everything passing and more streamlined. Will close this.

danzafar commented 4 years ago

OK, I decided to go ahead and fix the error handling using the spark_class function. @estern95, I also discovered that this function does not work for aggregates, e.g. a condition like max(<some_col>) < <some_value>. We should probably fix that and then add some more tests, and then we should be good to go.

danzafar commented 4 years ago

OK I fixed error handling and made the tests bomb. As of now, if_else and case_when do not handle aggregates, meaning an expression like this will not work:

spark_tbl(iris) %>%
  mutate(z = if_else(max(Petal_Width) > 3, TRUE, FALSE)) %>%
  collect()

But I have gone to great lengths so that the error is informative and provides a nice workaround.
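
For anyone reading along, here is a rough sketch of the kind of two-step rewrite that sidesteps an aggregate inside if_else. The thread does not show the actual workaround the error message suggests, so this is only an assumption about its general shape. It uses plain dplyr on a local data frame rather than a spark_tbl, and the column name max_pw is made up for illustration.

```r
library(dplyr)

# Local stand-in for spark_tbl(iris). Spark column names use underscores,
# so Petal.Width is renamed to Petal_Width to mirror the example above.
df <- iris %>% rename(Petal_Width = Petal.Width)

# Direct form: the aggregate sits inside if_else(). This works in local
# dplyr; per the thread it is the form tidyspark does not yet handle.
direct <- df %>%
  mutate(z = if_else(max(Petal_Width) > 3, TRUE, FALSE))

# Two-step form: materialize the aggregate as its own column first, then
# compare against that column. `max_pw` is a hypothetical helper column.
# ASSUMPTION: this is not confirmed to be the workaround tidyspark suggests.
two_step <- df %>%
  mutate(max_pw = max(Petal_Width)) %>%
  mutate(z = if_else(max_pw > 3, TRUE, FALSE)) %>%
  select(-max_pw)

# Both forms should yield the same z column on local data.
identical(direct$z, two_step$z)
```

Whether the first mutate step (an aggregate on its own) runs as-is against a spark_tbl is not something this thread confirms, so treat the sketch as a description of the intended result, not of tidyspark's behavior.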