Closed jfdesomzee closed 2 years ago
This is actually a quirk of data.table
, and not something I can fix unfortunately. I would just wrap your output in as.double()
to avoid this.
Here's a stack overflow example.
However it looks like this might be fixed in the development version of data.table
as this no longer errors on my machine. I couldn't find a NEWS bullet on their github repo, but it might be worth downloading that and see if it works for you.
library(tidytable, warn.conflicts = FALSE)
#> As of tidytable v0.9.0 dotless versions of functions are exported. You can now use `arrange()`/`mutate()`/etc. directly.
iris %>%
as_tidytable() %>%
mutate(SL=ifelse(Species=="setosa",NA,Sepal.Length)) %>%
summarise(SP_Large=sum(base::ifelse(Sepal.Length > 5, Sepal.Length, 0L)),
.by=Species)
#> # A tidytable: 3 × 2
#> Species SP_Large
#> <fct> <dbl>
#> 1 setosa 117.
#> 2 versicolor 282.
#> 3 virginica 324.
Well this would actually be worst for me as the value of SP_Large is not NA for setosa. In case the condition of the ifelse is NA it should not return the TRUE expression.
The example I gave is different than the one you gave because I don't have your default()
function.
The values should return correctly when running your example.
Oh sorry my mistake. The default function is actually the fix and should not have been in the example. With the default function tidytable and dplyr return both the same. The issue is without the default function; In that case dplyr returns NA while tidytable returns the error.
Ah gotcha.
So I can reproduce your error now:
library(tidytable, warn.conflicts = FALSE)
#> As of tidytable v0.9.0 dotless versions of functions are exported. You can now use `arrange()`/`mutate()`/etc. directly.
iris %>%
as_tidytable() %>%
mutate(SL=ifelse(Species=="setosa",NA,Sepal.Length)) %>%
summarise(SP_Large=sum(base::ifelse(SL>5,Sepal.Length,0)),
.by=Species)
#> Error in `[.data.table`(~.df, , .(SP_Large = sum(base::ifelse(SL > 5, : Column 1 of result for group 2 is type 'double' but expecting type 'integer'. Column types must be consistent for each group.
and this would be the recommended fix:
library(tidytable, warn.conflicts = FALSE)
#> As of tidytable v0.9.0 dotless versions of functions are exported. You can now use `arrange()`/`mutate()`/etc. directly.
iris %>%
as_tidytable() %>%
mutate(SL=ifelse(Species=="setosa",NA,Sepal.Length)) %>%
summarise(SP_Large=as.double(sum(base::ifelse(SL>5,Sepal.Length,0))),
.by=Species)
#> # A tidytable: 3 × 2
#> Species SP_Large
#> <fct> <dbl>
#> 1 setosa NA
#> 2 versicolor 282.
#> 3 virginica 324.
The error is caused in the execution engine of data.table
, and unfortunately there's no way for me to build in a workaround.
Thanks. This is indeed another work arround. I hope the condition NA won't happens to often in my code and I'll adpat using default or as.double depending on the cases.
Also I would highly recommend using the reprex package to help with reproducible examples.
Basic steps:
1) Type out your example code (including any calls to library()
)
2) Copy it to your clipboard (like you're going to paste it somewhere)
3) Run reprex::reprex()
4) Paste into the github issue (it will be automatically formatted to markdown with code output)
It also makes sure your examples will be reproducible on my end. For example - it would have caught that the default()
function wasn't there. It runs everything in a "clean" environment.
If you run into any issues let me know.
Feel free to give reprex
a test run in this issue if you want.
Hello,
When summarising rows with condition, I have sometimes error because of NA condition
This is returning an error while dplyr equivalent is not.
It's easy to adapt the code to avoid the error but I think dplyr behavior makes more sense.