Closed jorainer closed 4 years ago
Thank you for the bug report. Tagging @Liubuntu
Hi @jorainer,
This error arises because mutate.tbl_lazy (defined in dbplyr) cannot add a new column from a vector of arbitrary values (length greater than 1). SQLDataFrame uses dbplyr internally to represent the database tables lazily: the @tblData slot is of class tbl_lazy, and mutate.SQLDataFrame calls mutate.tbl_lazy directly for processing.
I can reproduce your error. However, mutate.tbl_lazy does work when the value has length 1, or when the new column is computed from existing columns. Sorry that the function documentation does not state this clearly; I'll modify it (updated in 1.1.1).
> tblData(sdf)
# Source: table<test_table> [?? x 3]
# Database: sqlite 3.30.1
# [/var/folders/7t/9l4kkf_j2sqbpn321y9g5558z96ck_/T//RtmpkrJIF9/file162c922eebc11]
pkey letters numbers
<int> <chr> <dbl>
1 1 a -0.941
2 2 b 1.76
3 3 c 1.58
4 4 d -1.28
5 5 e -2.77
> sdf_2 <- mutate(sdf, new_col = c("b", "a", "b", "r", "g"))
> sdf_2
SQLDataFrame with 5 rows and 3 columns
Error: row value misused
> library(dplyr)
> tibble(dframe) %>% mutate(new_col = c("b", "a", "b", "r", "g"))
# A tibble: 5 x 2
dframe$pkey $letters $numbers new_col
<int> <chr> <dbl> <chr>
1 1 a -0.941 b
2 2 b 1.76 a
3 3 c 1.58 b
4 4 d -1.28 r
5 5 e -2.77 g
> library(dbplyr)
> dat <- tbl(con, "test_table")
> dat %>% mutate(new_col = c("b", "a", "b", "r", "g"))
Error: row value misused
> dat %>% mutate(new_col = "a")
# Source: lazy query [?? x 4]
# Database: sqlite 3.30.1
# [/var/folders/7t/9l4kkf_j2sqbpn321y9g5558z96ck_/T//RtmpkrJIF9/file162c922eebc11]
pkey letters numbers new_col
<int> <chr> <dbl> <chr>
1 1 a -0.941 a
2 2 b 1.76 a
3 3 c 1.58 a
4 4 d -1.28 a
5 5 e -2.77 a
> dat %>% mutate(new_col = numbers)
# Source: lazy query [?? x 4]
# Database: sqlite 3.30.1
# [/var/folders/7t/9l4kkf_j2sqbpn321y9g5558z96ck_/T//RtmpkrJIF9/file162c922eebc11]
pkey letters numbers new_col
<int> <chr> <dbl> <dbl>
1 1 a -0.941 -0.941
2 2 b 1.76 1.76
3 3 c 1.58 1.58
4 4 d -1.28 -1.28
5 5 e -2.77 -2.77
> dat %>% mutate(new_col = numbers > 0)
# Source: lazy query [?? x 4]
# Database: sqlite 3.30.1
# [/var/folders/7t/9l4kkf_j2sqbpn321y9g5558z96ck_/T//RtmpkrJIF9/file162c922eebc11]
pkey letters numbers new_col
<int> <chr> <dbl> <int>
1 1 a -0.941 0
2 2 b 1.76 1
3 3 c 1.58 1
4 4 d -1.28 0
5 5 e -2.77 0
> sdf_2 %>% mutate(new_col = numbers > 0)
SQLDataFrame with 5 rows and 3 columns
pkey | letters numbers new_col
<integer> | <character> <numeric> <integer>
1 | a -0.9409161 0
2 | b 1.7561005 1
3 | c 1.5790539 1
4 | d -1.2770926 0
5 | e -2.7703379 0
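One way to see where "row value misused" comes from is to inspect the SQL that dbplyr generates instead of running it. The sketch below is not taken from SQLDataFrame itself. It assumes an in-memory SQLite database and a test_table rebuilt to resemble the one in the session above, and it relies on dbplyr translating c(...) into a parenthesized SQL value list, which SQLite rejects when used as a column expression.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite table that mimics test_table from the session above
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "test_table",
             data.frame(pkey = 1:5,
                        letters = letters[1:5],
                        numbers = c(-0.941, 1.76, 1.58, -1.28, -2.77)))
dat <- tbl(con, "test_table")

# Both queries are built lazily; nothing is sent to SQLite yet
q_scalar <- dat %>% mutate(new_col = "a")
q_vector <- dat %>% mutate(new_col = c("b", "a", "b", "r", "g"))

# Render the SQL text to compare how the two values are translated
sql_scalar <- as.character(sql_render(q_scalar))
sql_vector <- as.character(sql_render(q_vector))
cat(sql_scalar, "\n")
cat(sql_vector, "\n")

# Executing the vector version is expected to fail inside SQLite;
# capture the error message instead of stopping
res_vector <- tryCatch(collect(q_vector),
                       error = function(e) conditionMessage(e))
res_scalar <- collect(q_scalar)

dbDisconnect(con)
```

The length-1 value ends up as an ordinary SQL literal that the database recycles over all rows, whereas the length-5 vector has no row-wise meaning in a SELECT list, so the failure happens in the database rather than in R.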
I also want to mention that there is a lighter version of SQLDataFrame, currently kept on the "reimplement" branch of this repo. I am planning to submit it after the new release to replace the current SQLDataFrame package.
https://github.com/Bioconductor/SQLDataFrame/tree/reimplement
If you are using SQLDataFrame simply to represent SQLite tables (or Google BigQuery, MySQL), I would recommend this lighter version. If your work involves cross-database operations, e.g., joining two MySQL tables with different user credentials, you can use the current version. After submitting the new package, I might rename the current version to SQLDataFrameComplex.
Thanks for your explanations @Liubuntu !
What I was actually looking for is a way to add additional columns (with arbitrary values) to a (SQL)DataFrame or to replace existing columns. Will this be possible with the new implementation?
Hello,
The current and reimplemented versions don't support adding or replacing a column with arbitrary values right now, since they basically rely on dbplyr for the existing functions. However, I think it would be possible to add this feature through some SQL procedures of the kind already used in the current version, but unfortunately I won't be able to investigate further very soon.
Best, Qian
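Until such a feature exists, a possible workaround at the plain dbplyr level is to write the new values into a small keyed table and join it to the lazy table. This is only a sketch under stated assumptions: it works on a tbl_lazy rather than on an SQLDataFrame object, it rebuilds a stand-in for test_table in an in-memory SQLite database, and the name new_vals is hypothetical. Whether the same approach can be wired into mutate.SQLDataFrame has not been checked here.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite table standing in for test_table
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "test_table",
             data.frame(pkey = 1:5,
                        letters = letters[1:5],
                        numbers = c(-0.941, 1.76, 1.58, -1.28, -2.77)))
dat <- tbl(con, "test_table")

# Arbitrary per-row values, keyed by the primary key of the target table
new_vals <- data.frame(pkey = 1:5,
                       new_col = c("b", "a", "b", "r", "g"),
                       stringsAsFactors = FALSE)

# Copy the values into a temporary table on the same connection
# ("new_vals" is a hypothetical table name)
new_tbl <- copy_to(con, new_vals, name = "new_vals", temporary = TRUE)

# The join stays lazy and runs inside the database
joined <- left_join(dat, new_tbl, by = "pkey")

# Materialize only to inspect the result
out <- joined %>% arrange(pkey) %>% collect()
print(out)

dbDisconnect(con)
```

The join requires write access to the database for the temporary table, and it only works when both tables live on the same connection.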
Thanks for the feedback!
Dear developers, I wanted to add columns to an existing SQLDataFrame, but that does not seem to work:
It seems this error is thrown by SQLite. I'd be thankful for any comment on this.
My session info: