Closed giko45 closed 3 years ago
Thanks for getting in touch. I am not so sure what you problem is. If a column contains NAs then the column in the database handels them as Null values. Therefore, the column is marked as Nullable.
Hi, Thanks for the response.
Do you have any suggestions on how should i proceed to debug the issue?
ts <- DBI::dbConnect(RClickhouse::clickhouse(), host='127.0.0.1')
x= data.frame(id=1,a='asd',b=NA,c=NA,d=NA)
dbWriteTable(ts, 'test21', x )
Error in dbDataType(clickhouse(), obj) : Invalid argument to dbDataType: must not be NULL or all NA
seesionInfo() RClickhouse_0.4.1
some more tests:
Tests with creating table in dbWriteTable: is OK
> dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=c(NA,'ff')) ,overwrite = T)
OK: │ CREATE TABLE timedb.test (`id` Int32, `a` Nullable(Enum16('ff' = 0))) ENGINE = TinyLog │
Tests pre existing table and updating /adding rows with dbWriteTable fails when one column only has NA
> dbExecute(ts, statement = paste0('DROP TABLE IF EXISTS test'));
> dbExecute(ts, 'CREATE TABLE test (`id` Int32, `a` Nullable(String)) ENGINE = TinyLog ');
> dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=c(NA,'ff')) ,overwrite = F,append=T)
> dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=c(NA,NA)) ,overwrite = F,append=T)
Show Traceback
Rerun with Debug
Error in insert(conn@ptr, qname, value) :
cannot write R type 10 to column of type String
Update:
dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=as.character(c(NA,NA)) ) ,overwrite = F,append=T)
works, so i suppose i do not have an issue any more :)
The last insert of a data.frame with one column with only NA should work when the table already exits since the column type is already known in the table.
Note that
> dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=c(NA,NA)) ,overwrite = T)
Show Traceback
Rerun with Debug
Error in dbDataType(clickhouse(), obj) :
Invalid argument to dbDataType: must not be NULL or all NA
also failes but that's OK since the column type of column cannot be determained
@inkrement any update on this? I get the same error whenever attempting to insert a single row with some NA values.
Hi, I know that the current situation is not really satisfying. But we have to admit that it was not our first priority. Databases such as Clickhouse are built for really big data sets (sometimes so big that you even have to distribute it over multiple machines). In contrast, R is often limited by clear memory restrictions. Therefore, we assumed, that most people use this package to analyse data which already resists in the database and do not use RClickhouse for importing it.
However, sometimes it is quite useful to add small data sets for some in-database operations such as joins. In addition they recently introduced really nice features such as external data sources and we are really eager to look into it. Furthermore, we already planned to revise the interface for creating new tables and inferring data types.
It's always hard to provide a proper time estimate, however, you can expect some enhancements before xmas.
Note that if you make sure that the column with NA has a datatype it will work
dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=c(NA,NA)) ,overwrite = F,append=T)
Failes
dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=as.character(c(NA,NA)) ) ,overwrite = F,append=T)
works
Big thank you to @giko45 for the suggestion of the workaround.
@giko45 @JSchoenbachler After 993d8ae, rch now supports inserting a data.frame of only NA-values into a column of an already created table without explicitly specifying the datatypes.
If the changes don't work for you as intended please feel free to reopen the issue
@giko45 It doesn't work for me as you suggested.
If I create the table first using dbExecute(ts, 'CREATE TABLE test (
idInt32,
aNullable(String)) ENGINE = TinyLog ')
everything is fine, but creating it on the fly is not working:
dbWriteTable(ts, 'test', data.frame(id=c(1:2),a=as.character(c(NA,NA)) ) ,overwrite = F,append=T)
I expected dbWriteTable(ts, 'test', data.frame(id = c(1:2), a = c(NA, NA)), overwrite = FALSE, append = TRUE)
is not working, because c(NA, NA)
is of type logical. You could write c(NA_character_, NA_character_)
to tell R, that the NA's should be treated as character here.
My task would be to copy data from one DB (Impala) to another one (Clickhouse). Would be fine to keep the scheme.
dbDataType(ts, "test")
returns a simple string instead of the expected (named) character vector containing the types of the columns (which are the names)
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=de_AT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_AT.UTF-8 LC_COLLATE=de_AT.UTF-8
[5] LC_MONETARY=de_AT.UTF-8 LC_MESSAGES=de_AT.UTF-8
[7] LC_PAPER=de_AT.UTF-8 LC_NAME=de_AT.UTF-8
[9] LC_ADDRESS=de_AT.UTF-8 LC_TELEPHONE=de_AT.UTF-8
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=de_AT.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microbenchmark_1.4-7 RSQLite_2.2.0 nycflights13_1.0.1
[4] implyr_0.3.0.9001 RJDBC_0.2-8 rJava_0.9-13
[7] dbplyr_2.0.0 dplyr_1.0.2 DBI_1.1.0
[10] RClickhouse_0.5.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 rstudioapi_0.11 magrittr_1.5 tidyselect_1.1.0
[5] bit_4.0.4 R6_2.5.0 rlang_0.4.8 fansi_0.4.1
[9] blob_1.2.1 tools_4.0.3 utf8_1.1.4 cli_2.1.0
[13] withr_2.3.0 ellipsis_0.3.1 digest_0.6.27 yaml_2.2.1
[17] bit64_4.0.5 assertthat_0.2.1 tibble_3.0.4 lifecycle_0.2.0
[21] crayon_1.3.4 purrr_0.3.4 vctrs_0.3.4 memoise_1.1.0
[25] glue_1.4.2 compiler_4.0.3 pillar_1.4.6 generics_0.1.0
[29] pkgconfig_2.0.3
Hi,
I get the following error:
Error in insert(conn@ptr, qname, value) : count of elements in nested and nulls should be the same
when inserting a data.table with 1 row with some NA values
All columns that have NA's are Nullable in the table definition.
I do not understand what i need to do to solve this.
Regards