KxSystems / rkdb

R client for kdb+
https://code.kx.com/q/interfaces
Apache License 2.0

Tables with column type C create data.frames with List columns #3

Closed by gerrymanoim 7 years ago

gerrymanoim commented 7 years ago

I'm unclear whether this is the expected behavior or not, but it wasn't what I expected to get back in R.

In q:

q)meta tt
c        | t f a
---------| -----
sym      | s
AvgPx    | j   s
stringSym| C

In R:

> library(rkdb)
> qcon <- rkdb::open_connection(port=6660)
> tt <- rkdb::execute(qcon, "tt")
> str(tt)
'data.frame':   5 obs. of  3 variables:
 $ sym      : chr  "A" "B" "C" "D" ...
 $ AvgPx    : num  10 20 30 40 50
 $ stringSym:List of 5
  ..$ : chr "A"
  ..$ : chr "B"
  ..$ : chr "C"
  ..$ : chr "D"
  ..$ : chr "E"

As opposed to what I expected:

> tt$stringSym <- unlist(tt$stringSym)
> str(tt)
'data.frame':   5 obs. of  3 variables:
 $ sym      : chr  "A" "B" "C" "D" ...
 $ AvgPx    : num  10 20 30 40 50
 $ stringSym: chr  "A" "B" "C" "D" ...

sv commented 7 years ago

It is expected and consistent with other interfaces to kdb+. stringSym is a char array (not a symbol), which is why it gets represented as a vector of char arrays. There is a duality between the two. Would you expect all char arrays to be converted to plain R strings?

gerrymanoim commented 7 years ago

I agree that there's a duality between the two in q, but because R has strings, I would expect char arrays to be converted to R strings. If there are multiple char arrays in each row, then I would expect a list type (with multiple elements in each list item).
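The conversion rule described above can be sketched in R as a post-processing step on the returned data.frame. This is only an illustration of the expectation, not how rkdb implements anything: `flatten_string_columns` is a hypothetical helper, and the `tt` built here is a hand-made stand-in for the result of `rkdb::execute(qcon, "tt")`, with an extra made-up `tags` column to show the multi-element case.

```r
# Hypothetical helper (not part of rkdb): turn list columns whose elements are
# all single strings into plain character vectors; leave other columns alone.
flatten_string_columns <- function(df) {
  is_string_list <- function(col) {
    is.list(col) && !is.data.frame(col) &&
      all(vapply(col, function(x) is.character(x) && length(x) == 1L, logical(1)))
  }
  for (name in names(df)) {
    if (is_string_list(df[[name]])) {
      # vapply (rather than unlist) keeps a zero-row column as character(0)
      df[[name]] <- vapply(df[[name]], identity, character(1), USE.NAMES = FALSE)
    }
  }
  df
}

# Stand-in for the table returned by the connection; built by hand here.
tt <- data.frame(sym = c("A", "B"), AvgPx = c(10, 20), stringsAsFactors = FALSE)
tt$stringSym <- list("A", "B")        # one string per row: should be flattened
tt$tags <- list(c("x", "y"), "z")     # several strings in a row: stays a list

tt <- flatten_string_columns(tt)
str(tt)
```

With this rule, `stringSym` becomes a `chr` column while `tags` remains a list column, which matches the behavior asked for in this comment.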

I disagree that it is consistent with other interfaces to kdb+. Take the JSON interface for example:

q).j.j tt
"[{\"sym\":\"A\",\"AvgPx\":10,\"stringSym\":\"A\"},\n {\"sym\":\"B\",\"AvgPx\..
q)ttjson:.j.j tt
q)f:{.h.hy[`json;ttjson]}
q)ttjson
"[{\"sym\":\"A\",\"AvgPx\":10,\"stringSym\":\"A\"},\n {\"sym\":\"B\",\"AvgPx\..
q).z.ph:f

Which in R is converted the way I expect:

> r <- httr::GET(url = "localhost:6660/?ttjson" )
> ttjson <- httr::content(r,as = "text",type = "application/json")
> ttjson <- jsonlite::fromJSON(ttjson)
> str(ttjson)
'data.frame':   5 obs. of  3 variables:
 $ sym      : chr  "A" "B" "C" "D" ...
 $ AvgPx    : int  10 20 30 40 50
 $ stringSym: chr  "A" "B" "C" "D" ...

I believe a similar thing is true if I save this to CSV (and then load it) or use the Python integration.
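A rough sketch of the CSV case, under assumptions: the CSV text below is written by hand to stand in for an export of `tt` (the exact output of a q export is not shown in this thread), and it is read with base R's `read.csv`. Since CSV carries no distinction between symbols and char arrays, both columns would be expected to come back as plain character vectors.

```r
# Hand-written CSV text standing in for an exported copy of tt (an assumption;
# not actual output from q).
csv_text <- "sym,AvgPx,stringSym
A,10,A
B,20,B
C,30,C
D,40,D
E,50,E"

# stringsAsFactors = FALSE keeps text columns as character on older R versions
tt_csv <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(tt_csv)
```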

sv commented 7 years ago

I was talking about the Java and C# interfaces. It makes sense to convert to strings. I will give it a go and update here.

sv commented 7 years ago

Could you try https://github.com/sv/rkdb and see if that works for you? You should be able to install it via devtools::install_github('sv/rkdb').

gerrymanoim commented 7 years ago

That's perfect, thanks!