jeroen / mongolite

Fast and Simple MongoDB Client for R
https://jeroen.github.io/mongolite/

Should values in a list be BSON array when inserted? #198

Open koheiw opened 4 years ago

koheiw commented 4 years ago

@jeroen Thanks for the great package. I am using it to introduce my students to MongoDB. I am filing this issue because I am not sure whether this is the expected behavior.

The question is: should the values saved by insert(x) differ between a data.frame and a list? As you can see in the screenshot of MongoDB Compass, created_at is a date only when x is a data.frame. When x is a list, it becomes a BSON array. The code below is taken from the package website and modified to highlight the issue.

require(mongolite)
#> Loading required package: mongolite
mydata <- jsonlite::fromJSON("https://api.github.com/repos/jeroen/mongolite/issues")
mydata$created_at <- strptime(mydata$created_at, "%Y-%m-%dT%H:%M:%SZ", "GMT")

mydata1 <- mydata[1,c("id", "created_at")]
str(mydata1)
#> 'data.frame':    1 obs. of  2 variables:
#>  $ id        : int 593332715
#>  $ created_at: POSIXlt, format: "2020-04-03 12:00:54"
mylist2 <- as.list(mydata[2,c("id", "created_at")])
str(mylist2)
#> List of 2
#>  $ id        : int 589033383
#>  $ created_at: POSIXlt[1:1], format: "2020-03-27 10:27:19"
mylist3 <- as.list(mydata[3,c("id", "created_at")])
str(mylist3)
#> List of 2
#>  $ id        : int 550382553
#>  $ created_at: POSIXlt[1:1], format: "2020-01-15 19:26:20"

issues <- mongo("issues")
issues$insert(mydata1) # date
#> List of 5
#>  $ nInserted  : num 1
#>  $ nMatched   : num 0
#>  $ nRemoved   : num 0
#>  $ nUpserted  : num 0
#>  $ writeErrors: list()
issues$insert(mylist2) # BSON array
#> List of 6
#>  $ nInserted  : int 1
#>  $ nMatched   : int 0
#>  $ nModified  : int 0
#>  $ nRemoved   : int 0
#>  $ nUpserted  : int 0
#>  $ writeErrors: list()
issues$insert(mylist3, auto_unbox = TRUE) # BSON array
#> List of 6
#>  $ nInserted  : int 1
#>  $ nMatched   : int 0
#>  $ nModified  : int 0
#>  $ nRemoved   : int 0
#>  $ nUpserted  : int 0
#>  $ writeErrors: list()

[Screenshot of MongoDB Compass: Screenshot_20200413_185257]

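I thought I could change how insert() behaves for a list by setting auto_unbox = TRUE, but I did not see any difference in the stored documents, even though jsonlite::toJSON() itself does respond to the argument: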

jsonlite::toJSON(mylist3, auto_unbox = FALSE)
#> {"id":[550382553],"created_at":["2020-01-15 19:26:20"]}
jsonlite::toJSON(mylist3, auto_unbox = TRUE)
#> {"id":550382553,"created_at":"2020-01-15 19:26:20"}

jeroen commented 4 years ago

You are correct, this is a bug. The issue is that jsonlite does not honor auto_unbox in combination with POSIXt = "mongo":

mylist <- list(x = "foo", today = Sys.time())
jsonlite::toJSON(mylist, POSIXt = 'mongo')
# {"x":["foo"],"today":[{"$date":1586800884247}]} 
jsonlite::toJSON(mylist, POSIXt = 'mongo', auto_unbox = TRUE)
# {"x":"foo","today":[{"$date":1586800884247}]} 

I'll try to fix it. One workaround is to explicitly unbox() the element:

mylist <- list(x = "foo", today = jsonlite::unbox(Sys.time()))
jsonlite::toJSON(mylist, POSIXt = 'mongo')
# {"x":["foo"],"today":{"$date":1586800912819}}

Alternatively, you can insert the data as a data frame or as literal JSON.
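
A minimal sketch of the unbox() workaround applied to a list shaped like mylist3 from the report. The record values are copied from the output above. The conversion to POSIXct is an assumption added here: strptime() returns POSIXlt, and unbox() appears to accept only atomic vectors, so a POSIXlt value would need converting first.

```r
library(jsonlite)

# Record mirroring mylist3 from the report (values copied from the output above).
created_lt <- strptime("2020-01-15T19:26:20Z", "%Y-%m-%dT%H:%M:%SZ", "GMT")

# Assumption: unbox() needs an atomic vector, so convert POSIXlt to POSIXct.
mylist <- list(
  id = 550382553L,
  created_at = unbox(as.POSIXct(created_lt))
)

# auto_unbox handles id; the explicit unbox() handles created_at.
json <- toJSON(mylist, POSIXt = "mongo", auto_unbox = TRUE)
print(json)
```

If this behaves as in the example above, created_at is serialized as a single {"$date": ...} object rather than a one-element array, and passing the same list to insert() should then store a date. That last step needs a running MongoDB server and is not verified here.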

koheiw commented 4 years ago

Thanks @jeroen for the quick response. I will use a data.frame as a workaround in my teaching!
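
The data.frame workaround discussed in this thread could look roughly like the sketch below, which converts a one-record list into a one-row data frame before serializing. The record values are copied from the report; using as.data.frame() for the conversion is an assumption, not something confirmed in the thread.

```r
library(jsonlite)

# One-record list like mylist3 in the report, with a POSIXlt timestamp.
created_lt <- strptime("2020-01-15T19:26:20Z", "%Y-%m-%dT%H:%M:%SZ", "GMT")
mylist <- list(id = 550382553L, created_at = created_lt)

# as.data.frame() turns the named list into a one-row data frame
# and stores the timestamp as a POSIXct column.
mydf <- as.data.frame(mylist)
str(mydf)

# Serialized row-wise, each value is a scalar, so the date is not boxed.
json_df <- toJSON(mydf, POSIXt = "mongo")
print(json_df)
```

With a running MongoDB server, issues$insert(mydf) would then follow the same data frame path as mydata1 in the original report.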