This issue concerns the introduction.Rmd script in the vignettes directory.
The zips dataset has several documents with identical _id values, so the
mongo.insert.batch(mongo, "rmongodb.zips", res) statement fails with an error message in the server window about a duplicate _id. I got it to run by inserting the following line:
myzips <- zips[ !duplicated( zips[,"_id"]), ]
and substituting myzips for zips.
Also, using a for loop is a relatively slow way to create all the BSON values stored in the variable res: it takes 5.55 seconds on my laptop (including the mongo.insert.batch function call). The process can be sped up by first using apply to convert the zips list matrix into a plain list, and then using lapply to create res:
res <- lapply( myziplist, mongo.bson.from.list )
This takes 1.28 seconds on my laptop (including the mongo.insert.batch function call).
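For reference, here is a minimal sketch of the conversion step. The original vignette code for building myziplist is not shown above, so the construction below (a tiny stand-in zips list matrix and a row-wise lapply) is an assumption, not the vignette's exact code:

```r
# Hypothetical stand-in for the vignette's zips list matrix; the real data
# comes from MongoDB's zips sample data set. Note the duplicated _id.
zips <- matrix(list("10001", "NY",
                    "10001", "NY",
                    "35801", "AL"),
               ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("_id", "state")))

# Drop rows with duplicated _id values, as suggested above
myzips <- zips[!duplicated(unlist(zips[, "_id"])), , drop = FALSE]

# Convert each row of the list matrix into a named list
myziplist <- lapply(seq_len(nrow(myzips)), function(i) myzips[i, ])

# Each element can then be turned into a BSON document:
# res <- lapply(myziplist, mongo.bson.from.list)
# mongo.insert.batch(mongo, "rmongodb.zips", res)
```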
One more small point. It is unnecessary to check for the MongoDB connection using
if(mongo.is.connected(mongo) == TRUE)
since mongo.is.connected(mongo) already returns TRUE when there is a connection;
if(mongo.is.connected(mongo))
is sufficient.
Thanks much for working on this library. I think it may offer a database solution for an R application I'm developing. I'll be testing the potential over the next few weeks.
Thanks a lot for the feedback. I fixed all the issues; the changes are online in version 1.6.2 on GitHub.
I fixed the import issues in the vignettes. Duplicated _ids are a key feature of the zips data set, so I renamed the _id column to "orig_id"; with that change MongoDB no longer raises a duplicate-key error.
I improved mongo.insert.batch to report the correct error message.
if(mongo.is.connected(mongo) == TRUE) is clearer for non-R experts coming from the MongoDB world ;-)