jeroen / mongolite

Fast and Simple MongoDB Client for R
https://jeroen.github.io/mongolite/

Trying to insert entry with additional ObjectId fails #237

Open kamilsi opened 2 years ago

kamilsi commented 2 years ago

In MongoDB Compass, for example, you can insert a document that contains an additional ObjectId. Take this JSON:

{"second_id": {"$oid": null}}

It will be translated to a document with two ObjectIds generated on the DB side, e.g.:

{
    "_id": {
        "$oid": "623b324877c4f70061f4b44a"
    },
    "second_id": {
        "$oid": "623b324877c4f70061f4b449"
    }
}

This is handy if you are updating a document and want to have a unique identifier that changes after the update. Unfortunately, you cannot reproduce this with mongolite:

mongo$insert('{"second_id": {"$oid": null}}')
Error: Invalid read of null in state IN_BSON_TYPE

jeroen commented 2 years ago

@kevinAlbs is there a way to create such a query with mongo-c-driver?

kevinAlbs commented 2 years ago

> @kevinAlbs is there a way to create such a query with mongo-c-driver?

Not directly. {"second_id": {"$oid": null}} is not valid Extended JSON so I doubt any drivers support parsing that JSON directly.
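To illustrate why the parser rejects the null: in Extended JSON, the value of "$oid" is a 24-character hexadecimal string. The sketch below is a base-R check of that shape only. The helper name is_oid_hex is hypothetical and is not part of mongolite or the C driver.

```r
# Hypothetical helper: TRUE only for a single 24-character hex string,
# which is the shape Extended JSON expects as the value of "$oid".
is_oid_hex <- function(x) {
  is.character(x) && length(x) == 1 && !is.na(x) &&
    grepl("^[0-9a-fA-F]{24}$", x)
}

is_oid_hex("623b324877c4f70061f4b44a")  # TRUE
is_oid_hex(NULL)                        # FALSE: nothing to parse as an ObjectId
is_oid_hex("623b3248")                  # FALSE: too short
```

A check like this can be run before building the JSON string, so a missing id surfaces as an R-side error instead of a parser error from the driver.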

The only way to insert an autogenerated ObjectId other than the _id with the C driver is to initialize one explicitly:

/* coll is assumed to be an existing mongoc_collection_t *. */
bson_t *to_insert = bson_new ();
bson_oid_t oid;
bson_error_t error;

/* Generate a fresh ObjectId on the client side. */
bson_oid_init (&oid, NULL /* context */);
bson_append_oid (to_insert, "second_id", -1, &oid);
if (!mongoc_collection_insert_one (coll, to_insert, NULL /* opts */, NULL /* reply */, &error)) {
    MONGOC_ERROR ("error in mongoc_collection_insert_one: %s", error.message);
    bson_destroy (to_insert);
    return EXIT_FAILURE;
}
MONGOC_DEBUG ("insert OK");
bson_destroy (to_insert);

Here is a runnable example.

jeroen commented 2 years ago

@kamilsi it looks like this is difficult to do with the R bindings. You will have to manually set the second_id field to match the one from _id. For example, I tested that this works in R:

col <- mongo()
col$drop()
col$insert(iris)
iter <- col$iterate(query = '{}', fields = '{"_id":1}')
while(length(rec <- iter$one())){
  selector <- sprintf('{"_id":{"$oid":"%s"}}', rec[['_id']])
  second <- sprintf('{"$set":{"second_id": {"$oid": "%s"}}}', rec[['_id']])
  col$update(selector, update = second)
}
out <- col$find(fields = '{}')
View(out)

Wesseldr commented 2 years ago

Converting string _id's into ObjectId()'s in a dplyr data frame.

When inserting a dplyr table with mongolite's insert, I used the following code to finally get it working (took me months... but this works).

tbl <- imported_excel_tbl %>% 
        mutate(
            clientcontract_id = list(list("$oid" = clientcontract_id)) 
         )
coll$insert(tbl, auto_unbox = TRUE)

So the key elements are the double list, `list(list("$oid" = clientcontract_id))`, and auto_unbox = TRUE in the insert call; otherwise it won't work...

Hope this helps others as well to insert an ObjectId() with a dataframe or a dplyr frame :-)

JWR
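One thing to watch with the double list: list(list("$oid" = x)) creates a single wrapper around the whole vector x, so for a table with more than one row each row would, as far as I can tell, carry every id. A per-row wrapper can be built with lapply instead. The sketch below uses base R only and made-up sample data. It shows the data shape, and has not been verified against a live server.

```r
# Made-up sample ids (24-character hex strings).
ids <- c("623b324877c4f70061f4b44a", "623b324877c4f70061f4b449")

# One wrapper holding both ids: what list(list(...)) produces.
whole <- list(list("$oid" = ids))

# One wrapper per id: a list column with one {"$oid": ...} element per row.
per_row <- lapply(ids, function(id) list("$oid" = id))

tbl <- data.frame(name = c("a", "b"), stringsAsFactors = FALSE)
tbl$clientcontract_id <- per_row
str(tbl$clientcontract_id)

# With a live connection (not run here):
# coll$insert(tbl, auto_unbox = TRUE)
```

For a one-row table the two forms coincide, which may explain why the double list worked in that case.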

lisovyk commented 2 years ago

@Wesseldr I tried your code, but it inserted the field as a string for me.

If you want to create a new document in a collection that has a custom objectID field – you have to:

No news on that?

..at least it works, I guess!

lisovyk commented 6 days ago

Finding myself here again while searching for a way to generate an ObjectId, I came up with this function, which works for me and successfully inserts an ObjectId into the collection:

library(digest)

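generate_object_id <- function() {
  # 4-byte timestamp: seconds since the Unix epoch
  timestamp <- as.integer(Sys.time())
  # 6 hex characters (3 bytes) of pseudo-random data standing in for a machine id
  machine_id <- substr(digest::digest(runif(1)), 1, 6)
  # 2-byte process id
  process_id <- sprintf("%04x", as.integer(Sys.getpid()) %% 65536)
  # 3-byte counter, drawn at random here rather than incremented
  counter <- sprintf("%06x", as.integer(runif(1, 0, 16777216)))

  # Concatenate everything into a 24-character hex string
  object_id <- paste0(
    sprintf("%08x", timestamp),
    machine_id,
    process_id,
    counter
  )

  return(object_id)
}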

object_id <- generate_object_id()
print(object_id)
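A generated id can then be placed into the document from the original question as an Extended JSON string. The sketch below is a minimal base-R variant that does not need the digest package. The helper name random_oid_hex is hypothetical, and its output is a 24-character hex string with a timestamp prefix, not a spec-compliant ObjectId (the trailing bytes are all random, with no incrementing counter).

```r
# Hypothetical helper: 4-byte timestamp followed by 8 random bytes, hex-encoded.
random_oid_hex <- function() {
  stamp <- sprintf("%08x", as.integer(Sys.time()))
  rand <- paste(sprintf("%02x", sample(0:255, 8, replace = TRUE)), collapse = "")
  paste0(stamp, rand)
}

oid <- random_oid_hex()
stopifnot(grepl("^[0-9a-f]{24}$", oid))

# Same document shape as in the original question, with the id filled in.
doc <- sprintf('{"second_id": {"$oid": "%s"}}', oid)
print(doc)

# With a live connection (not run here):
# col <- mongolite::mongo()
# col$insert(doc)
```

Because the random part is not coordinated across processes, uniqueness is only probabilistic. If that matters, letting the database generate _id and copying it, as in the update loop earlier in the thread, avoids the question.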