H2CO3 / avocado

Strongly-typed MongoDB driver for Rust
MIT License

How to upsert #1

Closed max-frai closed 5 years ago

max-frai commented 5 years ago

Hello, the crate is amazing, but some documentation seems to be missing. For example, I have a vector of structs that implement Doc. Now I want to upsert_many: insert a document if it does not exist, and update its content if it does. Do I really need to create another operation struct, implement the Upsert trait for it, and use that for this operation? Or is there a simpler way?

H2CO3 commented 5 years ago

Hi, and thanks for contributing!

Every type, macro, and method is documented in the crate. The way to use upsert_many() is indeed to create a type that implements Upsert. Convenience default implementations are not provided for this particular trait because it's ambiguous what should be done:

If you are still just experimenting with your database logic and are not willing to create a semantic upsertion type, remember that you can always drop down to the underlying, untyped API of mongodb using, e.g., ThreadedDatabase::collection():

let db = client.db("my_database");
let raw_coll = db.collection(<MyEntityType as Doc>::NAME);
// operate on `raw_coll` from here on

max-frai commented 5 years ago

Every type, macro, and method is documented in the crate.

Yep, sorry, I described it wrong. Using MongoDB from Rust in such an object-oriented way is new to me, and sometimes it's not clear how to do things right. Thanks for the description.

H2CO3 commented 5 years ago

Ah no worries 🙂 The misunderstanding was on my part. Hope you find my reply useful; if you have any further questions, feel free to ask.

max-frai commented 5 years ago

Sorry for reopening. I thought about this and have some questions. In the docs I found that upsert_many updates many documents but inserts only one, while insert_many just takes an iterable and inserts everything. So is upsert_many the right way to update a vector of items? I just need to replace items in the database, and sometimes their data will have changed. For now, I iterate over a vector of structures that implement Doc and call upsert_entity for each of them. The logic is correct, but it's slow because no batching is applied. I tried to implement the Upsert trait for my structure:

// Match is the structure that implements Doc

#[derive(Debug, Clone)]
pub struct UpsertMatch<'a> {
    _id: &'a Uid<Match>,
    replacement: &'a Match,
}

impl<'a> Upsert<Match> for UpsertMatch<'a> {
    fn filter(&self) -> Document {
        doc! { "_id": self._id }
    }

    fn upsert(&self) -> Document {
        doc! {
            "$set": self.replacement // ????
        }
    }
}

The questions are:

H2CO3 commented 5 years ago

I think you might have a misunderstanding here.

So is it the right way to use upsert_many() to update vector of items?

The upsert_many() method is the right one to use if and only if you need the following behavior:

From what I can tell, you don't want this behavior. What I think you are looking for is: specify a bunch of conditions, and for each such condition, upsert exactly one document. Unfortunately, as far as I can tell, MongoDB provides no API for performing this operation directly.
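
To make the distinction concrete, here is a minimal in-memory sketch of the two behaviors discussed in this thread. It does not use avocado or the mongodb crate at all: `Match`, `Collection`, `upsert_many_model` and `upsert_each_model` are hypothetical names invented for illustration, and a `BTreeMap` merely stands in for a collection keyed by `_id`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a `Doc` type; only the fields the demo needs.
#[derive(Debug, Clone, PartialEq)]
struct Match {
    id: u32,
    score: i32,
}

/// In-memory stand-in for a collection, keyed by `_id`.
type Collection = BTreeMap<u32, Match>;

/// Models "one filter, one update": every document accepted by `filter`
/// receives the same change. If nothing matches, exactly one document
/// (`fallback`) is inserted. Returns the number of documents modified.
fn upsert_many_model(
    coll: &mut Collection,
    filter: impl Fn(&Match) -> bool,
    new_score: i32,
    fallback: Match,
) -> usize {
    let mut modified = 0;
    for doc in coll.values_mut() {
        if filter(doc) {
            doc.score = new_score;
            modified += 1;
        }
    }
    if modified == 0 {
        coll.insert(fallback.id, fallback);
    }
    modified
}

/// Models "one condition per item": each incoming document replaces the
/// stored document with the same id, or is inserted if none exists.
/// Returns `(replaced, inserted)` counts.
fn upsert_each_model(coll: &mut Collection, incoming: Vec<Match>) -> (usize, usize) {
    let mut replaced = 0;
    let mut inserted = 0;
    for doc in incoming {
        match coll.insert(doc.id, doc) {
            Some(_) => replaced += 1,
            None => inserted += 1,
        }
    }
    (replaced, inserted)
}
```

In this model, the second function corresponds to calling upsert_entity once per item, which is why no single batched call covers it.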

max-frai commented 5 years ago

Okay, you are right. I misunderstood update_many. So I need to call upsert_entity in a loop. Is there any way to integrate batching with your methods?

P.S. For future reference, how would I convert a full structure into a BSON value, for example to replace a document with it?

H2CO3 commented 5 years ago

You can convert any type implementing Serde's Serialize trait (hence, all Docs) into BSON by using the bson::to_bson() function.


Are there any ways to integrate batching with your methods?

Since it's MongoDB itself that doesn't seem to support batched upserts on several independent conditions, I don't think I can do anything better about it, sorry.

A quick search on Stack Overflow brings up some suggestions about first collecting (separately) the documents to be updated and inserted based on each of your search criteria, and then applying either one batch update and one batch insert operation, or better yet a bulk_write() with a bunch of WriteModels. Note however that this approach is not concurrency-safe, as the database might be modified by other clients between the query and the write(s).

This problem could be solved by transactions, which are only supported by MongoDB 4. I have yet to try whether the underlying raw mongodb crate works with MongoDB 4.x (and in particular whether it can successfully and correctly issue transaction-handling commands), but if it works, I can add a convenience method to Collection that wraps the logic described above in a one-shot, transaction-safe manner.
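
The collect-then-write idea described above can be sketched without touching the database. In this rough illustration, `Match`, `WriteOp`, `partition_by_existence` and `plan_bulk_write` are all hypothetical names (`WriteOp` is not the mongodb crate's WriteModel), and the set of existing ids is assumed to come from a prior query.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a `Doc` type.
#[derive(Debug, Clone, PartialEq)]
struct Match {
    id: u32,
    score: i32,
}

/// Hypothetical description of one write in a bulk operation.
#[derive(Debug, PartialEq)]
enum WriteOp {
    Replace(Match),
    Insert(Match),
}

/// Splits `incoming` into `(to_update, to_insert)` based on the ids that a
/// previous query reported as already stored. This suits the variant with
/// one batch update followed by one batch insert.
fn partition_by_existence(
    existing_ids: &HashSet<u32>,
    incoming: Vec<Match>,
) -> (Vec<Match>, Vec<Match>) {
    incoming
        .into_iter()
        .partition(|doc| existing_ids.contains(&doc.id))
}

/// Builds a single ordered list of writes instead, which suits the
/// bulk-write variant. Input order is preserved.
fn plan_bulk_write(existing_ids: &HashSet<u32>, incoming: Vec<Match>) -> Vec<WriteOp> {
    incoming
        .into_iter()
        .map(|doc| {
            if existing_ids.contains(&doc.id) {
                WriteOp::Replace(doc)
            } else {
                WriteOp::Insert(doc)
            }
        })
        .collect()
}
```

Both helpers decide from a snapshot of the ids, so the plan can already be stale by the time the writes are issued. That gap is exactly the concurrency caveat mentioned above.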