Re-enable Model::sync - Githubissues

thedodd commented 4 years ago

Wither started out with application-level automatic index synchronization capabilities via the Model::sync method. As of mongodb@1.0, index management has not yet been implemented in the driver, and thus, as of the wither@0.9.0-alpha.0 release, the index syncing features have been disabled.

If this is a feature you enjoyed, used, or would otherwise like to have once again, please upvote this issue.

If this is something you don't particularly care for, and you have other potentially better approaches to handling these sorts of tasks, please share your thoughts and what tools you would recommend.

Thank you in advance for any participation!

huming2207 commented 4 years ago

Hello @thedodd ,

Just wondering without this feature, will it stop me from implementing a text search function? Especially for creating the index, like this: https://docs.mongodb.com/manual/text-search/#text-index ?

Or alternatively, is there a simple workaround for that?

Regards, Jackson

thedodd commented 4 years ago

@huming2207 you will still be able to implement that functionality just fine. The only thing which has been temporarily removed here is the ability to sync indexes from your Wither-based app. You can still manage your indexes on your mongo clusters directly.

huming2207 commented 4 years ago

@thedodd Thanks for your explanation. I will try it out later.

simoneromano96 commented 4 years ago

They should have implemented the missing functionality with tests: https://github.com/mongodb/mongo-rust-driver/pull/31/files#diff-58a8d444ea45eaad0f057db47b1a021a

Is this what was blocking this issue?

thedodd commented 4 years ago

@simoneromano96 if you take a look at the docs and the code for 1.0.0, you will see that the code has been removed.

simoneromano96 commented 4 years ago

Yeah, you're right; for now, I'm manually adding the indexes like this if anyone has the same need as me:

    let db = client.database("dbName");
    db.run_command(
        doc! {
            "createIndexes": "collection",
            "indexes": [
                {
                "key": {
                    "name": 1,
                },
                "name": "nameIndex",
                "unique": true,
                },
            ],
        },
        None,
    )
    .await?;

thedodd commented 3 years ago

@simoneromano96 thanks for dropping the code snippet above. Do you have any interested in opening a PR which implements an abstraction over the pattern you've shown above? Basically something which we could use in the Model::sync system and reactivate that functionality? No pressure, but I figured I would ask :).

On that note, @saghm do you all have any plans on adding the index management functionality back into the driver? I seem to recall a discussion where you mentioned that the team was considering doing so. Cheers

simoneromano96 commented 3 years ago

@simoneromano96 thanks for dropping the code snippet above. Do you have any interested in opening a PR which implements an abstraction over the pattern you've shown above? Basically something which we could use in the Model::sync system and reactivate that functionality? No pressure, but I figured I would ask :).

On that note, @saghm do you all have any plans on adding the index management functionality back into the driver? I seem to recall a discussion where you mentioned that the team was considering doing so. Cheers

Ok, I think I can try to, I'll probably need some help with the derive macros, I don't use them a lot so I'm not an expert there.

thedodd commented 3 years ago

Fortunately, you shouldn't need to do any actual derive macro work as part of this. It would just be annotating the Model default implementation methods, adding the dependency to the Cargo.toml ... and that should be it.

saghm commented 3 years ago

On that note, @saghm do you all have any plans on adding the index management functionality back into the driver? I seem to recall a discussion where you mentioned that the team was considering doing so. Cheers

Hello! I brought it up with the team this morning after seeing your message yesterday, and we agreed that there's been enough interest in it for us to prioritize working on it. I can't give an exact date for when we expect to have it implemented and released, but we expect to start working on it fairly soon. There are a couple of API details we'll need to figure out before we get started, but writing the actual code should be fairly straightforward.

thedodd commented 3 years ago

Glad to hear it! Thanks for the update.

thedodd commented 3 years ago

@simoneromano96 per our earlier discussion (via email) there was one outstanding issue which needed to be addressed in the previous implementation described over in #30. I think we should definitely take that into account. In short, we need to compare the keys and the options of indices in our reconciliation algorithm.

Those are really the only considerations I would bring up. We definitely want to reiterate in the documentation how important it is that folks consider the impact of index changes. One bad index change could cripple an entire MongoDB cluster at scale (I've seen it happen plenty of times before). But we can knock out the docs once we get the business logic merged into master.

simoneromano96 commented 3 years ago

@simoneromano96 per our earlier discussion (via email) there was one outstanding issue which needed to be addressed in the previous implementation described over in #30. I think we should definitely take that into account. In short, we need to compare the keys and the options of indices in our reconciliation algorithm.

Those are really the only considerations I would bring up. We definitely want to reiterate in the documentation how important it is that folks consider the impact of index changes. One bad index change could cripple an entire MongoDB cluster at scale (I've seen it happen plenty of times before). But we can knock out the docs once we get the business logic merged into master.

I could use the listIndexes command: https://docs.mongodb.com/manual/reference/command/listIndexes/#dbcmd.listIndexes

With the exception of the collation option, if you create an index with one set of index options and then try to recreate the same index but with different index options, MongoDB will not change the options nor recreate the index. To change the other index options, drop the existing index with db.collection.dropIndex() before running createIndexes with the new options.

So I will get the current indexes, diff them with the ones from the model, drop the ones that already exist and are similar and recreate them.

simoneromano96 commented 3 years ago

Hi, I just wanted to drop in the progress I made so far, it looks all good but it is giving me an error with Mongo v.4.4.0 when I try to create a new index without a name.

The 'name' field is a required property of an index specification

Did wither generate indexes name automatically? If that is the case I would use the key (or keys) of the IndexModel, this should also help me in the diffing.

You can see what I did until now in my fork (I think we should also try to understand how to make some tests, I'm worried about text indexes and when a model has a lot of indexes because listIndexes gives me a cursor and I'm not sure on how I should use it to fetch the other ones).

I await your feedback. (feedback.await?)

thedodd commented 3 years ago

@simoneromano96 you rock! I'll take a look at your fork soon. Feel free to open a Draft PR as well if you would like. Might make discussion a bit easier as well.

The implementation of the sync_model_indexes from 0.8.x is here: https://github.com/thedodd/wither/blob/0.8.0/wither/src/model.rs#L385-L450 That worked really well. The one thing that needed to be updated for that implementation is that the index diffing was too naive. As I mentioned earlier, we need to also compare the index options.

Did wither generate indexes name automatically?

Negative. After reviewing the lease notes for 4.4, it looks like the behavior is still the same as it has been. Perhaps your implementation was passing a null value for the name, which I do believe is illegal. I'll give it an eye once you are able to open a PR, if that is all the same with you.

Also, thank you for hacking on this!

simoneromano96 commented 3 years ago

@simoneromano96 you rock! I'll take a look at your fork soon. Feel free to open a Draft PR as well if you would like. Might make discussion a bit easier as well.

Done, #61

The implementation of the sync_model_indexes from 0.8.x is here: https://github.com/thedodd/wither/blob/0.8.0/wither/src/model.rs#L385-L450 That worked really well. The one thing that needed to be updated for that implementation is that the index diffing was too naive. As I mentioned earlier, we need to also compare the index options.

Did wither generate indexes name automatically?

Negative. After reviewing the lease notes for 4.4, it looks like the behavior is still the same as it has been. Perhaps your implementation was passing a null value for the name, which I do believe is illegal. I'll give it an eye once you are able to open a PR, if that is all the same with you.

Maybe it is only needed for createIndexes raw command?

Also, thank you for hacking on this!

Don't worry, I love coding in Rust and I also like MongoDB so I'm happy to be able to help!

Actually this is my first open-source collaboration, since I became a freelancer I have more time so I want to contribute as much as I can.

simoneromano96 commented 3 years ago

https://docs.mongodb.com/manual/reference/command/createIndexes/

I'm 99% sure that this command requires a name for the index, all the optional parameters have "optional" in the description

thedodd commented 3 years ago

@simoneromano96 yea, you are correct. The algorithm used for automatically generating names on indices when a name is not provided explicitly is pretty simple. The algorithm used is described in the MongoDB docs as well. Long-term, we may want to take a different approach, but if an index defined as part of the #[derive(Model)] attribute macro does not include a name field in the opts, then we can auto generate the name. We shouldn't have to change the macro at all.

Thoughts?

saghm commented 3 years ago

In case this is helpful, the spec that the driver will follow when we implement index management describes generating index names here. The spec also requires that we allow users of the API to specify names for indexes if they don't want the defaults though, so assuming you update this to use our eventual implementation, you'll still be free to use whatever naming scheme you want by just always passing in a value for the name.

simoneromano96 commented 3 years ago

In case this is helpful, the spec that the driver will follow when we implement index management describes generating index names here

Thanks, I had a similar idea for autogenerating the names so this will come in handy!

simoneromano96 commented 3 years ago

@simoneromano96 yea, you are correct. The algorithm used for automatically generating names on indices when a name is not provided explicitly is pretty simple. The algorithm used is described in the MongoDB docs as well. Long-term, we may want to take a different approach, but if an index defined as part of the #[derive(Model)] attribute macro does not include a name field in the opts, then we can auto generate the name. We shouldn't have to change the macro at all.

Thoughts?

I'll finish working on it tomorrow then (It's 23PM now where I live); I will finish the implementation then I saw that other tests are failing so I'll check those out.

I'll try to recreate the same tests as before (v0.8.0) then we could work on the docs.

simoneromano96 commented 3 years ago

Hi, @thedodd what do you think is the viable minimal testing I should do (I'm talking about the automated tests)?

Right now I'm trying to solve a cursor problem (listIndexes gives a cursor so I must iterate over, for now I was just using the firstBatch)

thedodd commented 3 years ago

Hey @simoneromano96, there were some tests for the index syncing mechanism back in the 0.8.x line (and earlier versions as well). I think our test cases might be as simple as:

given a pristine state, the index syncing mechanism should create the expected indexes on the target collections based on the Model's indexes (coming from the #[derive(Model)] system, as that is easiest).
ensure that old indexes are removed when they are no longer defined on a model.
ensure that an old index is updated correctly when one of the indexes options are updated.

simoneromano96 commented 3 years ago

@thedodd I believe the PR is ready to be merged, I made all the tests I could think of

thedodd / wither

Re-enable Model::sync #51