SierraSoftworks / Iridium

A high performance MongoDB ORM for Node.js
http://sierrasoftworks.github.io/Iridium/
Other
569 stars 25 forks source link

Index Question/Odd Behavior #123

Closed CatGuardian closed 6 years ago

CatGuardian commented 6 years ago

Hi Benjamin,

I mainly have a question about how to use indices with Iridium.

So going off of your main documentation I see that you used an @Index and I thought to myself: that is cool, we can just specify our indices on our Instance declaration and Iridium will take care of making sure they are set on the database.

However I just discovered that is not quite the case. It seems like using the decorator only declares that the index is there. And if I want to actually add the indices to the database, I have to do a manual call on the Model: UserModel.ensureIndexes()

Was it your intention to force the user to make this call manually?

I had expected based on Main documentation--and further convinced by the UserModel example https://github.com/SierraSoftworks/Iridium/blob/master/example/UserModel.ts, that Iridium would have automatically created the indices during the Core.connect() method.

So that way before the Core is actually connected, all of the indices I declared would be active. One way we could accomplish this is to have the Core class have a list of functions that need to be called after the MongoDB.MongoClient is successfully connected to. We could call it onConnectActions or something. And then we make it public (or have a addOnConnectAction(...) method). That way the base Model class's constructor could do this this._core.addOnConnectAction(() => this.ensureIndex()); // I think the '() => this.ensureIndex()' is needed to bind Model's this to the ensureIndex method. And then the first time Core.connect() is called, it will make sure all of the onConnectActions are performed before resolving.

The onConnectActions is good as a generic function list because then it can be used in the future to make Iridium that much more powerful by enabling Iridium to wrap more functionality for the users of Iridium.

One more thing to consider but I don't know if we would want to support it or not: if the Model is constructed (new MyModel(...)) after the Core is already connected, then the indices declared for that custom Model wouldn't be executed. So we could have the base Model class have an init(): Promise<...> method which would reject if the Core isn't connected. The init method would then add the indices, if they weren't already added by the Core during its connect If we introduced the init method, then we would want to update all of the Model methods/accessors that requires a connection to the database to reject if init hasn't been called yet. Then that will tell the programmer that you forgot to call init, which they would easily fix by calling init.

While I was typing the last idea, I just realized we wouldn't have to modify Core at all if we just introduced the init method to the Model and require it be called before any access to the database is done. Then the init method could be responsible for handling the creation of the indices on the database.

That turned out to be a longer post than I meant it to be. In summary, I asked a couple questions in the beginning so please respond to those. And then I talked about a proposal to make index declaring as easy as using the decorator.

notheotherben commented 6 years ago

Right, so to answer your first question ("was it intentional? (and if so, why?)") the answer is "Yes". The intention was to make it easy to declare indices as part of your models but to still leave it up to the user to explicitly decide when they were created. The reasoning behind this is driven primarily out of experience running large database systems which are performance sensitive - adding a new index to the live service during peak hours (as a result of a developer rolling out a new version) and not having control over this can lead to some major problems. So rather than having that behaviour as opt-out (and risking people making that mistake) I decided to make it opt-in so that they would need to make the conscious decision to do so (and hopefully think about the ramifications).

As for your second, your suggestion for an easy way to automatically create them for the other 99% of use cases is exactly why the onConnected hook exists on the Iridium.Core. Rather than having a list of functions to be executed on connection, it is left up to the developer to call them in whatever order they see fit (and with whatever parallelism they wish). This gives a bit more control over exactly how the process happens and even allows you to defer execution of some tasks (such as only running index creation in off-peak times or during maintenance windows).

class MyCore extends Iridium.Core {
  MyModel = new Iridium.Model<MyDoc, MyModel>(this, MyModel)
  async onConnected() {
    await MyModel.ensureIndexes()
  }
}

I hope that helps address the issue you're running into, you'll also find an example of this in the new README file here.

CatGuardian commented 6 years ago

Hmmmm, I never thought about that. You are right about that though. Wouldn't want indexes declared during peak hours.

How would you recommend doing indexes then? If I would want to write a mongoDB script and execute that manually, what would be the benefit of declaring the index in my code base using @Index?

CatGuardian commented 6 years ago

Also, thanks for the point to the new README file, that is awesome. I'll have to read that because as a new user of Iridium that will probably answer a lot of my questions.

notheotherben commented 6 years ago

So the approach I've usually taken with a number of my larger systems is having an Operations API that allows you to conduct various operational tasks (from cleanly restarting a single process all the way up to having your services respond with a 500 Server Error for all requests should you need to take your service offline at short notice). One of those tasks would be updating your DB indexes and you would call that API method at the correct time (ideally automatically through something like Ansible or a Kubernetes CronJob depending on your environment and process).

Granted, a lot of that is overkill for smaller projects, so my suggested "best practice" for most people is simply using the onCreated() hook to create your indexes, with the understanding that if you do grow to a point where doing so poses a risk to your performance then you will need to look at moving it elsewhere - but at least it's nice and obvious.

No problem at all, out of data "user" documentation (as opposed to API documentation) has been a sore point on this project for a while, so I'm trying to address that with the 8.x release branch to improve the onboarding experience. I actually think it would be more useful to invert the current README structure (showing how the final thing is used first and then building up from there rather than the other way around) and there's certainly room for some more in-depth topics of discussion (like plugins, property schemas, working with binary data, the various hooks and when they are triggered etc.).

CatGuardian commented 6 years ago

Sure, I think you should have something up front which is really easy to read. Not sure what that means to you or how to do that.

But then what I would really be interested in, is "user" documentation by feature. And the very first thing at the top of the README should be a Table of Contents with anchor links. What do I mean by 'documentation by feature'? I mean like, Using Indexes would be an entry in the table of contents which would anchor down to the Using Indexes section which would get in to all of the details including the decorator, what the decorator is doing (its not actually putting indexes on the database, and how to actually make the indexes get created on the database (talk about ensureIndexes, that it can be used in an onConnect for the Core, and that to be careful about using the onConnect due to the peak hours concern thing. Or other things like that.

Other 'Features' I would like (off the top of my head, meaning not an all inclusive list): 'Using Transforms', 'Using Validators'. 'Nesting Documents/Objects', 'Using Indexes', 'Connecting to Iridium with Express'.

Hope that helps some. It does sound like a lot of work though.

By the way, you can close this issue whenever you would like.

notheotherben commented 6 years ago

Yeah, navigatability is a big factor and while a table of contents is one solution, I'm actually strongly considering putting together some form of website which tries to better segment that information and allows for more content to be provided without making the document unmaintainable. The previous README file ended up growing to the point where making modifications was far more challenging than it needed to be and I'd really like to avoid that happening again.

Thanks for the list of topics you'd like to see, I'll definitely keep those in mind when putting together the docs. It certainly will be a lot of work, but I'm sure it'll be worth it.