deepkit / deepkit-framework

A new full-featured and high-performance TypeScript framework
https://deepkit.io/
MIT License

Add a simplified version of join that's executed on the client side (mimics .populate in mongoose) #84

Open Rush opened 3 years ago

Rush commented 3 years ago

Rationale: At least for mongoose it appears that executing simple queries such as "fetch me all resources of author X and populate author X in the results" actually takes longer to run on the MongoDB side than executing two separate queries and performing the join client-side.

Mongoose: Find with Populate: 50.675ms
Mongoose: Find with Populate (lean): 11.296ms
Mongoose: Find with aggregate: 22.23ms
DeepKit: Find with joinWith: 19.082ms
DeepKit: Find simulating mongoose strategy: 4.241ms

I ran these benchmarks with a query fetching 136 entities, each with an author field that references the users collection.

Compared with joinWith, the client-side strategy is roughly 4.5x faster here (19.082 ms vs. 4.241 ms).

Some example use cases can be found here: https://mongoosejs.com/docs/populate.html
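
To make the "two separate queries plus a client-side join" strategy concrete, here is a minimal sketch. It is not Deepkit or Mongoose code: the `User` and `Resource` shapes, the in-memory arrays, and the functions `findResourcesByAuthor`, `findUsersByIds`, `joinAuthors` and `findWithPopulate` are all hypothetical stand-ins, with the two async finders playing the role of the two database round trips.

```typescript
// Hypothetical document shapes; not taken from Deepkit or Mongoose.
interface User { _id: string; name: string }
interface Resource { _id: string; title: string; author: string }
interface PopulatedResource { _id: string; title: string; author: User | undefined }

// In-memory stand-ins for the "users" and "resources" collections.
const users: User[] = [
  { _id: 'u1', name: 'Ada' },
  { _id: 'u2', name: 'Linus' },
];
const resources: Resource[] = [
  { _id: 'r1', title: 'First', author: 'u1' },
  { _id: 'r2', title: 'Second', author: 'u1' },
  { _id: 'r3', title: 'Third', author: 'u2' },
];

// Query 1 (assumed): fetch the resources of one author.
async function findResourcesByAuthor(authorId: string): Promise<Resource[]> {
  return resources.filter(r => r.author === authorId);
}

// Query 2 (assumed): fetch users by id, the equivalent of an `$in` filter.
async function findUsersByIds(ids: string[]): Promise<User[]> {
  return users.filter(u => ids.indexOf(u._id) !== -1);
}

// The client-side join: index the users by id, then replace each reference.
function joinAuthors(rows: Resource[], authors: User[]): PopulatedResource[] {
  const byId = new Map<string, User>();
  for (const user of authors) byId.set(user._id, user);
  return rows.map(row => ({ ...row, author: byId.get(row.author) }));
}

async function findWithPopulate(authorId: string): Promise<PopulatedResource[]> {
  const rows = await findResourcesByAuthor(authorId);
  // Deduplicate ids so each referenced user is requested only once.
  const ids = Array.from(new Set(rows.map(row => row.author)));
  const authors = await findUsersByIds(ids);
  return joinAuthors(rows, authors);
}
```

Deduplicating the ids before the second query matters for the case described above: when all 136 entities share a handful of authors, the second query only has to return those few users.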

marcj commented 3 years ago

As @Rush commented in Slack, along with this .populate() feature we could also add support for unnormalized relations, e.g. @t.array(user).reference() users: User[], which is a typical use case in MongoDB and would be supported for all databases by executing two queries.
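
A rough sketch of how an array of references could be resolved with two queries follows. Nothing here is Deepkit API: the `Team` and `PopulatedTeam` shapes and the helpers `collectUserIds` and `attachUsers` are hypothetical, and the second query is assumed to be a plain "find users whose id is in this list" that any database can run.

```typescript
// Hypothetical shapes for an unnormalized relation: a team stores user ids inline.
interface User { _id: string; name: string }
interface Team { _id: string; users: string[] }
interface PopulatedTeam { _id: string; users: User[] }

// Step 1: after the first query returned the teams, gather every distinct user id.
function collectUserIds(teams: Team[]): string[] {
  const ids = new Set<string>();
  for (const team of teams) {
    for (const id of team.users) ids.add(id);
  }
  return Array.from(ids);
}

// Step 2: after the second query returned the users, swap ids for documents.
// The original order of each array is kept; dangling references are dropped.
function attachUsers(teams: Team[], loaded: User[]): PopulatedTeam[] {
  const byId = new Map<string, User>();
  for (const user of loaded) byId.set(user._id, user);
  return teams.map(team => ({
    _id: team._id,
    users: team.users
      .map(id => byId.get(id))
      .filter((user): user is User => user !== undefined),
  }));
}
```

Whether dangling references should be dropped, kept as raw ids, or reported as errors is a design decision this sketch does not settle.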

Rush commented 3 years ago

A stretch goal would be to support some more complex cases of populate. I was able to find a few such use cases in my codebase.

  1. Populate references that are deeply nested in arrays or objects.
    const item = await entity.populate('collaborators.user').execPopulate();

    the mongoose schema for Entity is:

    type: [{
      user: {
        type: SchemaTypes.ObjectId,
        ref: 'User',
      },
      accessInfo: {
        type: Number,
      },
    }],

    The plain data representation is:

    {
      _id: '60010fcc50d08d8d08857cfd',
      collaborators: [
        { accessInfo: 0, user: '60010fbc50d08d8d08857cfb' },
        { accessInfo: 0, user: '60010fc250d08d8d08857cfc' },
      ],
    }

  2. Deep populate that can recursively execute after the first level of population completes.

    const memberOfTeams = await TeamMembersDB.find({ ...query }).populate({
      path: 'team',
      select: fieldsFromTeamOwner,
      populate: {
        path: 'owner',
        select: fieldsFromTeamOwner,
      },
    });

  3. Run populate on an existing object.

    const entity = await EntityModel.findOne({ _id: '60010fcc50d08d8d08857cfd' }); // finds entity
    await entity.populate('author');

Perhaps in Deepkit, this could work like this:

const entity = await db.query(Entity).filter({ _id: '5f924ea504ec0f1e75e669cc', }).findOne();
// run some business logic. Business logic may conditionally require 'author' to be present
await db.populate(entity, 'author'); // on demand "join" :)
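
For illustration, an on-demand populate over a dotted path such as 'collaborators.user' could be sketched roughly as below. This is not the proposed db.populate implementation: `Doc`, `Slot`, `collectSlots`, `populate` and the `loadByIds` callback are hypothetical names, references are assumed to be stored as string ids, and the case where the last path segment is itself an array of ids is not handled.

```typescript
// Loose document type for the sketch; a real implementation would use entity types.
type Doc = Record<string, any>;

// A place in the object graph that currently holds a reference id.
interface Slot { parent: Doc; key: string }

// Walk all but the last path segment, fanning out over arrays, and return
// every object whose final key still holds an unresolved (string) id.
function collectSlots(root: Doc, path: string): Slot[] {
  const segments = path.split('.');
  const last = segments.pop() as string;
  let current: Doc[] = [root];
  for (const segment of segments) {
    const next: Doc[] = [];
    for (const node of current) {
      const value = node[segment];
      if (Array.isArray(value)) next.push(...value);
      else if (value && typeof value === 'object') next.push(value);
    }
    current = next;
  }
  return current
    .filter(node => node && typeof node[last] === 'string')
    .map(parent => ({ parent, key: last }));
}

// Resolve all references behind `path` with a single batched lookup and
// write the loaded documents back into the existing object, in place.
async function populate(
  root: Doc,
  path: string,
  loadByIds: (ids: string[]) => Promise<Doc[]>, // assumed "find by id list" query
): Promise<void> {
  const slots = collectSlots(root, path);
  const ids = Array.from(new Set(slots.map(slot => slot.parent[slot.key] as string)));
  if (ids.length === 0) return; // nothing to resolve, skip the query
  const byId = new Map<string, Doc>();
  for (const doc of await loadByIds(ids)) byId.set(doc._id, doc);
  for (const { parent, key } of slots) {
    const found = byId.get(parent[key]);
    if (found) parent[key] = found; // unresolved ids are left untouched
  }
}
```

Because already-populated slots no longer hold a string, calling `populate` twice with the same path issues no second query. Deep populate (use case 2) could then be expressed as a second call with a longer path, e.g. 'team' followed by 'team.owner'.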

theodorDiaconu commented 3 years ago

@Rush this is basically what I have just done. I managed to integrate @deepkit/bson and @deepkit/type into a very thin query layer that works exactly like Mongoose populate.

The results were very good. I didn't get the 5x, but I think that depends on how many documents you fetch: the more documents, the bigger the gain. We got an "almost-free" 40% speed increase, and we were already 3x faster than Mongoose.

Kudos to @marcj for these amazing packages. It's still hard to believe it can be this fast. If you are interested in leveraging nova in deepkit/mongo, I'll gladly help.