Automattic / mongoose

MongoDB object modeling designed to work in an asynchronous environment.
https://mongoosejs.com
MIT License
26.89k stars 3.83k forks source link

Aggregate data in Populate Virtual #8344

Open m-weeks opened 4 years ago

m-weeks commented 4 years ago

Do you want to request a feature or report a bug? feature

What is the current behavior? Populate virtuals only allow for a simple query with a few conditions, but not for an aggregate query

What is the expected behavior? It would be great to be able to fetch aggregate data in a populate virtual. For example, if I have a Hotel schema:

var HotelSchema = new Schema({
  name: {
    type: String,
    required: true,
  }
});

And a hotel has many rooms:

var RoomSchema = new Schema({
  roomNumber: {
    type: String,
    required: true,
  },
  price: {
    type: Number,
    required: true,
  }
});

If I wanted to get the lowest priced room for a hotel (As you might often see on a booking website: from $109.99/night), I would need to do an aggregate query on Hotel model:

Hotel.aggregate([
  {
    $lookup: { /* room stuff */ }
  },
  {
    $addFields: {
      minPrice: { $min: 'rooms.price' },
    },
  },
]);

But the problem with this is you would lose a lot of nice mongoose functionality for Hotels, just because you want to add the single field (ex: no virtuals, no populating other collections without using $lookup).

So Ideally it would be nice if you could define this aggregate calculation somehow in a virtual, and simply call Hotel.find({}).populate('minPrice')

What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version. Node: 10.16.0 Mongoose: 5.7.11 MongoDB: 4.0.6

vkarpov15 commented 4 years ago

Why don't you just do Hotel.find().populate('rooms') and then have a virtual minPrice that loops through all the rooms and finds the minimum price? Once you have all the rooms, calculating minPrice in Node.js is an easy synchronous operation, no need for aggregation.

m-weeks commented 4 years ago

This was just a simple example, I could see people having some more complex calculations, no?

I'm fairly new to MongoDB, how would the performance compare looping through all the rooms compared the the aggregate calculation (especially if doing thousands+ of hotels, and hundreds of rooms)? Coming from SQL it's habit to always do calculations on the database side if possible

vkarpov15 commented 4 years ago

In my experience I prefer the opposite - calculations in the app rather than in the database. Stateless apps are easier to scale horizontally than a database.

We'll keep this open for the future, but for your case I would recommend using a virtual

m-weeks commented 4 years ago

@vkarpov15 Thanks! This week I tried it out and in my case it seems to have saved a bit of time

roopakv commented 6 months ago

@vkarpov15 in more complex cases it would be nice to have virtuals in populated objects from an aggregate pipeline.

We are going through an aggregate pipeline for a collection say A. We need to sort A based on a field in B where A.bType has an _id of B.

In this case aggregations are the best way to get data with sorting (unless i am mistaken).