Closed aichholzer closed 5 years ago
This is by design. Mongoose doesn't do any extra copying; the way that populate works is that we do an extra query, and then assign the documents you're populating to the result of that query. We don't make copies of the returned docs, so if you have one populate with multiple docs pointing to the same id, you'll get the same reference. There are a few workarounds, for instance, just check if _id is defined.
Open to discussion about changing this behavior. Reason against it is mostly performance.
@vkarpov15 thank you for your reply, I figured as much for performance reasons. However, I think this should be optional and left to the developer, after all he is the one who needs to make decisions based on his actual needs. For instance, I would like to suggest something like this:
.populate([
{
path: 'user',
select: 'username about'
}
], { copy: true })
where copy
defaults to false
. This would make it possible to ask Mongoose to explicitly do the extra copying instead of using a reference.
By the way; just checking if _id
is defined does not really solve the problem in case the object is undergoing a more complex set of transformations, at least in my case. Plus, the first modified object would have to be kept around so it can be copied in case _id
does no longer exist in the subsequent objects.
It's definitely possible to add the 'copy' option. Will put it in for next release, but would welcome pull requests :)
šš¼
@vkarpov15 are you thinking just to add an option to 'copy' but leave the default behavior? I hope you do leave the default behavior.
It's an important aspect of any ODM that references are the same document when working with multiple populated graphs of documents. Performance alone is not the only reason for this behavior.
That's a reasonable argument. I think this might be easier to add as a plugin, but there's no plans to change the default in the foreseeable future.
Hello everyone, sorry if I'm bringing this issue back to life after so long but I've stumbled on this behavior (on v5.1.5)
I have a populated doc with a date, that must be formatted according to a timezone in the parent element. So for example: doc A (timezone: Europe/Rome), subdoc X (date: 2018-10-04T15:00:00.000Z) doc B (timezone: America/New_York), subdoc X (date: 2018-10-04T15:00:00.000Z)
It's not possible to solve this by simply checking if the date is already formatted, since I really want formatted_date to be different on each subdoc... of course I can simply assign formatted_date to the parent but the field would be in a "wrong" position inside the object! So the only way is to copy the subdoc before operating on it.
Are there plans to introduce the { copy } option? I guess I understand what Jeff says but if the default remains { copy: false } there should be no differences, am I missing something? Is a PR welcome or it's already decided that this functionality should go in a plugin?
@andreafdaf I'm not sure I understand your issue, can you elaborate with code samples? Dates in MongoDB are just numbers, JavaScript is responsible for all date formatting, so you should be able to handle date formatting with a virtual
@vkarpov15 Sorry, my example wasn't really clear... talking about dates and timezones was just a specific use case, let me illustrate with some pseudojs:
// models
const Event = new Schema({
start: { type: Date }
})
const User = new Schema({
local_timezone: { type: String }
})
const Reservation = new Schema({
user: { type: Schema.ObjectId, ref: 'User' },
event: { type: Schema.ObjectId, ref: 'Event' },
})
Suppose we have:
Bob and Alice have a reservation for Conference Call, so let's say we want to send them an email with a reminder telling when the meeting is going to take place:
const reservations = await Reservation.find(...).populate('user event').exec()
for (const reservation of reservations) {
const { user } = reservation
const { event } = reservation // this is always the same object since it's a reference
event.formatted_date = moment.tz(event.start, user.local_timezone).format()
await sendEmail(reservation)
}
In this case event.formatted_date will be the same in both emails, to fix this we must:
const reservations = await Reservation.find(...).populate('user event').exec()
for (const reservation of reservations) {
const { user } = reservation
let { event } = reservation // this is always the same object since it's a reference
event = JSON.parse(JSON.stringify(event))
event.formatted_date = moment.tz(event.start, user.local_timezone).format()
reservation.event = event
await sendEmail(reservation)
}
This is a simple example, but geez we had some bugs already caused by ignoring that populated objects were refs and not distinct copies of the same data
As pointed out by the OP, it would be nicer if we could do this:
const reservations = await Reservation.find(...).populate('user event', { copy: true }).exec()
for (const reservation of reservations) {
const { user } = reservation
const { event } = reservation // this is not the same object for each reservation
event.formatted_date = moment.tz(event.start, user.local_timezone).format()
await sendEmail(reservation)
}
or
const Reservation = new Schema({
user: { type: Schema.ObjectId, ref: 'User' },
event: { type: Schema.ObjectId, ref: 'Event', options: { copy: true } },
})
I hope it's a little bit clearer this way :)
It's not big deal, sure... we could simply assign formatted_date to reservation and we're done without changing anything, but maybe some parts of the code want formatted_date to be under event; a new dev might ignore this fact and trigger bugs anew until it understands where the issue comes from; during refactor those extra object copying parts might be seen as "overcomplication" and thrown away if not properly commented...
I know those are not really good arguments, the answer for all of them is "code better" but you know, bad coding happens and having a way to consistently prevent errors from happening is always nice
As always, thanks for your time! :v:
Thanks for the detailed explanation @andreafdaf , that was very helpful. I have a much better understanding of the why of this issue now
It's always a pleasure to contribute @vkarpov15
I'm open to keep watching/discussing the implementation if you like to have a couple of eyes more to look at it. I would PR something but I don't think I have enough knowledge of mongoose internals to do a good job :cry:
I'm facing with issue while using s3 bucket , what is best practice I was doing chat app this app contain 3 collection Chats,Messages,Users and User I find messages at group and of possible to same user's message in group in this case when populate like that I'm having trouble getting user avatars signed proccess this is my query.
const messages = await Message.find({ chat: chatId })
.sort({ createdAt: 1 })
.populate("sender")
.select("-password")
.exec();
this is what i do . setSignedUrl is my utils its just using getSignedUrl function of @aws-sdk/s3-request-presigner
packages under the hood . For example we have 2 message from same user then first execute of for both of avatar signed and setted , at second execute this time key broken because we try sign to signed url , and as far as I know, and according to what is written here, it is impossible to tell if the url is signed or not "The pre-signed URL is simply a local calculation and signing of a URL. There is no interaction with the S3 service"
for (let message of messages) {
message.sender.avatar = await setSignedUrl(message.sender.avatar);
}
after that i did this for temporary solution but this is kind weird not feeling like right solution.
let clonedMessages = JSON.parse(JSON.stringify(messages));
for (let message of clonedMessages) {
message.sender.avatar = await setSignedUrl(message.sender.avatar);
}
I think should be some option for this
Consider this schema:
I do a
Model.find()
and chain the.populate()
method. I am populating theuser
:I receive what I expect:
Once I have the result, I apply a
.map
function to the array to change theuser
:and this
.map()
fails. The first time it goes through thedocuments
array, it successfully creates theuserId
property. The second time it saysdocument.user._id
is undefined. In fact, if I print each of the array's objects, the second iteration produces this:For some mysterious reason, it seems the
.populate()
method, when it populates documents that are the same, uses a single reference. For the time being, it's possible to solve like this (which seems like an ugly hack to me):Any ideas?