Automattic / mongoose

MongoDB object modeling designed to work in an asynchronous environment.
https://mongoosejs.com
MIT License
26.95k stars 3.84k forks source link

MongooseArray.pull Does Not Work on Subdocuments #3341

Closed c1moore closed 9 years ago

c1moore commented 9 years ago

When using MongooseArray.pull to remove a subdocument from an array using a command such as:

someUser.attending.pull({event_id : x, user_id : y})

Mongoose will remove all subdocuments from the array that either have event_id set to x OR user_id set to y. This is unexpected since MongoDB's $pull would remove only subdocuments that have event_id set to x AND user_id set to y.

If this is a feature, this should be clearly documented.

vkarpov15 commented 9 years ago

That's not how .pull() works. It's really only meant to pull by the document's _id field, but it does fall back to deep equality if neither doc has an _id. However, unless you explicitly make sure the subdocs don't have an _id, the above pull will never remove any docs. For instance, the below script

var assert = require('assert');
var mongoose = require('mongoose');

var Schema = mongoose.Schema;

mongoose.connect('mongodb://localhost:27017/gh3341');
mongoose.set('debug', true);

var UserSchema = new Schema({
  attending: [{ eventId: Number, userId: Number }]
});

var User = mongoose.model('gh3341', UserSchema);

var x = new User({ attending: [{ eventId: 1, userId: 2 }, { eventId: 2, userId: 2 }] });

console.log(x);
x.attending.pull({ eventId: 1, userId: 2 });
console.log(x);
process.exit(0);

outputs

$ node gh-3341.js 
{ _id: 55f1a21444efb8bc488d3b5b,
  attending: 
   [ { eventId: 1, userId: 2, _id: 55f1a21444efb8bc488d3b5d },
     { eventId: 2, userId: 2, _id: 55f1a21444efb8bc488d3b5c } ] }
{ _id: 55f1a21444efb8bc488d3b5b,
  attending: 
   [ { eventId: 1, userId: 2, _id: 55f1a21444efb8bc488d3b5d },
     { eventId: 2, userId: 2, _id: 55f1a21444efb8bc488d3b5c } ] }

Can you please provide additional context so I can reproduce your issue?

c1moore commented 9 years ago

My subdocuments do not explicitly contain an _id field. For instance, I declared the UserSchema as follows:

var ListSchema = new Schema({
    user_id:    {type: mongoose.Schema.Types.ObjectId, ref:'User'},
    event_id:   {type: mongoose.Schema.Types.ObjectId, ref:'Event'}
}, {_id:false});

var UserSchema = new Schema({
    attending : {
        type : [ListSchema]
    }
});

I would also like to point out that the information you just provided about how .pull() is supposed to be used it not available on the documentation at all. For instance, how .pull() behaves with and without an _id field.

vkarpov15 commented 9 years ago
var assert = require('assert');
var mongoose = require('mongoose');

var Schema = mongoose.Schema;

mongoose.connect('mongodb://localhost:27017/gh3341');
mongoose.set('debug', true);

var subSchema = new Schema({ eventId: Number, userId: Number }, { _id: false });

var UserSchema = new Schema({
  attending: [subSchema]
});

var User = mongoose.model('gh3341', UserSchema);

var x = new User({ attending: [{ eventId: 1, userId: 2 }, { eventId: 2, userId: 2 }] });

console.log(x);
x.attending.pull({ eventId: 1, userId: 2 });
console.log(x);
process.exit(0);

still outputs the correct result:

$ node gh-3341.js 
{ _id: 55f1c7a4b1e0131560ee268b,
  attending: [ { eventId: 1, userId: 2 }, { eventId: 2, userId: 2 } ] }
{ _id: 55f1c7a4b1e0131560ee268b,
  attending: [ { eventId: 2, userId: 2 } ] }
c1moore commented 9 years ago

I use actual ObjectIds for user_id and event_id. This may make a difference. It looks like the following code iterates through the array and removes any element that contains an ObjectId equal to any ObjectIds you specified in the query, which is what I observed:

pull: function () {
    var values = [].map.call(arguments, this._cast, this)
      , cur = this._parent.get(this._path)
      , i = cur.length
      , mem;

    while (i--) {
      mem = cur[i];
      if (mem instanceof EmbeddedDocument) {
        if (values.some(function (v) { return v.equals(mem); } )) {
          [].splice.call(cur, i, 1);
        }
      } else if (~cur.indexOf.call(values, mem)) {
        [].splice.call(cur, i, 1);
      }
    }

    if (values[0] instanceof EmbeddedDocument) {
      this._registerAtomic('$pullDocs', values.map( function (v) { return v._id; } ));
    } else {
      this._registerAtomic('$pullAll', values);
    }

    this._markModified();
    return this;
  }
vlapo commented 9 years ago

Hi,

I think this issue is only about not accurate documentation. As @vkarpov15 mentioned MongooseArray#pull works only with array of primitive variables or ObjectId referencies. (but IMHO docs did`t say anything about advanced $pull queries)

And maybe all this is about feature request MongooseDocumentArray#pull(). - a little bit tricky because of complex mongodb $pull queries.

But I noticed this.

var assert = require('assert');
var mongoose = require('mongoose');

var Schema = mongoose.Schema;

mongoose.connect('mongodb://localhost:27017/gh3341');
mongoose.set('debug', true);

var subSchema = new Schema({ eventId: Number, userId: Number }, { _id: false });

var UserSchema = new Schema({
    attending: [subSchema]
});

var User = mongoose.model('gh3341', UserSchema);

var x = new User({ attending: [{ eventId: 1, userId: 2 }, { eventId: 2, userId: 2 }] });

x.save(function(err) {
    if(err) console.error(err);

    console.log(x);
    x.attending.pull({ eventId: 1, userId: 2 });
    console.log(x);

    x.save(function(err) {
        if (err) console.error(err);

        console.log(x);
    });

    process.exit(0);
});

with output:

Mongoose: gh3341.insert({ _id: ObjectId("55f1f713cb8a54d0065985d3"), attending: [ { eventId: 1, userId: 2 }, { eventId: 2, userId: 2 } ], __v: 0 })
{ __v: 0,
  _id: 55f1f713cb8a54d0065985d3,
  attending: [ { eventId: 1, userId: 2 }, { eventId: 2, userId: 2 } ] }
{ __v: 0,
  _id: 55f1f713cb8a54d0065985d3,
  attending: [ { eventId: 2, userId: 2 } ] }

Local mongoose document are changed but there is no request to MongoDB on save(). So there is inconsistency between local document and saved document.

c1moore commented 9 years ago

I was able to recreate my problem using ObjectIds as stated above. The problem with the code submitted by @vlapo is that x is not updated after saving to the database (without retrieving it using a variation of find()), the code does not use ObjectIds, and he is exiting the program before the last callback from save() is executed so the last console.log() is never executed (this is also why "there is inconsistency between local document and saved document"). Here is the code that recreates the problem:

var assert = require('assert');
var mongoose = require('mongoose');

var Schema = mongoose.Schema;

mongoose.connect('mongodb://localhost:27017/gh3341');
mongoose.set('debug', true);

var subSchema = new Schema({
    user_id:    {type: mongoose.Schema.Types.ObjectId},
    event_id:   {type: mongoose.Schema.Types.ObjectId}
}, {_id:false});

var UserSchema = new Schema({
    attending: {
        type : [subSchema]
    }
});

var User = mongoose.model('gh3341', UserSchema);

var event1Id = new mongoose.Types.ObjectId(),
    event2Id = new mongoose.Types.ObjectId(),
    referencedUserId = new mongoose.Types.ObjectId();

var x = new User({
    attending: [
        { event_id: event1Id, user_id: referencedUserId },
        { event_id: event2Id, user_id: referencedUserId }
    ]
});

x.save(function(err) {
    if(err) console.error(err);

    console.log("\n\nBefore .pull(): ");
    console.log(x);

    x.attending.pull({event_id: event1Id, user_id: referencedUserId});

    console.log("\n\nAfter .pull(), before .save(): ");
    console.log(x);

    x.save(function(err) {
        if (err) console.error(err);

        User.findOne({_id : x._id}, function(err, new_x) {
            if(err) console.error(err);

            console.log("\n\nAfter .pull(), after .save(): ");
            console.log(new_x);

            process.exit(0);
        });
    });
});

The output is (notice the last line where attending is empty):

$ node mongooseTest.js
Mongoose: gh3341.insert({ _id: ObjectId("55f211ced24d667a383b4e0c"), attending: [ { event_id: ObjectId("55f211ced24d667a383b4e09"), user_id: ObjectId("55f211ced24d667a383b4e0b") }, { event_id: ObjectId("55f211ced24d667a383b4e0a"), user_id: ObjectId("55f211ced24d667a383b4e0b") } ], __v: 0 }) {}  

Before .pull(): 
{ __v: 0,
  _id: 55f211ced24d667a383b4e0c,
  attending: 
   [ { event_id: 55f211ced24d667a383b4e09,
       user_id: 55f211ced24d667a383b4e0b },
     { event_id: 55f211ced24d667a383b4e0a,
       user_id: 55f211ced24d667a383b4e0b } ] }

After .pull(), before .save(): 
{ __v: 0,
  _id: 55f211ced24d667a383b4e0c,
  attending: 
   [ { event_id: 55f211ced24d667a383b4e0a,
       user_id: 55f211ced24d667a383b4e0b } ] }
Mongoose: gh3341.update({ __v: 0, _id: ObjectId("55f211ced24d667a383b4e0c") }) { '$pull': { attending: { _id: { '$in': [ '\u001b[90mundefined\u001b[39m' ] } } }, '$inc': { __v: 1 } } {} 
Mongoose: gh3341.findOne({ _id: ObjectId("55f211ced24d667a383b4e0c") }) { fields: undefined }  

After .pull(), after .save(): 
{ _id: 55f211ced24d667a383b4e0c, __v: 1, attending: [] }
c1moore commented 9 years ago

I was also able to recreate this using Number. The code is:

var assert = require('assert');
var mongoose = require('mongoose');

var Schema = mongoose.Schema;

mongoose.connect('mongodb://localhost:27017/gh3341');
mongoose.set('debug', true);

var subSchema = new Schema({
    user_id:    {type: Number},
    event_id:   {type: Number}
}, {_id:false});

var UserSchema = new Schema({
    attending: {
        type : [subSchema]
    }
});

var User = mongoose.model('gh3341', UserSchema);

var event1Id = 1,
    event2Id = 2,
    referencedUserId = 3;

var x = new User({
    attending: [
        { event_id: event1Id, user_id: referencedUserId },
        { event_id: event2Id, user_id: referencedUserId }
    ]
});

x.save(function(err) {
    if(err) console.error(err);

    console.log("\n\nBefore .pull(): ");
    console.log(x);

    x.attending.pull({event_id: event1Id, user_id: referencedUserId});

    console.log("\n\nAfter .pull(), before .save(): ");
    console.log(x);

    x.save(function(err) {
        if (err) console.error(err);

        User.findOne({_id : x._id}, function(err, new_x) {
            if(err) console.error(err);

            console.log("\n\nAfter .pull(), after .save(): ");
            console.log(new_x);

            process.exit(0);
        });
    });
});

And the output is:

$ node mongooseTest.js
Mongoose: gh3341.insert({ _id: ObjectId("55f216205fafa8af3c823f2a"), attending: [ { event_id: 1, user_id: 3 }, { event_id: 2, user_id: 3 } ], __v: 0 }) {}  

Before .pull(): 
{ __v: 0,
  _id: 55f216205fafa8af3c823f2a,
  attending: [ { event_id: 1, user_id: 3 }, { event_id: 2, user_id: 3 } ] }

After .pull(), before .save(): 
{ __v: 0,
  _id: 55f216205fafa8af3c823f2a,
  attending: [ { event_id: 2, user_id: 3 } ] }
Mongoose: gh3341.update({ __v: 0, _id: ObjectId("55f216205fafa8af3c823f2a") }) { '$pull': { attending: { _id: { '$in': [ '\u001b[90mundefined\u001b[39m' ] } } }, '$inc': { __v: 1 } } {} 
Mongoose: gh3341.findOne({ _id: ObjectId("55f216205fafa8af3c823f2a") }) { fields: undefined }  

After .pull(), after .save(): 
{ _id: 55f216205fafa8af3c823f2a, __v: 1, attending: [] }
vlapo commented 9 years ago

facepalm you are right, I`m really sorry @c1moore. It was dark night for me. :) But there is reproducing

But I still think this behavior is unexpected because MongooseArray#pull() works only with array of ObjectId or array of primitive variables. And there are two options

c1moore commented 9 years ago

We all have one of those nights now and then. :)

It seems like this is unexpected behavior based on what @vkarpov15 said and based on the results he was showing (which seemed to be what he expected). He showed that the .pull() only pulled subdocuments that had BOTH fields set to the queried values, but my results above show that the database removes any subdocuments that have any field set to a queried value.

c1moore commented 9 years ago

Correction, it seems it pulls any subdocuments:

var assert = require('assert');
var mongoose = require('mongoose');

var Schema = mongoose.Schema;

mongoose.connect('mongodb://localhost:27017/gh3341');
mongoose.set('debug', true);

var subSchema = new Schema({
    user_id:    {type: Number},
    event_id:   {type: Number}
}, {_id:false});

var UserSchema = new Schema({
    attending: {
        type : [subSchema]
    }
});

var User = mongoose.model('gh3341', UserSchema);

var event1Id = 1,
    event2Id = 2,
    referencedUserId = 3;

var x = new User({
    attending: [
        { event_id: event1Id, user_id: referencedUserId },
        { event_id: event2Id, user_id: referencedUserId },
        { event_id: 5, user_id: 6}
    ]
});

x.save(function(err) {
    if(err) console.error(err);

    console.log("\n\nBefore .pull(): ");
    console.log(x);

    x.attending.pull({event_id: event1Id, user_id: referencedUserId});

    console.log("\n\nAfter .pull(), before .save(): ");
    console.log(x);

    x.save(function(err) {
        if (err) console.error(err);

        User.findOne({_id : x._id}, function(err, new_x) {
            if(err) console.error(err);

            console.log("\n\nAfter .pull(), after .save(): ");
            console.log(new_x);

            process.exit(0);
        });
    });
});

And this is the output:

$ node mongooseTest.js
Mongoose: gh3341.insert({ _id: ObjectId("55f217d49d9dd3d73eec8377"), attending: [ { event_id: 1, user_id: 3 }, { event_id: 2, user_id: 3 }, { event_id: 5, user_id: 6 } ], __v: 0 }) {}  

Before .pull(): 
{ __v: 0,
  _id: 55f217d49d9dd3d73eec8377,
  attending: 
   [ { event_id: 1, user_id: 3 },
     { event_id: 2, user_id: 3 },
     { event_id: 5, user_id: 6 } ] }

After .pull(), before .save(): 
{ __v: 0,
  _id: 55f217d49d9dd3d73eec8377,
  attending: [ { event_id: 2, user_id: 3 }, { event_id: 5, user_id: 6 } ] }
Mongoose: gh3341.update({ __v: 0, _id: ObjectId("55f217d49d9dd3d73eec8377") }) { '$pull': { attending: { _id: { '$in': [ '\u001b[90mundefined\u001b[39m' ] } } }, '$inc': { __v: 1 } } {} 
Mongoose: gh3341.findOne({ _id: ObjectId("55f217d49d9dd3d73eec8377") }) { fields: undefined }  

After .pull(), after .save(): 
{ _id: 55f217d49d9dd3d73eec8377, __v: 1, attending: [] }
c1moore commented 9 years ago

Even without the last comment showing MongoDB pulls all the documents, I would consider the behavior of .pull() on a subdocument without an _id field buggy. I believe this to be the case since what is happening in Mongoose is what I would expect (pull all subdocuments with ALL fields that equal the query), but MongoDB pulls different subdocuments (in this case all subdocuments).

That is, x showed the expected result in the section After .pull(), before .save() where it pulled the only subdocument that had event_id set to event1Id AND user_id set to referencedUserId, but MongoDB did something completely different.

govindrai commented 7 years ago

@vkarpov15 Just trying to understand the conclusion of this conversation: with your last commit to the test file, did the issue get resolved, as in did new code get written to solve the issue, or did your new test case prove the method was indeed working correctly. It seems the latter to me but I'm pretty new to reading issue pages and there was no reply after @c1moore last's response. Thanks!

vkarpov15 commented 7 years ago

There was new code added to solve the issue. Before, mongoose would not handle pull() correctly on arrays with documents that don't have an _id. The fix for this issue was a fallback to a deep equality check (which mongodb does internally, not mongoose) when there's no _id.

yegor211 commented 2 years ago

@vkarpov15 I'd like to get a follow-up on this.

Do I understand right that MongooseArray.pull() is a simplified version of the $pull query? For me personally (based on docs and a lot of digging through the Google) it's quite unclear of what I can and cannot do with .pull()

As of 2022, it still seems to rely on _id's, if you get rid of _id's there's a bit more you can do, but still not sure if you should really use that anywhere besides primitives.

yegor211 commented 2 years ago

I must clarify, that I mainly mean working with subdocuments*

vkarpov15 commented 2 years ago

@yegor211 not exactly. MongooseArray pull() lets you remove subdocuments by _id. pull() then records a $pull operation that Mongoose will send to MongoDB the next time you call save(). Does that help?