Automattic / mongoose

MongoDB object modeling designed to work in an asynchronous environment.
https://mongoosejs.com
MIT License

Bug: Auto generated _id for nested array #11499

Closed MarcusElevait closed 2 years ago

MarcusElevait commented 2 years ago

Do you want to request a feature or report a bug?

Bug

What is the current behavior?

We have the following schema defined:

import { Schema } from 'mongoose';
import { MongoDbCollections } from '../../mongodb-collections.enum';

const options = {
    collection: MongoDbCollections.ValidationShape,
    toJSON: { virtuals: true },
    toObject: { virtuals: true },
    versionKey: false,
    strict: false,
};

const ValidationShapeSchema = new Schema(
    {
        '@id': String,
        '@type': String,
        '@context': String,
        relatedCreativeWork: String,
        'rdfs:comment': String,
        'rdfs:label': String,
        'sh:targetClass': {
            '@id': String,
        },
        'sh:path': {
            type: {
                '@list': [
                    {
                        '@id': String,
                    },
                ],
            },
        },
        'sh:qualifiedMinCount': Number,
        'sh:qualifiedMaxCount': Number,
        'sh:qualifiedValueShape': {},
        'sh:sparql': {
            type: {
                'sh:message': String,
                'sh:select': String,
                'sh:prefixes': {
                    'sh:declare': [
                        {
                            'sh:prefix': String,
                            'sh:namespace': {
                                '@type': String,
                                '@value': String,
                            },
                        },
                    ],
                },
            },
        },
        'sh:rule': {
            type: {
                '@type': String,
                'sh:order': Number,
                'sh:condition': {
                    '@id': String,
                    '@type': String,
                    'sh:targetClass': { '@id': String },
                    'sh:and': {
                        '@list': Array,
                    },
                    'sh:or': {
                        '@list': Array,
                    },
                },
                'sh:construct': String,
                'sh:prefixes': Object,
            },
        },
    },
    options,
);

export { ValidationShapeSchema };

Recently, when calling bulkWrite with multiple updateOne operations (with the upsert option enabled), an _id is automatically generated for the sh:path.@list object and for the objects inside the @list array. I can't say exactly which version update introduced this, but it's definitely new behavior.
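For readers unfamiliar with this call pattern, here is a minimal sketch of what such a bulkWrite payload might look like. The helper name `buildUpsertOps`, the filter on `'@id'`, the sample values, and the model name `ValidationShapeModel` are assumptions for illustration and are not taken from the reporter's code.

```javascript
// Hypothetical helper: builds the operations array passed to Model.bulkWrite().
// Filtering on '@id' and using $set are assumptions, not the reporter's actual code.
function buildUpsertOps(shapes) {
  return shapes.map((shape) => ({
    updateOne: {
      filter: { '@id': shape['@id'] },
      update: { $set: shape },
      upsert: true, // insert the document if no match is found
    },
  }));
}

const ops = buildUpsertOps([
  { '@id': 'shape-1', 'sh:path': { '@list': [{ '@id': '12345' }] } },
  { '@id': 'shape-2', 'sh:path': { '@list': [{ '@id': '67890' }] } },
]);

// With a connected model (hypothetical name), the call would be:
// await ValidationShapeModel.bulkWrite(ops);
```

Each element of `@list` in such an update is cast against the schema's document array, which is where the subdocument `_id` default comes into play.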

tsconfig

{
    "extends": "../../tsconfig.base.json",
    "compilerOptions": {
        "types": ["node", "jest"],
        "emitDecoratorMetadata": true,
        "target": "ESNext"
    },
    "include": [],
    "files": [],
    "references": [
        {
            "path": "./tsconfig.app.json"
        },
        {
            "path": "./tsconfig.compodoc.json"
        },
        {
            "path": "./tsconfig.spec.json"
        }
    ]
}

What is the expected behavior?

Expected behavior would be that there is no _id generated, like it was before.

What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.

mongoose 6.1.8, mongodb 4.2.2, Node 16.14.0

Also updated mongoose to 6.2.4, but this didn't change the behavior.

IslandRhythms commented 2 years ago

const mongoose = require('mongoose');

const options = {
    toJSON: { virtuals: true },
    toObject: { virtuals: true },
    versionKey: false,
    strict: false,
};

const testSchema = new mongoose.Schema({
    'sh:path': {
        type: {
            '@list': [
                {
                    '@id': String
                }
            ]
        }
    }
}, options);

const Test = mongoose.model('Test', testSchema);

async function run() {
    await mongoose.connect('mongodb://localhost:27017');
    await mongoose.connection.dropDatabase();

    await Test.create({
        'sh:path': {'@list': [{'@id': '12345'}]}
    });
    const entry = await Test.findOne();
    console.log(entry['sh:path']['@list'][0]);
}

run();

vkarpov15 commented 2 years ago

This seems like expected behavior: Mongoose adds an _id to subdocuments by default. You can disable it as shown below:

        'sh:path': {
            type: {
                '@list': [
                    new Schema({
                        '@id': String,
                    }, { _id: false }), // <-- disable `_id`
                ],
            },
        },

MarcusElevait commented 2 years ago

Okay, thanks. But it's a newly added behavior, right? We have had this schema for more than two years, and the _id was never added before.

And why does it only apply to subdocuments inside an array? As you can see, our schema also contains, for example, this field:

'sh:targetClass': {
            '@id': String,
        },

And no _id is created for this one.

vkarpov15 commented 2 years ago

@MarcusElevait no, that is not a newly added behavior. Mongoose has added _id by default to new schemas for over a decade.

sh:targetClass is a nested object, not a subdocument. While nested objects and subdocuments look the same in MongoDB, there are a few small differences for things like validator context, hooks, and adding _id by default.

MarcusElevait commented 2 years ago

@vkarpov15 Thanks for the explanation. It's good to know about this difference and when it applies.