Automattic / mongoose

MongoDB object modeling designed to work in an asynchronous environment.
https://mongoosejs.com
MIT License
26.88k stars 3.83k forks

Add option to queue operations until `autoCreate` and `autoIndex` finish #11916

Open satasuk03 opened 2 years ago

satasuk03 commented 2 years ago

I have a problem when I switch databases before creating a timeseries collection:
Mongoose creates a regular collection instead of a timeseries collection.

Here's the code

// db.ts
export const useDB = (guildId: string) =>
  mongoose.connection.useDb(guildId, {
    // ensures connections to the same databases are cached
    useCache: true,
    // remove event listeners from the main connection
    noListener: true,
  });

// =====================================

// schema file
import { useDB } from 'db';
import mongoose, { Schema } from 'mongoose';

export interface ActiveUserDoc {
  timestamp: Date;
  messages: number;
  metadata: { discordId: string };
}

const ActiveUserSchema = new Schema(
  {
    timestamp: Date,
    messages: Number,
    metadata: {
      discordId: String,
    },
  },
  {
    timeseries: {
      timeField: 'timestamp',
      metaField: 'metadata',
      granularity: 'hours',
    },
    expireAfterSeconds: 60 * 24 * 7, // 10080 seconds (2.8 hours); 7 days would be 60 * 60 * 24 * 7
  },
);

export const getModel = (guildId: string) =>
  useDB(guildId).model<ActiveUserDoc, mongoose.Model<ActiveUserDoc>>(
    'ActiveUser',
    ActiveUserSchema,
    'active_user',
  );

Originally posted by @satasuk03 in https://github.com/Automattic/mongoose/issues/10611#issuecomment-1150794186

Note: Mongoose Version: 6.3.6
Node Version: 16.13.1
MongoDB Version: 5.0.9

satasuk03 commented 2 years ago

This is how I temporarily worked around the problem:

import { useDB } from 'db';
import mongoose, { Schema } from 'mongoose';

export interface ActiveUserDoc {
  timestamp: Date;
  messages: number;
  metadata: { discordId: string };
}

const ActiveUserSchema = new Schema(
  {
    timestamp: Date,
    messages: Number,
    metadata: {
      discordId: String,
    },
  },
  {
    autoCreate: false,
    autoIndex: false,
  },
);

export const getModel = (guildId: string) => {
  const conn = useDB(guildId);
  const model = conn.model<ActiveUserDoc, mongoose.Model<ActiveUserDoc>>(
    'ActiveUser',
    ActiveUserSchema,
    'active_user',
  );
  model.createCollection({
    timeseries: {
      timeField: 'timestamp',
      metaField: 'metadata',
      granularity: 'hours',
    },
    expireAfterSeconds: 60 * 24 * 7, // 10080 seconds (2.8 hours); 7 days would be 60 * 60 * 24 * 7
  });
  return model;
};

vkarpov15 commented 2 years ago

Hi, the below script works for me:

'use strict';

const mongoose = require('mongoose');

const useDb = db => mongoose.connection.useDb(db, {
  // ensures connections to the same databases are cached
  useCache: true,
  // remove event listeners from the main connection
  noListener: true,
});

const getModel = db => useDb(db).model('User', schema, 'User');

const schema = new mongoose.Schema({
  timestamp: Date,
  name: String,
  metadata: {}
}, {
  timeseries: {
    timeField: 'timestamp',
    metaField: 'metadata',
    granularity: 'hours',
  },
  expireAfterSeconds: 60 * 24 * 7, // 10080 seconds (2.8 hours); 7 days would be 60 * 60 * 24 * 7
});

void async function main() {
  await mongoose.connect('mongodb://localhost:27017/test');

  const Model1 = getModel('test1');
  await Model1.init();
  console.log('1', await Model1.db.db.listCollections().toArray())

  const Model2 = getModel('test2');
  await Model2.init();
  console.log('2', await Model2.db.db.listCollections().toArray())
}();

Here's the output:

1 [
  {
    name: 'User',
    type: 'timeseries',
    options: { expireAfterSeconds: 10080, timeseries: [Object] },
    info: { readOnly: false }
  },
  {
    name: 'system.buckets.User',
    type: 'collection',
    options: {
      validator: [Object],
      clusteredIndex: true,
      expireAfterSeconds: 10080,
      timeseries: [Object]
    },
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("bbb9220f3131434caf149d77cc1bf96e", "hex"), 4)
    }
  },
  {
    name: 'system.views',
    type: 'collection',
    options: {},
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("f1d03a873db943ffb45429a4abba8e3f", "hex"), 4)
    },
    idIndex: { v: 2, key: [Object], name: '_id_' }
  }
]
2 [
  {
    name: 'system.views',
    type: 'collection',
    options: {},
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("23a44b4b0ac04e4d98b8fa8dfb66dd39", "hex"), 4)
    },
    idIndex: { v: 2, key: [Object], name: '_id_' }
  },
  {
    name: 'User',
    type: 'timeseries',
    options: { expireAfterSeconds: 10080, timeseries: [Object] },
    info: { readOnly: false }
  },
  {
    name: 'system.buckets.User',
    type: 'collection',
    options: {
      validator: [Object],
      clusteredIndex: true,
      expireAfterSeconds: 10080,
      timeseries: [Object]
    },
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("b1b44274c05240d9b51e9f4edcbedaf0", "hex"), 4)
    }
  }
]

Both collections have timeseries set. Can you please modify the above script to demonstrate your issue?

satasuk03 commented 2 years ago

Hello, I found that the problem occurs if I use insertMany() instead of init():

const mongoose = require('mongoose');

const schema = new mongoose.Schema(
  {
    timestamp: Date,
    name: String,
    metadata: {
      discordIds: [{ type: String }],
      channelId: String,
    },
  },
  {
    timeseries: {
      timeField: 'timestamp',
      metaField: 'metadata',
      granularity: 'hours',
    },
    expireAfterSeconds: 60 * 24 * 7, // 10080 seconds (2.8 hours); 7 days would be 60 * 60 * 24 * 7
  },
);

const useDb = (db) =>
  mongoose.connection.useDb(db, {
    // ensures connections to the same databases are cached
    useCache: true,
    // remove event listeners from the main connection
    noListener: true,
  });

const getModel = (db) => useDb(db).model('User', schema, 'User');

// eslint-disable-next-line no-void
void (async function main() {
  await mongoose.connect(
    'mongodb://localhost:27017/xxxxx',
  );

  const Model1 = getModel('test1');
  //   await Model1.init();
  await Model1.insertMany([
    {
      timestamp: new Date(),
      name: 'John',
      metadata: {
        discordIds: ['1', '2', '3'],
        channelId: '1234',
      },
    },
  ]);
  console.log('1', await Model1.db.db.listCollections().toArray());

  const Model2 = getModel('test2');
  await Model2.init();
  console.log('2', await Model2.db.db.listCollections().toArray());
})();

The result

1 [
  {
    name: 'User',
    type: 'collection',
    options: {},
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("8018773a9a50445da07d415a0d6713bf", "hex"), 4)
    },
    idIndex: { v: 2, key: [Object], name: '_id_' }
  },
  {
    name: 'system.views',
    type: 'collection',
    options: {},
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("8ac1159bc0b84bfeaaa554942e9452c1", "hex"), 4)
    },
    idIndex: { v: 2, key: [Object], name: '_id_' }
  },
  {
    name: 'system.buckets.User',
    type: 'collection',
    options: {
      validator: [Object],
      clusteredIndex: true,
      expireAfterSeconds: 10080,
      timeseries: [Object]
    },
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("c5b7a8a913d244429459923a9cc6db0d", "hex"), 4)
    }
  }
]
2 [
  {
    name: 'system.views',
    type: 'collection',
    options: {},
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("3bba182335fa45cab7550783918089ed", "hex"), 4)
    },
    idIndex: { v: 2, key: [Object], name: '_id_' }
  },
  {
    name: 'User',
    type: 'timeseries',
    options: { expireAfterSeconds: 10080, timeseries: [Object] },
    info: { readOnly: false }
  },
  {
    name: 'system.buckets.User',
    type: 'collection',
    options: {
      validator: [Object],
      clusteredIndex: true,
      expireAfterSeconds: 10080,
      timeseries: [Object]
    },
    info: {
      readOnly: false,
      uuid: new Binary(Buffer.from("674b3038771a4da4855c8fcb545dbaf8", "hex"), 4)
    }
  }
]

For test1, it creates a regular collection instead of a timeseries collection. @vkarpov15

Thanks in advance

vkarpov15 commented 2 years ago

@satasuk03 we've confirmed this issue, but unfortunately the only way Mongoose can support this without the Model.init() call is for Mongoose models to wait for createCollection() to finish before sending any operations to the MongoDB server. We'll add an option to opt in to doing this in a minor release, and consider making this behavior the default for 7.0.
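
To make the "wait for createCollection() to finish" idea concrete, here is a rough sketch of gating operations behind a readiness promise. This is not Mongoose's implementation or API: `gateOperations`, `ready`, and `methodNames` are hypothetical names, and `ready` is assumed to be a promise such as the one returned by `Model.init()`.

```javascript
// Hypothetical helper (not a Mongoose API): delay selected methods of
// `target` until the `ready` promise resolves.
function gateOperations(ready, target, methodNames) {
  const gated = {};
  for (const name of methodNames) {
    // Each call is chained onto `ready`, so nothing reaches `target`
    // before the collection has been created.
    gated[name] = (...args) => ready.then(() => target[name](...args));
  }
  return gated;
}
```

With a real model this might be used as `const User = gateOperations(model.init(), model, ['insertMany'])`. If `ready` rejects, every gated call rejects with the same error, so the failure is not silently swallowed.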

The solution, for now, is to make sure you haven't disabled autoCreate, and make getModel() async:

const getModel = async db => {
  const model = useDb(db).model('User', schema, 'User');
  await model.init();
  return model;
};

Another alternative is to create every collection ahead of time. Unless you're adding new dbs on the fly, you can just go through the dbs you want to create and make sure they all have a 'User' timeseries collection.
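
As a sketch of the "create every collection ahead of time" approach, a startup step could walk a known list of database names and wait for each model to initialize before the application sends any writes. `ensureCollections` and `dbNames` are hypothetical names; `getModel` is assumed to be either the sync or the async helper from this thread.

```javascript
// Hypothetical startup step: initialize the model for each known database
// one at a time, so every collection exists before any writes are sent.
async function ensureCollections(dbNames, getModel) {
  const ready = [];
  for (const db of dbNames) {
    // `await` works whether getModel() returns a model or a promise of one.
    const model = await getModel(db);
    await model.init();
    ready.push(db);
  }
  return ready;
}
```

For example, `await ensureCollections(['test1', 'test2'], getModel)` could run once right after `mongoose.connect()`. This does not help if new dbs are added on the fly.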

The issue seems to be that, if you send insertMany() before Mongoose's internal createCollection() finishes, the MongoDB server will create a non-timeseries collection and the createCollection() will fail.

For example, below is a script that demonstrates this issue with the MongoDB node driver:

const { MongoClient } = require('mongodb');

void async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017/test');

  const p1 = client.db().createCollection('User', {
    timeseries: {
      timeField: 'timestamp',
      metaField: 'metadata',
      granularity: 'hours',
    }
  });

  await new Promise(resolve => setTimeout(resolve, 0));

  const p2 = client.db().collection('User').insertMany([{ name: 'test' }]);

  await Promise.all([p1, p2]);
  console.log(await client.db().listCollections().toArray());
}();

I opened an issue on the MongoDB Node driver JIRA; we'll see what they say.
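
To tell the two outcomes apart in a script rather than by reading the printed output, the `type` field of each `listCollections()` entry can be checked. A minimal sketch: `isTimeseries` is a hypothetical helper, and the sample arrays are trimmed-down versions of the entries printed earlier in this thread.

```javascript
// Return true when `name` appears in listCollections() output as a
// timeseries collection rather than a regular collection.
function isTimeseries(collections, name) {
  const info = collections.find(c => c.name === name);
  return info !== undefined && info.type === 'timeseries';
}

// Trimmed-down entries in the shape printed earlier in this thread.
const test1 = [
  { name: 'User', type: 'collection', options: {} },
  { name: 'system.buckets.User', type: 'collection', options: {} },
];
const test2 = [
  { name: 'User', type: 'timeseries', options: { expireAfterSeconds: 10080 } },
  { name: 'system.buckets.User', type: 'collection', options: {} },
];

console.log(isTimeseries(test1, 'User')); // false
console.log(isTimeseries(test2, 'User')); // true
```

Against a live server, the input would come from something like `await client.db().listCollections().toArray()`.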