Open anekix opened 5 years ago
The switch_db context manager was built for this use case. Is there anything preventing its use?
@bagerard switch_db can be used only when the db aliases are predefined, i.e. when the connection is established for the first time (taken from here).
But in a multi-tenant architecture the dbs cannot be predefined; they are created at runtime.
E.g. a new company database is created when a company is registered (the architecture itself cannot be changed due to some domain/compliance requirements).
I have the impression that there might be a workaround if you establish the connections on the fly and use switch_db. Assuming you name the databases predictably (e.g. by organization name):
from mongoengine import *
from mongoengine.context_managers import switch_db
orgs = ['org1', 'org2', 'org3']
conn = connect() # establish a default connection

class MyDoc(Document):
    name = StringField()

    def __repr__(self):
        return 'MyDoc name: {}'.format(self.name)

# Save 1 doc per database
for org in orgs:
    print('establishing connection to db "{}"'.format(org))
    connect(db=org, alias=org)
    with switch_db(MyDoc, org):
        print('Saving in {}'.format(org))
        MyDoc(name='save_me_in_{}'.format(org)).save()

establishing connection to db "org1"
Saving in org1
establishing connection to db "org2"
Saving in org2
establishing connection to db "org3"
Saving in org3

# Print what was saved
for org in orgs:
    with switch_db(MyDoc, org):
        print('Doc from db "{}": {}'.format(org, list(MyDoc.objects())))

Doc from db "org1": [MyDoc name: save_me_in_org1]
Doc from db "org2": [MyDoc name: save_me_in_org2]
Doc from db "org3": [MyDoc name: save_me_in_org3]

That being said, and to be honest, your use case is certainly not MongoEngine's standard use case, and I wouldn't recommend building a large application that relies only on this workaround/pattern.
I hope this helps
Yes, I am aware of this workaround, but as you mentioned, this pattern is not enough to start modeling a huge application on. What's worse, we lose all the other goodness that mongoengine provides. Would it need a major refactor of mongoengine, given that this is a common use case for many SaaS applications?
Good question. I'm afraid that it would be a complicated path to have all of mongoengine's features working in a robust manner with such a pattern. Additionally, we haven't had that much demand for it, and I don't think that Django supports it, so I really don't think it's very common. And last but not least, we have limited development capacity right now and prefer putting effort into improving performance or bug fixing...
I am facing the same issue. For multi-tenant applications with customers of varied sizes, it's important to have one DB per customer, so that smaller customers do not suffer performance issues caused by large ones.
Any new ideas on how to get around this issue?
Hi Mongoengine users.
I faced this issue myself, and I tried to build something that would work 'around' mongoengine, not modifying internal objects, in order to keep the library safe with all its features. It has been a long journey !
A bit of context here may be useful: I want to use several databases, one per tenant, and I want to read/write my objects from all those DBs. I use inheritance for several objects, and for some services I have several threads.
I noticed that both multithreading and inheritance break the switch_db pattern:
• No thread protection is implemented, so if two switch_db blocks run in parallel, the last one entered sets the db_alias for all objects.
• If B is a child Document class of A, then querying A within a switch_db, getting a result b that is a B instance, and saving b will save b in the default DB. That would mean querying for A within both an A and a B switch_db, which is not intuitive, a source of bugs, and clearly an inheritance anti-pattern.
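The first bullet can be reproduced without a database. Here is a minimal sketch, assuming only that the context manager works by overwriting the class-level `_meta["db_alias"]` on enter and restoring it on exit. `FakeDocument` and `naive_switch_db` are hypothetical stand-ins written for this illustration, not mongoengine classes, and the events exist only to force the interleaving deterministically:

```python
import threading


class FakeDocument:
    """Hypothetical stand-in for a Document class; models only the shared _meta."""
    _meta = {"db_alias": "default"}


class naive_switch_db:
    """Simplified model of a context manager that mutates class-level state."""

    def __init__(self, cls, db_alias):
        self.cls = cls
        self.db_alias = db_alias
        self.ori_db_alias = cls._meta["db_alias"]

    def __enter__(self):
        self.cls._meta["db_alias"] = self.db_alias
        return self.cls

    def __exit__(self, exc_type, exc, tb):
        self.cls._meta["db_alias"] = self.ori_db_alias


seen = {}
one_entered = threading.Event()
two_entered = threading.Event()
one_has_read = threading.Event()


def worker_one():
    with naive_switch_db(FakeDocument, "org1") as doc_cls:
        one_entered.set()
        two_entered.wait()  # worker_two is now inside its own block
        seen["worker_one"] = doc_cls._meta["db_alias"]
        one_has_read.set()


def worker_two():
    one_entered.wait()
    with naive_switch_db(FakeDocument, "org2"):
        two_entered.set()
        one_has_read.wait()  # stay inside the block until worker_one has read


threads = [threading.Thread(target=worker_one), threading.Thread(target=worker_two)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen["worker_one"])  # worker_one asked for "org1" but observes "org2"
```

Because the alias lives on the class rather than on the thread or the query, whichever block was entered last wins for every thread.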
I wanted something easy to use like:
a = A["db_alias"].objects.get(id="1234")
Main difficulties:
• Class registries have to be set and kept up-to-date for each database
• ReferenceFields need to be defined dynamically
• Inherited classes have to be defined dynamically
Here is a (so far) working solution, at least for our project:
utils.multidb_document.py
import itertools

from mongoengine import Document
from mongoengine.fields import ObjectIdField


class MultidbDocumentItemMeta(Document.my_metaclass):
    def __new__(cls, name, bases, attrs, owner):
        if not issubclass(owner, MultidbDocument):
            raise TypeError("owner shall be a subclass of MultidbDocument !")
        # verify the class is the root class, or an item class
        is_child = owner.get_root_document_class() is not None
        if is_child:
            base_list = list(bases)
            # Remove MultidbDocumentClass as it is not needed there,
            # it will be added via owner.get_root_document_class()
            base_list.remove(MultidbDocumentClass)
            # Need to inherit from the root class !
            # MRO: left to right, the root class is the last !
            base_list.append(owner.get_root_document_class())
            bases = tuple(base_list)
        else:
            if "meta" not in attrs:
                attrs["meta"] = {}
            meta = attrs.get("meta")
            meta["abstract"] = True
            if not meta.get("id_field"):
                # We need to define now the id field
                # otherwise graphene will disallow id operations !
                id_name, id_db_name = cls.get_auto_id_names(attrs)
                attrs[id_name] = ObjectIdField(db_field=id_db_name)
                meta["id_field"] = id_name
        super_new = super(MultidbDocumentItemMeta, cls).__new__
        new_class = super_new(cls, name, bases, attrs)
        return new_class

    @classmethod
    def get_auto_id_names(mcs, attrs):
        """Find a name for the automatic ID field for the given new class.

        Return a two-element tuple where the first item is the field name (i.e.
        the attribute name on the object) and the second element is the DB
        field name (i.e. the name of the key stored in MongoDB).
        Defaults to ('id', '_id'), or generates a non-clashing name in the form
        of ('auto_id_X', '_auto_id_X') if the default name is already taken.
        """
        id_name, id_db_name = ("id", "_id")
        existing_fields = {field_name for field_name in attrs}
        if id_name not in existing_fields:
            return id_name, id_db_name
        id_basename, id_db_basename, i = ("auto_id", "_auto_id", 0)
        for i in itertools.count():
            id_name = "{}_{}".format(id_basename, i)
            id_db_name = "{}_{}".format(id_db_basename, i)
            if id_name not in existing_fields:
                return id_name, id_db_name


class MultidbDocumentMeta(type):
    def __getitem__(cls, name):
        if not hasattr(cls, name):
            DatabaseAliasesRegistry.register_alias(name)
        # should update self."x"
        return getattr(cls, name)


def init_multidb_document_subclass(*args, **kwargs):
    raise TypeError("Multidb document shall not be inherited from. It's a final class. Inheritance of documents are defined in create_document_class function")


class MultidbDocument(metaclass=MultidbDocumentMeta):
    my_metaclass = MultidbDocumentMeta
    _root_document_class = None
    _item_metaclass = None

    @classmethod
    def create_document_class(cls, db_alias: str):
        # f-string so that {cls} is actually interpolated into the message
        raise NotImplementedError(f"The multidb_document subclass {cls} needs to overload create_document_class(db_alias) method")

    @classmethod
    def get_root_document_class(cls):
        """
        This class must be used only for test check purpose, as all classes for databases inherit from it
        Be careful, only for data structure check, eg graphene, no functional db use !
        This is a mongoengine 'abstract' document
        """
        return cls._root_document_class

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Create the root class from whom inherit !
        cls._root_document_class = cls.create_document_class('_root_document_class')
        if not issubclass(cls._root_document_class, MultidbDocumentClass):
            raise TypeError("create_document_class shall return a subclass of MultidbDocumentClass")
        DatabaseAliasesRegistry.suscribe_to_new_db_alias(cls._create_document_class)
        for db_alias in DatabaseAliasesRegistry.get_existing_db_aliases():
            cls._create_document_class(db_alias)
        cls.__init_subclass__ = init_multidb_document_subclass

    @classmethod
    def _create_document_class(cls, db_alias: str):
        if not hasattr(cls, db_alias):
            # create class dynamically
            # doc_class = cls.create_document_class(db_alias, cls._root_document_class, False)
            doc_class = cls.create_document_class(db_alias)
            # switch to db_alias, not useful if meta has been set in class body
            doc_class._meta["db_alias"] = db_alias
            doc_class._collection = None
            setattr(cls, db_alias, doc_class)


class MultidbDocumentClass(Document, metaclass=MultidbDocumentItemMeta, owner=MultidbDocument):
    meta = {
        "abstract": True
    }


class DatabaseAliasesRegistry:
    """
    Static class with only static methods
    This registry will help synchronization between databases
    Needed to avoid unknown derived classes
    """
    db_aliases = []
    new_db_alias_listeners = []

    @staticmethod
    def register_alias(db_alias: str):
        if db_alias not in DatabaseAliasesRegistry.db_aliases:
            DatabaseAliasesRegistry.db_aliases.append(db_alias)
            for listener in DatabaseAliasesRegistry.new_db_alias_listeners:
                listener(db_alias)

    @staticmethod
    def get_existing_db_aliases():
        return DatabaseAliasesRegistry.db_aliases

    @staticmethod
    def suscribe_to_new_db_alias(callback_function):
        DatabaseAliasesRegistry.new_db_alias_listeners.append(callback_function)

Usage:
from utils.multidb_document import MultidbDocument, MultidbDocumentClass
from mongoengine.fields import ReferenceField, IntField, StringField


class C(MultidbDocument):
    @classmethod
    def create_document_class(cls, db_alias: str):
        class C(MultidbDocumentClass, owner=cls):
            meta = {
                "collection": "c_collection"
            }
            name = StringField()
        return C


class A(MultidbDocument):
    @classmethod
    def create_document_class(cls, db_alias: str):
        class A(MultidbDocumentClass, owner=cls):
            meta = {
                "collection": "ab_collection"
            }
            # We use db_alias here !
            # (the attribute name `c` is illustrative; a field must be bound to a name)
            c = ReferenceField(C[db_alias])
        return A


class B(MultidbDocument):
    @classmethod
    def create_document_class(cls, db_alias: str):
        # MultidbDocumentClass may be omitted here...
        # But specified to be consistent
        # We also use db_alias here !
        class B(A[db_alias], MultidbDocumentClass, owner=cls):
            how_much = IntField()
        return B

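For readers who want to see the core mechanism in isolation, here is a stripped-down sketch of the same idea with mongoengine taken out: indexing a class by alias lazily builds (and caches) one concrete class per alias, and a shared registry tells every owner class about new aliases. All names here (`AliasRegistry`, `PerAlias`, `Company`, `Employee`) are hypothetical and exist only for this illustration, and plain `type(...)` classes stand in for real documents:

```python
class AliasRegistry:
    """Keeps the known aliases and notifies owner classes about new ones."""
    aliases = []
    listeners = []

    @classmethod
    def register(cls, alias):
        if alias not in cls.aliases:
            cls.aliases.append(alias)
            for listener in list(cls.listeners):
                listener(alias)


class PerAliasMeta(type):
    """Makes `Owner["alias"]` return the class bound to that alias."""

    def __getitem__(cls, alias):
        AliasRegistry.register(alias)  # no-op if the alias is already known
        cls._build(alias)              # idempotent; covers any listener ordering
        return cls._classes[alias]


class PerAlias(metaclass=PerAliasMeta):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._classes = {}
        AliasRegistry.listeners.append(cls._build)
        for alias in AliasRegistry.aliases:
            cls._build(alias)

    @classmethod
    def _build(cls, alias):
        if alias not in cls._classes:
            cls._classes[alias] = cls.create_class(alias)

    @classmethod
    def create_class(cls, alias):
        raise NotImplementedError


class Company(PerAlias):
    @classmethod
    def create_class(cls, alias):
        return type(f"Company_{alias}", (), {"db_alias": alias})


class Employee(PerAlias):
    @classmethod
    def create_class(cls, alias):
        # Point at the Company class of the *same* alias, as a reference field would.
        attrs = {"db_alias": alias, "company_cls": Company[alias]}
        return type(f"Employee_{alias}", (), attrs)


print(Employee["org1"].company_cls is Company["org1"])  # True
```

The point of building the class inside a factory method is that anything depending on the alias (references, parent classes) can be resolved against the same alias at creation time, instead of being patched afterwards.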
Hope it helps.
Based on the ideas from @BenoitToulet, and trying not to modify our existing codebase, we came to this solution. So far it is working, but it's barely tested.
import threading
from typing import Any, Dict

from mongoengine.connection import DEFAULT_CONNECTION_NAME
from mongoengine.document import Document

REGISTRY: Dict[str, Any] = {}
LOCK = threading.Lock()


def ClassFactory(name, BaseClass=Document):
    def __init__(self, **kwargs):
        BaseClass.__init__(self, **kwargs)
        # BaseClass.__init__(self, name[: -len("Class")])

    newclass = type(name, (BaseClass,), {"__init__": __init__})
    return newclass


class switch_db:
    """switch_db alias context manager.

    Example ::

        # Register connections
        register_connection('default', 'mongoenginetest')
        register_connection('testdb-1', 'mongoenginetest2')

        class Group(Document):
            name = StringField()

        Group(name='test').save()  # Saves in the default db

        with switch_db(Group, 'testdb-1') as Group:
            Group(name='hello testdb!').save()  # Saves in testdb-1
    """

    def __init__(self, cls, db_alias):
        """Construct the switch_db context manager

        :param cls: the class to change the registered db
        :param db_alias: the name of the specific database to use
        """
        new_cls_name = f"{db_alias}_{cls.__module__}.{cls.__name__}"
        with LOCK:
            new_cls = REGISTRY.get(new_cls_name, None)
            if not new_cls:
                allow_inheritance = cls._meta["allow_inheritance"]
                cls._meta["allow_inheritance"] = True
                new_cls = ClassFactory(new_cls_name, cls)
                cls._meta["allow_inheritance"] = allow_inheritance
                new_cls._meta["allow_inheritance"] = allow_inheritance
                REGISTRY[new_cls_name] = new_cls
        self.cls = new_cls
        self.collection = new_cls._get_collection()
        self.db_alias = db_alias
        self.ori_db_alias = new_cls._meta.get("db_alias", DEFAULT_CONNECTION_NAME)

    def __enter__(self):
        """Change the db_alias and clear the cached collection."""
        self.cls._meta["db_alias"] = self.db_alias
        self.cls._collection = None
        return self.cls

    def __exit__(self, t, value, traceback):
        """Reset the db_alias and collection."""
        self.cls._meta["db_alias"] = self.ori_db_alias
        self.cls._collection = self.collection

Hello mongoengine users.
A little update to fix an issue with the inheritance pattern:
class MultidbDocumentClass(Document, metaclass=MultidbDocumentItemMeta, owner=MultidbDocument):
    meta = {"abstract": True}

    @classmethod
    def _from_son(cls, son, _auto_dereference=True, only_fields=None, created=False):
        """Need to overload this method when retrieving"""
        # Get the class name from the document, falling back to the given
        # class if unavailable
        class_name = son.get("_cls", cls._class_name)
        if class_name != cls._class_name:
            for sub_class in cls.__subclasses__():
                if sub_class._class_name == class_name:
                    return sub_class._from_son(son, _auto_dereference, only_fields, created)
        # Else: keep default behaviour
        return super()._from_son(son, _auto_dereference, only_fields, created)

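The dispatch idea in `_from_son` (pick the concrete class from the stored `_cls` value by walking `__subclasses__()` of the class being queried) can be sketched without mongoengine. In this illustration `Node`, `Leaf`, `SpecialLeaf`, `from_raw` and `_find_class` are hypothetical names, and the dotted `_class_name` values are assumed to mirror how `_cls` is stored. One difference from the override above: this sketch searches subclasses recursively, so grandchildren are found too.

```python
class Node:
    _class_name = "Node"

    def __init__(self, **fields):
        self.fields = fields

    @classmethod
    def from_raw(cls, raw):
        """Build an instance of the subclass named by raw['_cls'], if known."""
        wanted = raw.get("_cls", cls._class_name)
        target = cls._find_class(wanted) or cls  # fall back to the queried class
        data = {key: value for key, value in raw.items() if key != "_cls"}
        return target(**data)

    @classmethod
    def _find_class(cls, class_name):
        """Depth-first search of this class and its subclasses by _class_name."""
        if cls._class_name == class_name:
            return cls
        for sub_class in cls.__subclasses__():
            found = sub_class._find_class(class_name)
            if found is not None:
                return found
        return None


class Leaf(Node):
    _class_name = "Node.Leaf"


class SpecialLeaf(Leaf):
    _class_name = "Node.Leaf.SpecialLeaf"


obj = Node.from_raw({"_cls": "Node.Leaf.SpecialLeaf", "how_much": 3})
print(type(obj).__name__)  # SpecialLeaf
```

Because the search starts from the class that was actually queried, each per-database class hierarchy resolves only to its own subclasses, which is what keeps a saved instance in the database it was read from.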
This solution has been working fine for nearly a year on our side.
I will try to describe my use case here: I am using a multi-tenant db model in which each client has their own database (due to a very specific business use case). Suppose I have to get the user details for a client x that has its own db named x. These are the steps I need to execute to get the data:
• admin db & get admin details
• x db & get list of users
• cc db & insert some data
All of the above operations are to be executed in a single request to get the required results.
Basically, I need to map a Document class to multiple dbs which are created dynamically (as clients register), so I cannot hard-code this in the document definition. I guess the default behavior of mongoengine is to bind each Document to a specific database as soon as the code defining the document is executed, but we somehow need to bind documents to dbs dynamically.
Other issues that are almost related: #1610
Is there any specific architectural reason to bind a document to a database at the first execution of the document?