citusdata / django-multitenant

Python/Django support for distributed multi-tenant databases like Postgres+Citus
MIT License
730 stars 118 forks source link

tenant column could not be identified when getting Model class using apps.get_model method in migration #138

Closed gurkanindibay closed 1 year ago

gurkanindibay commented 1 year ago

When we try to get Model class using apps.get_model method as defined in Django document below, we're having problem identifying tenant_column in https://github.com/citusdata/django-multitenant/blob/18423f340c758cfa756b1ec22460d3631ad5a545/django_multitenant/utils.py#L63 method getting the error in the stackoverflow post below

https://stackoverflow.com/questions/75219227/getting-no-field-found-in-account-with-column-name-when-creating-migration

HengJunXi commented 1 year ago

You will need to do this in the migration:

Account = apps.get_model("quickstart", "Account")
setattr(Account, "tenant_id", "id")
gurkanindibay commented 1 year ago

Thanks @HengJunXi for the workaround solution. I didn't see the problem in the main branch. Previous fixes may solve the issue

ebspeter commented 1 year ago

You will need to do this in the migration:

Account = apps.get_model("quickstart", "Account")
setattr(Account, "tenant_id", "id")

This should be added to the documentation

HengJunXi commented 1 year ago

@gurkanindibay I am using version 3.10 now. The same problem still happens, and the workaround is still needed. It doesn't recognise the tenant_id set. I think the added tests (https://github.com/citusdata/django-multitenant/pull/152) are passing because the TenantManager is not actually used in RunPython (use_in_migrations is not set to True, ref: https://docs.djangoproject.com/en/4.1/topics/migrations/#model-managers). Hence, it will not run into the problem of AttributeError on tenant_id (because it is not retrieved at all).

I've set use_in_migrations to True in my TenantManager, hence it is used in the RunPython (to separate the tenants accordingly). The problem is, the manager is using attribute (tenant_id) defined in the model, which is not available when the model is retrieved through apps.get_model (returning ModelBase object). Maybe we should look into moving tenant_id to Meta, not sure whether that will works

Error stack:

File "/code/manage.py", line 21, in main() File "/code/manage.py", line 17, in main execute_from_command_line(sys.argv) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/core/management/init.py", line 419, in execute_from_command_line utility.execute() File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/core/management/init.py", line 413, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/core/management/base.py", line 354, in run_from_argv self.execute(*args, cmd_options) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/core/management/base.py", line 398, in execute output = self.handle(*args, *options) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/core/management/base.py", line 89, in wrapped res = handle_func(args, kwargs) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/core/management/commands/migrate.py", line 244, in handle post_migrate_state = executor.migrate( File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/migrations/executor.py", line 117, in migrate state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/migrations/executor.py", line 227, in apply_migration state = migration.apply(state, schema_editor) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/migrations/migration.py", line 126, in apply operation.database_forwards(self.app_label, schema_editor, old_state, project_state) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards self.code(from_state.apps, schema_editor) File "/code/chat/migrations/0054_populate_button_in_refined_data_structure.py", line 45, in forward_populate_button for tenant_account in TenantAccount.objects.all(): File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/models/query.py", line 280, in iter self._fetch_all() File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/models/query.py", line 1324, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/models/query.py", line 69, in iter obj = model_cls.from_db(db, init_list, row[model_fields_start:model_fields_end]) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/models/base.py", line 515, in from_db new = cls(values) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django_multitenant/mixins.py", line 124, in init super().init(args, **kwargs) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django/db/models/base.py", line 418, in init self._state = ModelState() File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django_multitenant/mixins.py", line 138, in setattr attrname in (self.tenant_field, get_tenant_field(self).name) File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.10/site-packages/django_multitenant/mixins.py", line 209, in tenant_field raise AttributeError( AttributeError: tenant_id field not found. Please add tenant_id field to the model.

gurkanindibay commented 1 year ago

@HengJunXi Could you give some more details with your usage of use_in_migrations and about your multi-tenancy need in the migration. I'm asking it since I need to write a test case to reproduce it.

HengJunXi commented 1 year ago

@gurkanindibay

With use_in_migrations=True in TenantManager, the below snippet will fail in RunPython:

Account = apps.get_model("tests", "Account")
for account in Account.objects.all():
  set_current_tenant(account)

use_in_migrations=True is used to allow usage of TenantManager in migration, else the tenant column will not be set appropriately even with set_current_tenant, having a risk of wrongly modifying data across tenants

gurkanindibay commented 1 year ago

So you are performing a data migration in tenant basis and waiting for it is performed tenant basis, right? That is the test case I think

gurkanindibay commented 1 year ago

Hey @HengJunXi I worked on this issue today and here is my findings Problem is not about the Manager class. Problem is about fake classes being created in the migration process. I could not find a documentation to understand the motivation behind these fake class mechanism but when I ask for ChatGPT I got the answer below which makes sense for me

 Question: why are we getting __fake__ type while we are getting the model with apps.get_model in migration? 
Answer: When you retrieve a model using apps.get_model() in a migration, you may see a __fake__ type for 
the model's class.

This is because during a migration, the models defined in your project may not yet exist in the database schema.
To prevent errors during the migration process, Django uses a "fake" model to represent the state of the model 
before it is actually created in the database schema. This fake model has the same name and attributes as the 
real model, but its type is  different (<class 'django.db.models.base.ModelBase'> instead of <class 
'django.db.models.base.Model'>) and it has a Meta.managed = False attribute to indicate that it is not yet 
managed by the database schema.

You can still use the __fake__ model to perform operations like creating database tables, but some operations 
may not be  available until the model is actually created in the database schema.
Once the migration is complete and the model is created in the database schema, the __fake__ model will be 
replaced with the actual model class. 

I tried different alternatives but getting the transient tenant_id field is impossible. There are two alternatives that I can suggest and I will document them as well

  1. Use apps module from to get the model you want to use instead of the 'apps' in the method being e
  2. xecuted in RunPython
from django.apps import apps  
MigrationUseInMigrationsModel = apps.get_model("tests", "MigrationUseInMigrationsModel")
MigrationUseInMigrationsModel.objects.create(name="test")
  1. Use directly the model class import
    
    from .models import  MigrationUseInMigrationsModel 

MigrationUseInMigrationsModel.objects.create(name="test")



Thanks for your input and contributions
lfriel commented 1 year ago

Hello! Was reading through this issue and am concerned about the recommendation to import the model directly into a migration. Unless I'm misunderstanding the recommendation, this is risky because the "live" model will become out of sync with the data migration, so the migrations may work at first but could break later as the model changes.

Using RunPython in migrations requires code that accepts "an instance of django.apps.registry.Apps containing historical models that match the operation’s place in the project history" (see RunPython documentation https://docs.djangoproject.com/en/4.2/ref/migration-operations/#runpython). Using this apps parameter in apps.get_model ensures the data migration will work, regardless of the current state of the model, because it uses the historical models at that point in the migration history. If you don't do this, your migrations will be flaky.

This brief article https://codereviewdoctor.medium.com/avoiding-flaky-migration-1fc71c7cdb66 and this stack overflow answer https://stackoverflow.com/a/37769213 give good, simple examples about the risks of importing models from the code base directly into migrations.

HengJunXi commented 1 year ago

I take a look on the referenced PR, seems like it is now raising error when obtaining the model in migration RunPython through the conventional way of apps from the function. This is breaking changes, and seems unnecessary. So far from my experience, my workaround is able to get the job done without compromising much.

mjvankampen commented 3 months ago

For me the workaround failed in the docs/as described by @gurkanindibay fails. Because when you import from apps directly you get the state of the model as it currently defined in your code, not as it was at the point of that (older) migration. @HengJunXi workaround does work for me. That is the one that should be in the docs imho.