django-guardian / django-guardian

Per object permissions for Django
https://django-guardian.readthedocs.io/
Other
3.61k stars 569 forks source link

shortcut get_objects_for_user makes too many database queries #189

Open OdifYltsaeb opened 10 years ago

OdifYltsaeb commented 10 years ago

Im using profiling middleware (there are several, if you google, you find some of them) to try and find the chokepoints of my project. One thing i found was repeating database queries like :

0.043 : SELECT "django_content_type"."id", "django_content_type"."name", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" INNER JOIN "auth_permission" ON ("django_content_type"."id" = "auth_permission"."content_type_id") WHERE ("auth_permission"."codename" = 'can_see_history' AND "django_content_type"."app_label" = 'cars' )

I tracked resposible code to this line: cars = get_objects_for_user(user, ('cars.can_access_car','cars.can_see_history'), klass = Car).order_by('reg_no')

And after testing it seems that each permission triggers one previously posted database query. So if i have 2 permissions i get 2 such queries, if i have one, i get one.

to further illustrate the issue and make it crystal clear: If django is like that: cars = get_objects_for_user(user, ('cars.can_access_car','cars.can_see_history'), klass = Car).order_by('reg_no')

Then following queries are like that: 0.042 : SELECT "django_content_type"."id", "django_content_type"."name", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" INNER JOIN "auth_permission" ON ("django_content_type"."id" = "auth_permission"."content_type_id") WHERE ("auth_permission"."codename" = 'can_access_car' AND "django_content_type"."app_label" = 'cars' )

0.042 : SELECT "django_content_type"."id", "django_content_type"."name", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" INNER JOIN "auth_permission" ON ("django_content_type"."id" = "auth_permission"."content_type_id") WHERE ("auth_permission"."codename" = 'can_see_history' AND "django_content_type"."app_label" = 'cars' )

0.053 : SELECT bunch for fields from my app....

If django is like that: cars = get_objects_for_user(user, ('cars.can_access_car'), klass = Car).order_by('reg_no')

Then queries are like that: 0.042 : SELECT "django_content_type"."id", "django_content_type"."name", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" INNER JOIN "auth_permission" ON ("django_content_type"."id" = "auth_permission"."content_type_id") WHERE ("auth_permission"."codename" = 'can_access_car' AND "django_content_type"."app_label" = 'cars' )

0.053 : SELECT bunch for fields from my app....

Django 1.4.2 and Guardian 1.1.1

lukaszb commented 10 years ago

So, to sum up, what you'd expect is to have always single query regardless of number of perms codenames passes, right?

Will look into that, thanks for this.

OdifYltsaeb commented 10 years ago

Yup well it could make one query for all codenames not separate ones.

cancan101 commented 9 years ago

Where do we stand with this? I am seeing what I think are truly crazy queries being generated.

First this query runs and gets a list of all the rows a user is permissioned for:

SELECT "App_reportgroupobjectpermission"."content_object_id",
"auth_permission"."codename" FROM "App_reportgroupobjectpermission" 
INNER JOIN "auth_group" ON ( "App_reportgroupobjectpermission"."group_id" = "auth_group"."id" ) 
INNER JOIN "auth_user_groups" ON ( "auth_group"."id" = "auth_user_groups"."group_id" ) 
INNER JOIN "auth_permission" ON ( "App_reportgroupobjectpermission"."permission_id" = "auth_permission"."id" ) 
WHERE ("auth_user_groups"."user_id" = %s 
AND "auth_permission"."codename" IN (%s) AND "auth_permission"."content_type_id" = %s )

with PARAMS = (u'1', u"'view_report'", u'12')

and then this runs:

SELECT "App_report"."id", "App_report"."exam_id", "App_report"."uuid",
"App_report"."created_dt", "App_report"."created_by_id" FROM "App_report" 
INNER JOIN "App_exam" ON ( "App_report"."exam_id" = "App_exam"."id" ) 
WHERE ("App_exam"."uuid" = %s AND "App_report"."id" IN (%s, %s, %s, %s, %s))

with PARAMS = (u"'uuid-1234'", u'4', u'5', u'6', u'7', u'8')

It seem that this should be representable as a join or at the very least a nested SELECT where the contents of the IN predicate in the second query is replaced with the first query it self.

i.e. this entire part should be representable as one query. This should be possible even with groups in use, at the very least if any_perm is true. See: #269.

It looks like the issue is being caused by calling list on the value_lists rather than using them directly in the filter: https://github.com/lukaszb/django-guardian/blob/dceb6beb5830746b624a7ac99d93949ff263fb98/guardian/shortcuts.py#L419.

@thedrow not sure if you still use this package, but I was wondering what you think of this issue.

TauPan commented 6 years ago

Thanks @cancan101 for the analysis. This is exactly what's biting me in my code at the moment.

In addition, if the klass argument is given, the shortcut still filters on all available object permissions for all objects with the IN query, which in my case are a million objects. I think it should be at least possible to filter the permission list by the ids of the queryset given in klass before materializing it.

Even generating the query with all the ids takes long (a couple of seconds) while running it takes so long that the apache timeouts...

MadWombat commented 6 years ago

The case where only one permission is supplied and klass=None or a model can be optimized into a single query. Something like

SELECT
    car.*
FROM guardian_userobjectpermission as op
JOIN auth_user as u ON (op.user_id = u.id)
JOIN auth_permission as p ON (op.permission_id = p.id)
JOIN django_content_type as ct ON (ct.id = op.content_type_id)
JOIN car_car as car ON (car.id = op.object_pk::integer)
WHERE u.id = user_id AND 
             p.codename = 'can_access'  AND 
             ct.model = 'car' AND 
             ct.app_label = 'car'

Multiple permissions can be accomodated in asingle query as well, I think, but the query would become a bit more complicated (with subqueries like the one above for each permission and an intersect between them, for example).

That said, I have no idea how to accomplish something like this strictly using ORM (actually, I don't think it is possible to generate the query above using the ORM). If the devs do not mind using raw SQL operations in their code, I could make a PR.