openplans / openblock

OpenBlock is a web application and RESTful service that allows users to browse and search their local area for "hyper-local news
61 stars 26 forks source link

Place overview lookup charts: SchemaFieldQuery.top_lookups raises database error #140

Closed slinkp closed 11 years ago

slinkp commented 11 years ago

Example: http://demo.openblockproject.org/api/place-lookup-chart/?sf=7&pid=l:50


Traceback:
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/core/handlers/base.py" in get_response
  100.                     response = callback(request, *callback_args, **callback_kwargs)
File "/var/www/demo.openblockproject.org/builds/20110223/src/openblock/ebpub/ebpub/db/views.py" in ajax_place_lookup_chart
  334.     top_values = qs.top_lookups(sf, 10)
File "/var/www/demo.openblockproject.org/builds/20110223/src/openblock/ebpub/ebpub/db/models.py" in top_lookups
  479.         ids_and_counts = [(v['lookup_id'], v['item_count']) for v in qs.values('lookup_id', 'item_count').order_by('-item_count') if v['item_count']][:count]
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/models/query.py" in _result_iter
  105.                 self._fill_cache()
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/models/query.py" in _fill_cache
  783.                     self._result_cache.append(self._iter.next())
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/models/query.py" in iterator
  852.         for row in self.query.get_compiler(self.db).results_iter():
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/models/sql/compiler.py" in results_iter
  677.         for rows in self.execute_sql(MULTI):
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/models/sql/compiler.py" in execute_sql
  732.         cursor.execute(sql, params)
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/backends/util.py" in execute
  15.             return self.cursor.execute(sql, params)
File "/var/www/demo.openblockproject.org/current/lib/python2.6/site-packages/django/db/backends/postgresql_psycopg2/base.py" in execute
  44.             return self.cursor.execute(query, args)

Exception Type: DatabaseError at /api/place-lookup-chart/
Exception Value: more than one row returned by a subquery used as an expression

This comment is pretty clearly relevant, I suspect that something about upgrading to Django 1.2 changed such that the .values() hack no longer has the intended effect:

            # We want to count the current queryset and get a single
            # row for injecting into the subsequent Lookup query, but
            # we don't want Django's aggregation support to
            # automatically group by fields that aren't relevant and
            # would cause multiple rows as a result. So we call
            # `values()' on a field that we're already filtering by,
            # in this case, schema, as essentially a harmless identify
            # function.
            clone = clone.values('schema').annotate(count=Count('schema'))
            qs = Lookup.objects.filter(schema_field__id=schema_field.id)
            qs = qs.extra(select={'lookup_id': 'id', 'item_count': clone.values('count').query})

For reference, the SQL query generated by all this looks like (formatted for "readability"):

SELECT (id) AS "lookup_id",
       (
        SELECT COUNT("db_newsitem"."schema_id") AS "count"
        FROM "db_newsitem" INNER JOIN "db_newsitemlocation"
        ON ("db_newsitem"."id" = "db_newsitemlocation"."news_item_id")
        , "db_attribute"
        WHERE
        (
            "db_newsitem"."schema_id" = 4
                AND "db_newsitemlocation"."location_id" = 15
                AND db_newsitem.id = db_attribute.news_item_id
                AND "db_newsitem"."schema_id" = 4
                AND db_attribute.varchar02 ~ ('[[:<:]]' || db_lookup.id || '[[:>:]]')
            )
        GROUP BY "db_newsitem"."schema_id", "db_newsitem"."title"
        ORDER BY "db_newsitem"."title" ASC
    ) AS "item_count"
FROM "db_lookup"
WHERE "db_lookup"."schema_field_id" = 7
ORDER BY "item_count" DESC

The error happens when there's more than one row in the (...) AS "item_count" subquery.

Trying to see if I can get the tests to reproduce this.

slinkp commented 11 years ago

(In [9a402a11a3cf823d5a2a18ac6e6c673dcf5f417e]) Merged: Fix #146: breakage was introduced in changeset a11a8b00 when we added a default ORDER BY to NewsItemQuerySet. This had side effect of adding a spurious GROUP BY in the top_lookups query, which caused multiple rows in a subquery, which is an error.

slinkp commented 11 years ago

Ticket imported from Trac: http://developer.openblockproject.org/ticket/146 Reported by: slinkp