graphql-python / graphene

GraphQL framework for Python
http://graphene-python.org/
MIT License
8.08k stars 823 forks source link

`first: 5` request pulling entire table twice, not using `limit` #108

Closed sgtsquiggs closed 8 years ago

sgtsquiggs commented 8 years ago

I've got a table with 92000 entries, and a lot of fields.

In this example, I want the pk (id in the table; it is mapped to pk in the Artwork model), and the title

{ allArtworks(first: 5) { edges { cursor, node { pk title } } } }

The entire table is pulled twice (10903.39ms or so) for unknown reasons, resulting in a total processing time of 72150.965ms

I would expect it to use limit in the sql query, but that isn't the case. It's basically doing a select (every field) from artworks

The schema:

class Query(ObjectType):
    all_artworks = DjangoFilterConnectionField(Artwork)

class Artwork(DjangoNode):
    pk = graphene.Int()

    class Meta:
        model = models.Artwork
        filter_fields = ['title'] #adding `pk` here causes an exception because it isn't a real db field i assume

    connection_type = Connection

    def resolve_pk(self, args, info):
        return self.instance.pk
syrusakbary commented 8 years ago

@sgtsquiggs for doing the pagination it have to know how many elements it have. So its doing a count on the query SELECT count(id) from artworks where ....

Can you install the Django debug plugin and see whats the response for that?

How to install Django Debug Plugin?

from graphene.contrib.django.debug import DjangoDebugPlugin

# ...
schema = graphene.Schema(query=Query, plugins=[DjangoDebugPlugin()])

And then query it like:

{
  YOUR_QUERY_FIELDS,
  __debug {
    sql {
      rawSql
    }
  }
}

Please post here what queries are happening so I can fix it better :)

sgtsquiggs commented 8 years ago

Doesn't seem to be working. Tried it on a simpler query:

{
  artwork(slug:"97397-bird-prince") { pk title }
  __debug {
    sql {
      rawSql
    }
  }
}
{
  "data": {
    "artwork": {
      "pk": 97397,
      "title": "Bird Prince"
    },
    "__debug": {
      "sql": []
    }
  }
}
syrusakbary commented 8 years ago

Which Graphene and Django version are you using? Are you hitting the database in the resolve_article function?

sgtsquiggs commented 8 years ago

Django 1.6.11

def resolve_artwork(self, args, info):
        pk_or_slug = first_not_none(args.get('slug', None), args.get('pk', None))
        return models.Artwork.get_by_pk_or_slug(pk_or_slug)

and finally

    @classmethod
    def get_by_pk_or_slug(cls, value):
        try:
            return cls.objects.get(Q(status='public'), Q(slug=value) | Q(pk__iexact=value))
        except cls.DoesNotExist:
            return None
syrusakbary commented 8 years ago

I think models.Artwork.get_by_pk_or_slug(pk_or_slug) is somehow not hitting the DB. Can you try return models.Artwork.first()?

sgtsquiggs commented 8 years ago
{
    allArtworks(first: 5) {
        edges {
            cursor,
            node { pk title }
        }
    }
__debug {
    sql {
      rawSql
    }
  }
}
{
  "data": {
    "allArtworks": {
      "edges": [
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjA=",
          "node": {
            "pk": 7,
            "title": "TEST ARTWORK A"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjE=",
          "node": {
            "pk": 9,
            "title": "yet another artwork"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjI=",
          "node": {
            "pk": 39,
            "title": "Moonwalk"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjM=",
          "node": {
            "pk": 40,
            "title": "Sky High (Things are Getting Better)"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjQ=",
          "node": {
            "pk": 41,
            "title": "Missed Mark"
          }
        }
      ]
    },
    "__debug": {
      "sql": []
    }
  }
}
sgtsquiggs commented 8 years ago

return models.Artwork.objects.first() also returned "sql": []

syrusakbary commented 8 years ago

Weird, is tested here, for Django versions 1.6, 1.7, 1.8 and 1.9 and the tests are passing.

Are you already using django-debug-toolbar in your Django project?

sgtsquiggs commented 8 years ago

Ok, if I disable django-debug-toolbar, DjangoDebugPlugin works:

{
  "data": {
    "allArtworks": {
      "edges": [
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjA=",
          "node": {
            "pk": 7,
            "title": "TEST ARTWORK A"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjE=",
          "node": {
            "pk": 9,
            "title": "yet another artwork"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjI=",
          "node": {
            "pk": 39,
            "title": "Moonwalk"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjM=",
          "node": {
            "pk": 40,
            "title": "Sky High (Things are Getting Better)"
          }
        },
        {
          "cursor": "YXJyYXljb25uZWN0aW9uOjQ=",
          "node": {
            "pk": 41,
            "title": "Missed Mark"
          }
        }
      ]
    },
    "__debug": {
      "sql": [
        {
          "rawSql": "SELECT `artwork`.`slug`, `artwork`.`id`, `artwork`.`title`, `artwork`.`description`, `artwork`.`literature`, `artwork`.`special_terms`, <<SNIP 40 MORE FIELDS>> `artwork`.`end_date`, `artwork`.`suppress_display`, `artwork`.`shipping_category_id` FROM `artwork`"
        }
      ]
    }
  }
}

This took 36.15 seconds

syrusakbary commented 8 years ago

Ok, I will fix that now, so you will be able to use django-debug-toolbar and DjangoDebugPlugin at the same time.

syrusakbary commented 8 years ago

Ummmm... not limit is applied. Can you try with DjangoConnectionField and output what happens?

class Query(ObjectType):
    all_artworks = DjangoConnectionField(Artwork)
sgtsquiggs commented 8 years ago

That seems to be it.

... FROMartworkLIMIT 5 OFFSET 92034" when using DjangoConnectionField

syrusakbary commented 8 years ago

Great! I think I know where is the issue. Will push a solution around today.

syrusakbary commented 8 years ago

Hi @sgtsquiggs, this is now fixed and tested.

Install the latest graphene version pip install graphene==0.7.2. Everything should be running as expected.

Please reopen this issue if not.

ubill commented 6 years ago

DjangoFilterConnectionField doesn't seem to be respecting the max_limit and pulling all (6000+) rows from my table.

Here's how I defined it in my code:

devices = DjangoFilterConnectionField(DeviceNode)

I saw that there is a constant in graphene_django/settings.py

# Copied shamelessly from Django REST Framework

DEFAULTS = {
'RELAY_CONNECTION_MAX_LIMIT': 100,

but it 100 doesn't seem to be used.

Any ideas? @syrusakbary @sgtsquiggs Thanks. :)