graphql-python / swapi-graphene

GraphQL Starwars API using Graphene and Django
http://swapi.graphene-python.org
174 stars 36 forks source link

Querying for allSpecies and homeworld executes 38 SQL queries #12

Open simonw opened 7 years ago

simonw commented 7 years ago

This query:

{
  allSpecies {
    edges {
      node {
        name
        homeworld {
          name
          gravity
        }
      }
    }
  }
}

Executes 38 sql queries, looking something like this:

(0.001) SELECT COUNT(*) AS "__count" FROM "starwars_species"; args=()
(0.000) SELECT "starwars_species"."id", "starwars_species"."created", 
"starwars_species"."edited", "starwars_species"."name", 
"starwars_species"."classification", "starwars_species"."designation", 
"starwars_species"."average_height", "starwars_species"."skin_colors", 
"starwars_species"."hair_colors", "starwars_species"."eye_colors", 
"starwars_species"."average_lifespan", "starwars_species"."homeworld_id", 
"starwars_species"."language" FROM "starwars_species" LIMIT 37; args=()

(0.000) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" = 9; args=(9,)

(0.000) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" = 14; args=(14,)

... repeat another 36 times

Full list here: https://gist.github.com/simonw/762a98de85a2f113d1d7ec6026f73ee4

Is it possible to detect this case and then use select_related() or prefetch_related() here to join against the relevant table and get everything done in just one or two queries?

If so, an example showing how to do that would be a fantastic illustration of a slightly more complex use-case for graphene-django (I tried myself and couldn't figure out how to do it).

simonw commented 7 years ago

I generated the list of SQL queries by adding the following configuration to swapi_graphene/settings.py

LOGGING = {
    'version': 1,
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        },
    },
}
simonw commented 7 years ago

Even better: is it possible to detect the fields being requested and update the SELECT fields in those statements, using the Django .only() QuerySet method?

My ideal SQL queries for this GraphQL query would be either this (using prefetch_related):

SELECT COUNT(*) AS "__count" FROM "starwars_species";

SELECT "starwars_species"."id", "starwars_species"."name", "starwars_species"."homeworld_id",
FROM "starwars_species" LIMIT 37;

SELECT "starwars_planet"."name", "starwars_planet"."gravity"
FROM "starwars_planet"
WHERE "starwars_planet"."id" in (
  9, 14, 23, 24, 28, 29, 31, 7, 33, 18, 8, 34,
  35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47,
  48, 49, 50, 11, 51, 54, 55, 10, 56, 57, 58, 59, 12
);

Or this (using a select_related() join):

SELECT COUNT(*) AS "__count" FROM "starwars_species";

SELECT "starwars_species"."id", "starwars_species"."name", "starwars_species"."homeworld_id",
"starwars_planet"."name", "starwars_planet"."gravity"
FROM "starwars_species"
JOIN "starwars_planet" ON 
  "starwars_species"."homeworld_id" = "starwars_planet"."id"
LIMIT 37;
simonw commented 7 years ago

On closer inspection of the docs, it looks like the solution here is to use a DataLoader - any chance of some example code for that? http://docs.graphene-python.org/en/latest/execution/dataloader/

simonw commented 7 years ago

I got this working using a dataloader: https://github.com/simonw/swapi-graphene/commit/cea04512abf01e929d50f050e3c690a2a7117565

Corresponding SQL:

(0.001) SELECT COUNT(*) AS "__count" FROM "starwars_species"; args=()
(0.001) SELECT "starwars_species"."id", "starwars_species"."created", 
"starwars_species"."edited", "starwars_species"."name", 
"starwars_species"."classification", "starwars_species"."designation", 
"starwars_species"."average_height", "starwars_species"."skin_colors", 
"starwars_species"."hair_colors", "starwars_species"."eye_colors", 
"starwars_species"."average_lifespan", "starwars_species"."homeworld_id", 
"starwars_species"."language" FROM "starwars_species" LIMIT 37; args=()
(0.002) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" IN (9, 14, 23, 24, 28, 29,
31, 7, 33, 18, 8, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50,
11, 51, 54, 55, 10, 56, 57, 58, 59, 12); args=(9, 14, 23, 24, 28, 29, 31, 7,
33, 18, 8, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50, 11,
51, 54, 55, 10, 56, 57, 58, 59, 12)
simonw commented 7 years ago

Oops, correct resolver on the Specie class looks like this:

    def resolve_homeworld(self, *args):
        if self.homeworld_id:
            return planet_loader.load(self.homeworld_id)
        else:
            return None
simonw commented 7 years ago

I think that's wrong too. The PlanetLoader batch_load_fn(keys) method needs to return objects for the keys in the same order that the keys were provided.

simonw commented 7 years ago

OK, I think this is the correct implementation of the DataLoader:

from promise import Promise
from promise.dataloader import DataLoader

class PlanetLoader(DataLoader):
    def batch_load_fn(self, keys):
        objects_by_key = {
            o.pk: o
            for o in models.Planet.objects.filter(pk__in=keys)
        }
        results = []
        for key in keys:
            results.append(objects_by_key.get(key, None))
        return Promise.resolve(results)

planet_loader = PlanetLoader()