ministryofjustice / find-moj-data

Find MOJ data service • This repository is defined and managed in Terraform
https://find-moj-data.service.justice.gov.uk/
MIT License
5 stars 0 forks source link

Performance - optimise query for fetching domains #856

Open MatMoore opened 2 months ago

MatMoore commented 2 months ago

The following query is executed as part of rendering the home page or search pages:

query getDomains {
listDomains(
    input: {start: 0, count: 1000}
) {
  domains{
    urn
    properties{
      name
    }
    entities(
      input:{query:"*",start:0,count: 0, filters: [{field: "tags", values: ["urn:li:tag:dc_display_in_catalogue"]}]}
    ){
      total
    }
  }
}
}

This takes ~500ms on its own, which slows down page loads.

Probably due to the entities subquery triggering N+1 queries on the backend.

In cases where we aren't using the total, we could take out that subquery example to speed it up 5x.

In cases where we are using the total, the following approach (150ms) also works:

# First query fetches the full list of domains, including empty ones
{
  listDomains(input: { start: 0, count: 1000 }) {
    domains {
      urn
      properties {
        name
      }
    }
  }

  # Second query fetches the counts, for domains that have stuff in them
  aggregateAcrossEntities(
    input: {
      query: "*"
      facets: ["domains"]
      orFilters: [{ and: [{ field: "tags", values: ["urn:li:tag:dc_display_in_catalogue"] }] }]
    }
  ) {
    facets {
      field
      displayName
      aggregations {
        value
        count
      }
    }
  }
}
mitchdawson1982 commented 2 months ago

Can you add the performance improvement of the combined listDomains& aggregateAcrossEntities approach for totals?

MatMoore commented 2 months ago

Can you add the performance improvement of the combined listDomains& aggregateAcrossEntities approach for totals?

The combined one took 150ms when I tested it. Here is the previous ticket, for reference https://github.com/ministryofjustice/find-moj-data/issues/644