open-contracting / credere-backend

A tool that facilitates the participation of Micro, Small, and Medium businesses (MSMEs) in the Colombian public procurement market.
https://credere.readthedocs.io
BSD 3-Clause "New" or "Revised" License
5 stars 0 forks source link

Slow DB Query: submitted_count_msme_woman_unique #308

Closed sentry-io[bot] closed 1 month ago

sentry-io[bot] commented 1 month ago

Sentry Issue: CREDERE-BACKEND-PRODUCTION-26

The query:

SELECT COUNT(*) AS count_1
FROM (
  SELECT borrower.id AS borrower_id
  FROM borrower JOIN award ON award.borrower_id = borrower.id JOIN application ON
    application.borrower_id = borrower.id
  WHERE application.borrower_submitted_at IS NOT NULL AND borrower.size != %(size_1)s AND (
    award.source_data_contracts ->> %(source_data_contracts_1)s) IN (
    %(param_1_1)s, %(param_1_2)s
  )
  GROUP BY borrower.id
) AS anon_1

Looks like it matches:

    submitted_count_msme_woman_unique = (
        session.query(Borrower.id)
        .join(Award, Award.borrower_id == Borrower.id)
        .join(Application, Application.borrower_id == Borrower.id)
        .filter(Application.borrower_submitted_at.isnot(None))
        .filter(Borrower.size != BorrowerSize.BIG)
        .filter(Award.source_data_contracts["g_nero_representante_legal"].astext.in_(WOMAN_VALUES))
        .group_by(Borrower.id)
        .count()
    )

in get_borrower_opt_in_stats, which is called by:

jpmckinney commented 1 month ago

Since this slow query is exclusive to an OCP-only endpoint, I think it's okay to leave as-is.

sentry-io[bot] commented 1 month ago

Sentry Issue: CREDERE-BACKEND-PRODUCTION-25

Same here:

Query (difference is borrower_accepted_at instead of borrower_submitted_at):

SELECT COUNT(*) AS count_1
FROM (
  SELECT borrower.id AS borrower_id
  FROM borrower JOIN award ON award.borrower_id = borrower.id JOIN application ON
    application.borrower_id = borrower.id
  WHERE application.borrower_accepted_at IS NOT NULL AND borrower.is_msme = true AND (
    award.source_data_contracts ->> %(source_data_contracts_1)s) IN (
    %(param_1_1)s, %(param_1_2)s
  )
  GROUP BY borrower.id
) AS anon_1

Which matches:

    accepted_count_msme_woman_unique = (
        session.query(Borrower.id)
        .join(Award, Award.borrower_id == Borrower.id)
        .join(Application, Application.borrower_id == Borrower.id)
        .filter(Application.borrower_accepted_at.isnot(None))
        .filter(Borrower.is_msme == true())
        .filter(Award.source_data_contracts["g_nero_representante_legal"].astext.in_(WOMAN_VALUES))
        .group_by(Borrower.id)
        .count()
    )
jpmckinney commented 4 weeks ago

The JOINs are now through application.award_id instead of award.borrower_id #336, so this might be faster now.